Backup strategy for customer data?
-
We have VM hosts and VMs for test and development in our colo datacenter.
Eventually we will run real production and have customer data on our servers. We need to figure out how we should do backups and how much storage / hardware we are going to need.
But to do that, we first have to figure out what customers would likely expect from us when it comes to backup. Most of the customers' data will be held in databases, but some data will be files. Some of the data is generated automatically and some is manually entered into web applications.
If you were an enterprise customer or large SMB, what would you expect if you paid for a service (SaaS) that is hosted somewhere else?
-
@Pete-S said in Backup strategy for customer data?:
If you were an enterprise customer or large SMB, what would you expect if you paid for a service (SaaS) that is hosted somewhere else?
Good timing, since there was that PerCSoft SaaS compromise this week where the backup mechanism was compromised.
At a minimum, I'd expect that backups are air gapped and no infection of the SaaS platform or customer system could flow back and also compromise the backups that have already been taken.
-
Backups are always tough to discuss because they tend to be so dramatically dependent on the data in question and how customers are able to protect their own data. In some cases, you'd naturally expect the ability to replay any transaction for years; in others, you'd only expect to be able to recover if the system failed; and there is nearly everything in between.
And then you have to consider archival data.
-
@scottalanmiller said in Backup strategy for customer data?:
Backups are always tough to discuss because they tend to be so dramatically dependent on the data in question and how customers are able to protect their own data. In some cases, you'd naturally expect the ability to replay any transaction for years; in others, you'd only expect to be able to recover if the system failed; and there is nearly everything in between.
And then you have to consider archival data.
I was thinking that we also need to provide a way for the customers to export and back up their own data as well. I would expect them to want that.
-
@Pete-S said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
Backups are always tough to discuss because they tend to be so dramatically dependent on the data in question and how customers are able to protect their own data. In some cases, you'd naturally expect the ability to replay any transaction for years; in others, you'd only expect to be able to recover if the system failed; and there is nearly everything in between.
And then you have to consider archival data.
I was thinking that we also need to provide a way for the customers to export and back up their own data as well. I would expect them to want that.
That's often the case, and definitely a great feature.
-
@scottalanmiller said in Backup strategy for customer data?:
At a minimum, I'd expect that backups are air gapped and no infection of the SaaS platform or customer system could flow back and also compromise the backups that have already been taken.
How is that normally accomplished?
-
@Pete-S said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
At a minimum, I'd expect that backups are air gapped and no infection of the SaaS platform or customer system could flow back and also compromise the backups that have already been taken.
How is that normally accomplished?
Tape
-
@scottalanmiller said in Backup strategy for customer data?:
@Pete-S said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
At a minimum, I'd expect that backups are air gapped and no infection of the SaaS platform or customer system could flow back and also compromise the backups that have already been taken.
How is that normally accomplished?
Tape
Tape removed and taken off-site or just a backup to a tape in a tape loader?
-
@Pete-S said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
@Pete-S said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
At a minimum, I'd expect that backups are air gapped and no infection of the SaaS platform or customer system could flow back and also compromise the backups that have already been taken.
How is that normally accomplished?
Tape
Tape removed and taken off-site or just a backup to a tape in a tape loader?
Tape never stays in the loader. Tape implies being removed. Maybe not taken off site, but definitely not left in the loader.
-
@Alex-Jones said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
Tape
Why Tape?
Low cost, highly reliable, easily physically air gapped.
-
@Alex-Jones said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
Tape
Why Tape?
Long discussion on it this week...
-
@scottalanmiller said in Backup strategy for customer data?:
@Alex-Jones said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
Tape
Why Tape?
Low cost, highly reliable, easily physically air gapped.
https://blog.storagecraft.com/tape-backup-vs-hard-disk-backup-what-does-the-future-hold/
-
@Alex-Jones said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
@Alex-Jones said in Backup strategy for customer data?:
@scottalanmiller said in Backup strategy for customer data?:
Tape
Why Tape?
Low cost, highly reliable, easily physically air gapped.
https://blog.storagecraft.com/tape-backup-vs-hard-disk-backup-what-does-the-future-hold/
The data there is quite old and hasn't played out as people might have expected. Tape capacities have grown much faster than those for hard drives, and it skips really, really important factors that apply to backups like portability, shelf stability, etc. That article talks about LTO5, but we are on LTO8 now. Speeds have improved and capacities have exploded. Hard disks have barely moved forward, though. So the tape vs hard drive comparison has moved heavily towards tape during that time.
But way, way more important is the rise of ransomware. The idea with hard drives as a backup medium was that you no longer needed air gapped backups. There was a time period when it was common to think that always-online hard disk backups were the future. But that future never materialized because of the risk from ransomware, which has now become possibly the most important reason for needing backups.
Towards the end of the article you'll notice that he prices "several tapes" compared to a single hard drive. The article never considered the need to air gap backups at all, so it was comparing apples to oranges. In the modern world, just using a single online hard drive would never be considered a valid backup mechanism. It doesn't meet the same need at all.
@Steven and I both wrote for that publication in that era, BTW
-
There are virtual tape mechanisms that allow you to write to object storage (on disk) and get it to act like tape. But disk storage remains so expensive that at scale it often doesn't make sense. Sometimes it does, but tape is really hard to beat. If you have itty bitty storage needs, then tape rarely makes sense. But once you get close to the size of a tape, it's effectively unbeatable. The transfer speeds are so fast and the capacity so large, and tapes can easily be taken offline and stored for incredibly long periods of time without big worries about bouncing around in transit or the temperature changes that would hurt hard drives.
Tape also doesn't use power when idle, whereas hard disks need constant power (and cooling) to stay healthy. Not that they use a lot, but it adds up when you have dozens or hundreds of hard drives spinning compared to tapes sitting on shelves.
-
When comparing tape, it's important to not look at raw capacity. LTO tape has hardware compression that is real time, on the fly and incredibly powerful. The compression ratios on tape are crazy. It's part of the sequential write mechanism. Hard drives don't offer this mechanism, nor could they because of the random access model. Tapes don't actually write raw, so an LTO8 is actually going to get 30TB on average. Sometimes less, sometimes more. But that's a real number to work with.
Hard drives are still struggling with getting 12TB and 14TB drives out; that's less than half of the capacity. At $270 for a 12TB Seagate (that most people don't consider safe... although a lot of that is just opinion), two of them for 24TB of capacity cost $540, compared to $60 total for 30TB with tape. It takes very few tapes / hard drives to pay for the tape loader.
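As a rough sketch of that arithmetic, the per-TB media cost and the point where the media savings cover the tape drive can be worked out like this. The $270 / 12TB and $60 / 30TB figures are the ones quoted in this thread; the $3,000 tape drive price is purely a placeholder assumption, so swap in a real quote before drawing conclusions.

```python
import math


def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Media cost in dollars per terabyte."""
    return price_usd / capacity_tb


def break_even_tb(drive_price_usd: float, disk_per_tb: float, tape_per_tb: float) -> float:
    """Terabytes of media at which tape's per-TB savings cover the tape drive."""
    saving = disk_per_tb - tape_per_tb
    if saving <= 0:
        return math.inf  # tape never pays for itself at these prices
    return drive_price_usd / saving


# Figures quoted in the thread.
hdd_per_tb = cost_per_tb(270.0, 12.0)   # 12TB hard drive
tape_per_tb = cost_per_tb(60.0, 30.0)   # LTO8 tape at the 30TB compressed figure

# ASSUMPTION: hypothetical tape drive price, not taken from the thread.
TAPE_DRIVE_PRICE_USD = 3000.0

crossover_tb = break_even_tb(TAPE_DRIVE_PRICE_USD, hdd_per_tb, tape_per_tb)

print(f"Hard drive: ${hdd_per_tb:.2f}/TB")
print(f"Tape:       ${tape_per_tb:.2f}/TB")
print(f"Tape drive pays for itself after about {crossover_tb:.0f} TB of media")
```

If the data is already compressed before it reaches the tape, use the tape's native capacity instead of 30TB when calling `cost_per_tb`.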
And as we showed in the other thread, a 12TB tape will be filled during a normal backup window, but a 12TB hard drive doesn't have enough time in a day to get written to. So it isn't even a potential option for normal daily backups, even if it writes at full speed all day, because it couldn't finish writing before the next day's backup would start (and that's for a single drive, let alone a set of them).
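A minimal sketch for checking a backup window yourself: given a capacity and a sustained write speed, how long does a full write take, and what speed is needed to finish inside a window? The throughput values in the loop are illustrative assumptions, not measurements of any particular drive, so plug in the sustained rate you actually observe.

```python
def hours_to_write(capacity_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to write capacity_tb at a sustained throughput (decimal units)."""
    total_bytes = capacity_tb * 1e12
    seconds = total_bytes / (throughput_mb_s * 1e6)
    return seconds / 3600


def min_throughput_mb_s(capacity_tb: float, window_hours: float) -> float:
    """Sustained MB/s required to write capacity_tb within window_hours."""
    total_bytes = capacity_tb * 1e12
    return total_bytes / (window_hours * 3600) / 1e6


# ASSUMPTION: example sustained write speeds in MB/s, for illustration only.
for speed in (100, 200, 300):
    print(f"12 TB at {speed} MB/s: {hours_to_write(12, speed):.1f} hours")

print(f"Needed for 12 TB in 24 h: {min_throughput_mb_s(12, 24):.1f} MB/s")
print(f"Needed for 12 TB in 8 h:  {min_throughput_mb_s(12, 8):.1f} MB/s")
```

Anything that cannot sustain the required rate for the whole write, including the source system and the network feeding it, will miss the window.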
-
Steven's article assumed that SSD would overcome spinners for backups. And that's definitely going to be true at some point in the future. But for now, per TB cost is still way higher for SSDs than spinners. Spinners won't even be on the market once SSDs are cheaper, and we don't expect that for a long time yet. So as long as tapes are cheaper than spinners, and spinners are cheaper than SSDs, SSDs rarely make good backup options.
Once they do, we'll see the market flooded with "removable" SSD options to replace tape. Right now hard drives and SSDs both suffer from lacking good "removable" options that make the constant plugging and unplugging not take a terrible toll on the connectors.
To actually compare hard drives, you have to look at something like RDX. Hard drives for removable backup require costly drive units, similar to what tapes need. So that cost isn't unique to tape. But the hard drive media is way more expensive for RDX than for non-removable hard drives. A 1TB RDX drive is $200 and a 4TB is $623. So removable hard drives aren't just more expensive per TB by a huge factor, but are much slower, too.
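To put a number on "a huge factor", here is a small sketch that works out the per-TB price of the RDX cartridges quoted above and compares it with the $270 / 12TB fixed hard drive mentioned earlier in the thread. All prices are the thread's figures from the time of writing, not current quotes.

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Media cost in dollars per terabyte."""
    return price_usd / capacity_tb


# (price in USD, capacity in TB), as quoted in the thread.
RDX_MEDIA = {
    "RDX 1TB": (200.0, 1.0),
    "RDX 4TB": (623.0, 4.0),
}

# Baseline: the non-removable 12TB hard drive price quoted earlier.
fixed_hdd_per_tb = cost_per_tb(270.0, 12.0)

rdx_per_tb = {name: cost_per_tb(price, tb) for name, (price, tb) in RDX_MEDIA.items()}

for name, per_tb in rdx_per_tb.items():
    factor = per_tb / fixed_hdd_per_tb
    print(f"{name}: ${per_tb:.2f}/TB, {factor:.1f}x the fixed hard drive")
```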
-
What are good options for "small" tape setups, kinda like a 5, 6 or xx bay Synology for tape?
-
@FATeknollogee said in Backup strategy for customer data?:
What are good options for "small" tape setups, kinda like a 5, 6 or xx bay Synology for tape?
Single drive, swap tapes. Anything small should fit on a single tape for a full backup.
There are tape libraries available where you can swap out more than a single tape, but those aren't needed till you are talking very large data sets.
-
@scottalanmiller said in Backup strategy for customer data?:
When comparing tape, it's important to not look at raw capacity. LTO tape has hardware compression that is real time, on the fly and incredibly powerful. The compression ratios on tape are crazy. It's part of the sequential write mechanism. Hard drives don't offer this mechanism, nor could they because of the random access model. Tapes don't actually write raw, so an LTO8 is actually going to get 30TB on average. Sometimes less, sometimes more. But that's a real number to work with.
Hard drives are still struggling with getting 12TB and 14TB drives out; that's less than half of the capacity. At $270 for a 12TB Seagate (that most people don't consider safe... although a lot of that is just opinion), two of them for 24TB of capacity cost $540, compared to $60 total for 30TB with tape. It takes very few tapes / hard drives to pay for the tape loader.
And as we showed in the other thread, a 12TB tape will be filled during a normal backup window, but a 12TB hard drive doesn't have enough time in a day to get written to. So it isn't even a potential option for normal daily backups, even if it writes at full speed all day, because it couldn't finish writing before the next day's backup would start (and that's for a single drive, let alone a set of them).
In many cases, the backup software compresses your backups first.
So it's important to look at how much raw data will get backed up, not just how much backed up data will make it onto tape. Data that is already compressed won't compress much further, so if the backup software has compressed it first, you'll find that you only fit the raw, uncompressed capacity of the tape.
It's the difference between archiving a backup to tape and archiving raw data to tape.
In the end, it's about the same amount of raw data anyway.
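A quick way to see this effect is to compress some data twice and compare the ratios. This is only an illustration: `zlib` stands in for both the backup software's compression and the tape drive's hardware compression (LTO drives use their own algorithm, not zlib), and the sample rows are made-up data generated for the demo.

```python
import random
import zlib

# ASSUMPTION: synthetic CSV-like rows as a stand-in for real customer data.
rng = random.Random(42)
statuses = ["new", "paid", "shipped", "refunded"]
rows = [
    f"{rng.randint(1, 99999)},{rng.choice(statuses)},{rng.random():.4f}\n"
    for _ in range(20000)
]
raw = "".join(rows).encode("ascii")

# First pass: what the backup software would do before writing to tape.
first_pass = zlib.compress(raw, 6)
# Second pass: what the tape drive would attempt on already-compressed data.
second_pass = zlib.compress(first_pass, 6)

first_ratio = len(raw) / len(first_pass)
second_ratio = len(first_pass) / len(second_pass)

print(f"raw -> compressed once:  {first_ratio:.2f}:1")
print(f"compressed -> twice:     {second_ratio:.2f}:1")
```

The second ratio comes out at roughly 1:1, which is why capacity planning for pre-compressed backups should use the tape's native capacity rather than the compressed figure.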