Backup strategy for customer data?
-
When comparing tape, it's important not to look at raw capacity. LTO tape has hardware compression that is real time, on the fly and incredibly powerful. The compression ratios on tape are crazy. It's part of the sequential write mechanism. Hard drives don't offer this mechanism, nor could they because of the random access model. Tapes don't actually write raw, so an LTO-8 tape (12TB native) is actually going to hold about 30TB on average. Sometimes less, sometimes more. But that's a real number to work with.
Hard drives are still struggling to get 12TB and 14TB drives out, which is less than half of that capacity. At $270 for a 12TB Seagate (which most people don't consider safe... although a lot of that is just opinion), that's $410 more for 24TB of capacity compared to $60 total for 30TB with tape. It takes very few tapes / hard drives to pay for the tape loader.
And as we showed in the other thread, a 12TB tape will be filled during a normal backup window, but a 12TB hard drive doesn't have enough time in a day to get written to. So it isn't even a potential option for normal daily backups, even if it writes at full speed all day, because it couldn't finish writing before the next day's backup would start (and that's for a single drive, let alone a set of them).
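The other thread isn't reproduced here, so as a rough sanity check on the backup-window point, here is a minimal sketch of the arithmetic. The 360 MB/s figure is the published native rate for a full-height LTO-8 drive; the hard drive rate is only an assumed sustained average, and real numbers will vary with the drive and the workload.

```python
# Rough time needed to fill a 12 TB medium at a sustained write rate.
# Decimal units throughout: 1 TB = 1,000,000 MB.

def hours_to_write(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours to write capacity_tb terabytes at rate_mb_s megabytes per second."""
    seconds = capacity_tb * 1_000_000 / rate_mb_s
    return seconds / 3600

LTO8_NATIVE_MB_S = 360   # full-height LTO-8, uncompressed data
HDD_ASSUMED_MB_S = 150   # ASSUMPTION: sustained average for a large spinner

tape_hours = hours_to_write(12, LTO8_NATIVE_MB_S)
disk_hours = hours_to_write(12, HDD_ASSUMED_MB_S)

print(f"tape: {tape_hours:.1f} h, disk: {disk_hours:.1f} h")
# prints "tape: 9.3 h, disk: 22.2 h"
```

With those assumptions the tape fits in an overnight window while the disk needs most of a day. Plug in your own measured rates before relying on either number.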
-
Steven's article assumed that SSD would overcome spinners for backups. And that's definitely going to be true at some point in the future. But for now, per TB cost is still way higher for SSDs than spinners. Spinners won't even be on the market once SSDs are cheaper, and we don't expect that for a long time yet. So as long as tapes are cheaper than spinners, and spinners are cheaper than SSDs, SSDs rarely make good backup options.
Once they do, we'll see the market flooded with "removable" SSD options to replace tape. Right now hard drives and SSD both suffer from lacking good "removable" options that make the constant plugging and unplugging not take a terrible toll on the connectors.
To actually compare hard drives, you have to look at something like RDX. Removable hard drive backup requires a costly drive unit, similar to what tape needs, so that cost isn't unique to tape. But RDX media is way more expensive than non-removable hard drives: a 1TB RDX cartridge is $200 and a 4TB is $623. So removable hard drives aren't just more expensive per TB by a huge factor, they're much slower, too.
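To put the prices quoted so far side by side, here is a small sketch that works out cost per TB. The prices are the posters' figures from this thread, not current market data, and the labels are just descriptive names.

```python
# Cost per TB for the media prices quoted in this thread.
media = {
    "LTO-8 tape (30TB compressed)": (60, 30),
    "Seagate 12TB hard drive": (270, 12),
    "RDX 1TB cartridge": (200, 1),
    "RDX 4TB cartridge": (623, 4),
}

def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Price divided by capacity, in USD per TB."""
    return price_usd / capacity_tb

per_tb = {name: cost_per_tb(*spec) for name, spec in media.items()}

# Cheapest first.
for name, usd in sorted(per_tb.items(), key=lambda kv: kv[1]):
    print(f"{name:30s} ${usd:7.2f}/TB")

# Two 12TB drives versus one tape, using the same quoted prices.
two_drive_premium = 2 * 270 - 60
print(f"two drives cost ${two_drive_premium} more than one tape")
```

With exactly those figures, two drives come out $480 more than one tape rather than the $410 mentioned above, so one of the quoted prices was probably rounded or taken from a different listing.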
-
What are good options for "small" tape setups, kinda like a 5, 6 or xx bay Synology for tape?
-
@FATeknollogee said in Backup strategy for customer data?:
What are good options for "small" tape setups, kinda like a 5, 6 or xx bay Synology for tape?
Single drive, swap tapes. Anything small should fit on a single tape for a full backup.
There are tape libraries available where you can swap out more than a single tape, but those aren't needed till you are talking very large data sets.
-
@scottalanmiller said in Backup strategy for customer data?:
When comparing tape, it's important not to look at raw capacity. LTO tape has hardware compression that is real time, on the fly and incredibly powerful. The compression ratios on tape are crazy. It's part of the sequential write mechanism. Hard drives don't offer this mechanism, nor could they because of the random access model. Tapes don't actually write raw, so an LTO-8 tape (12TB native) is actually going to hold about 30TB on average. Sometimes less, sometimes more. But that's a real number to work with.
Hard drives are still struggling to get 12TB and 14TB drives out, which is less than half of that capacity. At $270 for a 12TB Seagate (which most people don't consider safe... although a lot of that is just opinion), that's $410 more for 24TB of capacity compared to $60 total for 30TB with tape. It takes very few tapes / hard drives to pay for the tape loader.
And as we showed in the other thread, a 12TB tape will be filled during a normal backup window, but a 12TB hard drive doesn't have enough time in a day to get written to. So it isn't even a potential option for normal daily backups, even if it writes at full speed all day, because it couldn't finish writing before the next day's backup would start (and that's for a single drive, let alone a set of them).
In many cases, the backup software compresses your backups first.
So it's important to look at how much raw data will get backed up, not just how much already-compressed backup data will make it onto tape. If you look only at the latter, you'll find that you fit about the native, uncompressed capacity of the tape.
That's the difference between archiving a backup to tape and archiving raw data to tape.
In the end, it's about the same amount of raw data anyway.
-
@FATeknollogee said in Backup strategy for customer data?:
What are good options for "small" tape setups, kinda like a 5, 6 or xx bay Synology for tape?
Pretty much you are stuck with LTO sizes. The primary purpose of tape is to physically disconnect it after use. So a single tape unit that someone pulls the tape from daily is the "small" setup. You can get external tape drives, or dedicated tape units. As you grow, you can move to robots and parallel write systems. Big enterprises basically all live and die by tapes and have giant robotic units that stream dozens of tapes at the same time and move them around in big libraries mechanically.
-
@Obsolesce said in Backup strategy for customer data?:
In many cases, the backup software compresses your backups first.
So it's important to look at how much raw data will get backed up, not just how much already-compressed backup data will make it onto tape. If you look only at the latter, you'll find that you fit about the native, uncompressed capacity of the tape.
That's the difference between archiving a backup to tape and archiving raw data to tape.
In the end, it's about the same amount of raw data anyway.
That's true, but the rated tape compression ratios take that into account to some degree. LTO's streaming hardware compression is a bit different from compression used in other places. Getting both types isn't bad. Pre-compressed data will lower the ratio the drive achieves, but if your data has no compression whatsoever, you'll get more than the average as well.
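The effect of compressing twice is easy to see in a small experiment. This is only a sketch: zlib's DEFLATE stands in for both the backup software and the drive, while LTO drives use their own streaming algorithm, and the sample data is made up. The general behavior is what matters: a second pass over already-compressed data gains very little.

```python
import random
import zlib

# Made-up, text-like sample data: random words from a small vocabulary.
rng = random.Random(42)
words = [b"invoice", b"customer", b"backup", b"order", b"2019", b"status",
         b"pending", b"shipped", b"total", b"address", b"email", b"note"]
raw = b" ".join(rng.choices(words, k=200_000))

once = zlib.compress(raw)    # stand-in for backup software compressing first
twice = zlib.compress(once)  # stand-in for the drive compressing that output

first_ratio = len(raw) / len(once)
second_ratio = len(once) / len(twice)

print(f"first pass:  {first_ratio:.2f}:1")
print(f"second pass: {second_ratio:.2f}:1")
```

The first pass shrinks the data several times over; the second pass lands close to 1:1, which is why pre-compressed backups fill a tape at roughly its native capacity.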
-
@scottalanmiller said in Backup strategy for customer data?:
Pretty much you are stuck with LTO sizes. The primary purpose of tape is to physically disconnect it after use. So a single tape unit that someone pulls the tape from daily is the "small" setup. You can get external tape drives, or dedicated tape units. As you grow, you can move to robots and parallel write systems. Big enterprises basically all live and die by tapes and have giant robotic units that stream dozens of tapes at the same time and move them around in big libraries mechanically.
It seems like the next step up from a single tape drive is something like the 1U Dell PowerVault TL1000, which has a tray with 9 tapes. So you can back up to, and then swap out, up to 9 tapes at the same time. That's roughly 100 to 350 TB per backup with LTO-8 tapes. Around $7500 without tapes.
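For reference, a quick sketch of how a 9-slot range like that falls out of the LTO-8 numbers. The 12TB native capacity and the 2.5:1 rated ratio are the standard LTO-8 figures; the function name is just for illustration.

```python
LTO8_NATIVE_TB = 12
LTO8_RATED_RATIO = 2.5  # the ratio behind the 30TB "compressed" figure

def library_capacity_tb(slots: int, native_tb: float = LTO8_NATIVE_TB,
                        ratio: float = 1.0) -> float:
    """Total capacity of a fully loaded library at a given compression ratio."""
    return slots * native_tb * ratio

native = library_capacity_tb(9)                         # no compression gain
rated = library_capacity_tb(9, ratio=LTO8_RATED_RATIO)  # at the rated 2.5:1

print(f"native: {native} TB, rated: {rated} TB")
```

That gives 108 TB native and 270 TB at the rated ratio. Reaching 350 TB would take a bit over 3.2:1, which only very compressible data will manage.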
-
We've worked with a variety of hosting solution providers. Most start with a base of one backup done per 24 hours with a fee to restore if required.
Some have a built-in backup feature that we can then set up for the VMs we have our cloud desktop clients running in. It can be set up to run relatively often. They charge a fee for that one.
Start with once per day.
As far as the "how" goes, what is the underlying virtualization platform?
Our hosting solutions are set up to use Veeam at the host level.
StarWind's Virtual Tape Library (VTL) can be used to augment the backup in another DC with Veeam's Cloud Connect being another option to tie in to get the backup data out of the production DC.
As far as expectations go, we're in the process of setting up a BaaS and DRaaS service based on Veeam. Backups and DR will be multi-site, with one goal being a two- to four-week no-delete option.
In our investigations of BaaS/DRaaS providers, none were able, or wanted, to answer the question, "How do you back up our backup data to protect against failures in your system?"
-
@PhlipElder said in Backup strategy for customer data?:
We've worked with a variety of hosting solution providers. Most start with a base of one backup done per 24 hours with a fee to restore if required.
Some have a built-in backup feature that we can then set up for the VMs we have our cloud desktop clients running in. It can be set up to run relatively often. They charge a fee for that one.
Start with once per day.
As far as the "how" goes, what is the underlying virtualization platform?
Our hosting solutions are set up to use Veeam at the host level.
StarWind's Virtual Tape Library (VTL) can be used to augment the backup in another DC with Veeam's Cloud Connect being another option to tie in to get the backup data out of the production DC.
As far as expectations go, we're in the process of setting up a BaaS and DRaaS service based on Veeam. Backups and DR will be multi-site, with one goal being a two- to four-week no-delete option.
In our investigations of BaaS/DRaaS providers, none were able, or wanted, to answer the question, "How do you back up our backup data to protect against failures in your system?"
As we are getting into SaaS and not infrastructure, I think our primary concern is being able to restore the customers' data in case something bad happens that's our fault or responsibility - for instance software bugs, hackers, ransomware, multiple hardware failures, etc.
We are not as concerned with being able to restore the customers' data in case they screw up as we are if we screw up. That said, if we can do it without too much investment, we might be able to add something here. Have to think about that one. In either case we will provide some way for the customer to export and back up their data.
For now we run on Xen (XCP-ng). The goal is to be able to restore the infrastructure with automation, so I don't expect us to really need a lot of host-based backups. We have a lot more testing to do on this.
From what I can gather right now, I think we will back up to disk storage on-prem. Then from there we will go to tape. Tape will be moved off-site once a week. We will do incremental backups to the cloud or another site so we can restore completely using off-site tape and the incremental backups.
This will allow us to restore from on-prem disk in most cases. If we are hacked or infected we can restore from on-site tape. In case of a fire or something we can restore from off-site tape and incremental backups.
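The restore paths in that plan can be written down as a simple lookup, which is handy as a starting point for a runbook. This is only a sketch of the plan as described; the incident categories and all names are made up for illustration.

```python
from enum import Enum

class Incident(Enum):
    """Hypothetical incident categories, grouped by what survives."""
    ROUTINE = "bug, accidental deletion or single hardware failure"
    COMPROMISE = "hackers or ransomware reached the on-prem disk backups"
    SITE_LOSS = "fire or other loss of the whole site"

# Restore sources per incident, in the order they would be applied.
RESTORE_PLAN = {
    Incident.ROUTINE: ["on-prem disk"],
    Incident.COMPROMISE: ["on-site tape"],
    Incident.SITE_LOSS: ["off-site tape (weekly)", "off-site incrementals"],
}

def restore_sources(incident: Incident) -> list[str]:
    """Return the backup sources to restore from for a given incident."""
    return RESTORE_PLAN[incident]

for incident in Incident:
    print(f"{incident.name}: {' + '.join(restore_sources(incident))}")
```

Writing it out this way makes the dependency visible: the site-loss case only works if both the weekly off-site tape and the off-site incrementals are intact.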
-
@Pete-S said in Backup strategy for customer data?:
As we are getting into SaaS and not infrastructure, I think our primary concern is being able to restore the customers' data in case something bad happens that's our fault or responsibility - for instance software bugs, hackers, ransomware, multiple hardware failures, etc.
We are not as concerned with being able to restore the customers' data in case they screw up as we are if we screw up. That said, if we can do it without too much investment, we might be able to add something here. Have to think about that one. In either case we will provide some way for the customer to export and back up their data.
For now we run on Xen (XCP-ng). The goal is to be able to restore the infrastructure with automation, so I don't expect us to really need a lot of host-based backups. We have a lot more testing to do on this.
From what I can gather right now, I think we will back up to disk storage on-prem. Then from there we will go to tape. Tape will be moved off-site once a week. We will do incremental backups to the cloud or another site so we can restore completely using off-site tape and the incremental backups.
This will allow us to restore from on-prem disk in most cases. If we are hacked or infected we can restore from on-site tape. In case of a fire or something we can restore from off-site tape and incremental backups.
There are some keys to providing a customer facing solution:
- Customer facing network(s) are not in any way connected to the hosting company's day to day network (DtDN)
- Privileged Access Workstation structures are in place to keep things separate
- Backups are air-gapped in some way to protect against catastrophic failure or encryption event
- Customer resources are on separate equipment from DtDN
Ultimately, the entire solution set for DtDN, Support, and Customer Facing networks should be segmented completely from each other with significant protections in place to keep them that way.
Examples of providers and companies that have been hit by breaches or ransomware:
- iNSYNQ
- Wolters Kluwer/CCH
- Maersk
- PCM
- WiPro
- Hosting company (UK 123 something?) lost everything due to backups being wiped
- Secure mail hosting company lost everything when perp took everything out right through the backups
- ETC
-
@PhlipElder said in Backup strategy for customer data?:
hosting company's day to day network
By "day to day network", do you mean the hosting company's own internal IT, for managing their own company?
Or do you mean the hosting company's management network for managing the hosting infrastructure?
-
@Pete-S said in Backup strategy for customer data?:
@PhlipElder said in Backup strategy for customer data?:
hosting company's day to day network
By "day to day network", do you mean the hosting company's own internal IT, for managing their own company?
Or do you mean the hosting company's management network for managing the hosting infrastructure?
DtDN = Sales, HR, Finance, etc., where folks blindly click on things and get hit by drive-by web sites.
Management would be with PAW (Privileged Access Workstation) and segmented away from the DtDN with absolutely no crossover between them.
-
@scottalanmiller said in Backup strategy for customer data?:
When comparing tape, it's important not to look at raw capacity. LTO tape has hardware compression that is real time, on the fly and incredibly powerful. The compression ratios on tape are crazy. It's part of the sequential write mechanism. Hard drives don't offer this mechanism, nor could they because of the random access model. Tapes don't actually write raw, so an LTO-8 tape (12TB native) is actually going to hold about 30TB on average. Sometimes less, sometimes more. But that's a real number to work with.
Question: Am I correct in assuming that this compression doesn't offer any benefit where the backup content is video media? If it DOES allow compression of video files, how good is the compression ratio?
-
@NashBrydges said in Backup strategy for customer data?:
Question: Am I correct in assuming that this compression doesn't offer any benefit where the backup content is video media? If it DOES allow compression of video files, how good is the compression ratio?
That depends. Generally it does, but relatively little. You likely still want it on (especially on tape), because the compression mechanism normally speeds up writes to and reads from the media, since the data is compressed in real time. Heavily compressed video media is going to get very little additional compression, but generally some.
-
I did some comparisons of the cost involved for disk versus tape, disregarding the differences between the media types.
Tape is much cheaper per TB (about $11/TB) but you need to offset the cost of the tape drive/autoloader.
Disk on the other hand will require a more expensive server with more drive bays and also requires additional disks for parity data.
In our case I found that it will break even at 150 TB of native storage. If you have more data in backup storage than that, then tape is cheaper.
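The break-even point follows from setting the two total costs equal: fixed cost plus per-TB cost times capacity on each side. Here is a minimal sketch of that calculation. Only the $11/TB tape figure comes from the post; every other number is a placeholder picked so the example lands on 150 TB, not the figures actually used.

```python
def break_even_tb(tape_fixed: float, tape_per_tb: float,
                  disk_fixed: float, disk_per_tb: float) -> float:
    """Capacity (TB) at which total tape cost equals total disk cost."""
    if disk_per_tb <= tape_per_tb:
        raise ValueError("tape must be cheaper per TB for a break-even to exist")
    return (tape_fixed - disk_fixed) / (disk_per_tb - tape_per_tb)

TAPE_PER_TB = 11    # from the post
TAPE_FIXED = 7500   # ASSUMPTION: autoloader price in the range mentioned in this thread
DISK_FIXED = 3000   # ASSUMPTION: extra cost of a server with more drive bays
DISK_PER_TB = 41    # ASSUMPTION: drive cost per usable TB including parity overhead

capacity = break_even_tb(TAPE_FIXED, TAPE_PER_TB, DISK_FIXED, DISK_PER_TB)
print(f"break-even at {capacity:.0f} TB")
```

Below the break-even capacity the autoloader's fixed cost dominates and disk is cheaper; above it, the lower per-TB cost of tape wins. Swap in your own quotes to see where the line falls for your setup.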
-
In our case I'm thinking about two options.
OPTION 1
We'll put together a backup server with a large-ish disk array (maybe 100TB or so) connected with SAS to a tape autoloader. Backups go from backup clients to the disk array and when done it's all streamed to tape. The tapes are exchanged and put off-line. Each week a full backup of the disks is taken off-site as well.
To keep the networks separated as far as possible we can put the backup server on its own hardware and its own network and firewall it off from the production servers. So if production servers or VM hosts are breached the backup server is still intact. If somehow it's also compromised we have to restore everything from tape.
OPTION 2
We put a smaller backup array, say 10TB or so, on each physical VM host. Backups are run on each host from the production VMs to the backup VM with the backup array. Remember our VMs are running on local storage so this will not require any network traffic.
When done, we stream the data from each backup VM to a "tape backup" server that basically just contains the tape drive (with autoloader) and will write the data to tape. Firewall and tape handling will be the same as option 1. Since the disks with the backups are on each host, several backup servers have to be breached to lose all disk backups.
What do you think?
-
@Pete-S said in Backup strategy for customer data?:
What do you think?
I think you have done an awesome amount of research.
Why offsite disks if tape is already offsite? This seems like extra work that is not worth the cost of doing. Either way, when you need either these disks or the tapes, you are doing a full restore. I can't imagine that it would be a big enough difference in restore times to matter in that scenario.
-
@Pete-S said in Backup strategy for customer data?:
What do you think?
The difference between options 1 and 2 seems to come down to two things to me.
- How much can be easily compromised at once
- Where the complexity of configuration is
With option 1, the entire setup seems easier to compromise, but the configuration of the entire process is also easier to manage.
With option 2, the entire setup is harder to compromise, but more complex to manage.
-
@Pete-S said in Backup strategy for customer data?:
I did some comparisons of the cost involved for disk versus tape, disregarding the differences between the media types.
Tape is much cheaper per TB (about $11/TB) but you need to offset the cost of the tape drive/autoloader.
Disk on the other hand will require a more expensive server with more drive bays and also requires additional disks for parity data.
In our case I found that it will break even at 150 TB of native storage. If you have more data in backup storage than that, then tape is cheaper.
How many tapes in the library?
How many briefcases to take off-premises for rotations?
Where is the brain trust to manage the tapes, their backup windows, and whether the correct tape set is in the drives?
If the tape libraries are elsewhere then the above goes away to some degree (distance comes into play).