So HA it is



  • After a discussion with my boss, a few things have been decided. We need some level of HA, likely within the same building, specifically to protect against scenarios where a single host goes up in flames.

    So a single Hypervisor Host is off the table.

    Next is the choice of storage: consumer or enterprise grade SSDs. Our weekly delta for the shares is much lower than the daily write rating of the consumer grade SSDs (20GB per day), so we have to ask: "Are enterprise drives really worth over double the cost of each SSD?"

    For the added cost of the enterprise SSDs we could easily keep 2 full sets of backup drives at the ready, and then some.

    The next choice that must be made is which hypervisor we are going to use: XenServer, Hyper-V, or lastly ESXi. Deciding this really narrows our backup choices. Without this decision we would have two servers and no backup system in place, and we might as well not have the servers. So this is a critical choice.

    Besides the above choices, I've been asked to look back into Online Storage options to remove the need to take tapes or disks home weekly.

    Specifically to try and find some pricing for Windstream, Amazon and BackBlaze as the top 3 contenders.

    Even though we couldn't possibly push a full month's backup (~24TB [this would comprise 4 weeks of full backups]) offsite, it might be viable for the incremental backups, which is what I now need to look into. Our weekly delta is low enough that we need to weigh taking tapes / disks home weekly against the cost to restore from an online storage provider.

    Thank you all for helping me get to this point in this project. It's been a true help.
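For the consumer versus enterprise SSD question, one rough way to frame it is to convert the weekly delta into an average daily write volume and compare it with the 20GB per day write figure quoted for the consumer drives. The sketch below does only that arithmetic. The 15GB weekly delta is a made-up placeholder, the function names are hypothetical, and it ignores write amplification and RAID overhead, so treat the result as a starting point rather than a sizing answer.

```python
def daily_writes_gb(weekly_delta_gb: float) -> float:
    """Average daily write volume implied by a weekly delta."""
    return weekly_delta_gb / 7.0


def endurance_headroom(weekly_delta_gb: float, rated_gb_per_day: float = 20.0) -> float:
    """How many times the rated daily write figure exceeds the average daily writes.

    rated_gb_per_day defaults to the 20GB per day figure from the thread.
    A result above 1.0 means the workload writes less than the rating.
    """
    if weekly_delta_gb <= 0:
        raise ValueError("weekly_delta_gb must be positive")
    return rated_gb_per_day / daily_writes_gb(weekly_delta_gb)


if __name__ == "__main__":
    weekly_delta_gb = 15.0  # hypothetical placeholder, not a measured value
    print(f"average daily writes: {daily_writes_gb(weekly_delta_gb):.2f} GB")
    print(f"headroom vs rating:   {endurance_headroom(weekly_delta_gb):.1f}x")
```

If the headroom stays comfortably above 1 once real numbers are plugged in, the endurance argument for the enterprise drives gets weaker, though controller support and warranty still matter.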



  • Instead of looking at cloud storage, have you looked at colocation space? I think I saw your other thread recently. That may be less expensive and you can put a massive storage system there. Set it up in RAID 6 and just have it as a backup repository. If something happens you could potentially drive to the colo site to get your backups.



  • I haven't looked into any providers, but have considered it. I doubt that we're willing to invest in another server, plus racking cost. I can look into it; do you have any recommendations?


  • We have some colos at Time Warner Telecom, all with Level 3 backbone connections. I'm pretty happy with them.



  • @DustinB3403 said:

    After a discussion with my boss, a few things have been decided. We need some level of HA, likely within the same building, specifically to protect against scenarios where a single host goes up in flames.

    Was this a discussion involving numbers, weighing how large the risk is against the cost and value of mitigation? Or is it all emotionally driven? SMBs typically run on emotion, and in many cases that is what keeps them from becoming enterprises. HA should never be a decision from IT; IT isn't the department with the skills or insight to know when HA is appropriate.



  • @DustinB3403 said:

    So a single Hypervisor Host is off the table.

    Most important rule of HA: HA is something that you do, not something that you buy.



  • With Hyper-V (and probably XenServer) you could use replication to the colocated server and then also back up locally at the colocation site if desired.



  • I specifically spoke to the reasons why we likely don't need HA. But my boss made the decision, from emotion.



  • Key steps to HA involve redundant generators, a good fuel supply plan, highly available and very intensive HVAC solutions, that kind of stuff. Your plant is tenfold as important as your gear.



  • @DustinB3403 said:

    I haven't looked into any providers, but have considered it. I doubt that we're willing to invest in another server, plus racking cost. I can look into it; do you have any recommendations?

    No, sorry, we don't have many colocation facilities around here (without driving 3-5 hours). This is something that would be local to you.

    I think the long term cost of a server and rack space would be less than storing 24TB in the cloud (I'm not saying you will do that). You will need to do some cost analysis to see which is the better idea.

    You also need to remember the time it takes to retrieve data from the internet. "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."
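To put rough numbers on the retrieval-time point, here is a minimal sketch that estimates how long a restore of a given size takes over a link, and what effective bandwidth a drive to the colo works out to. The 100 Mbps link, the 80% efficiency factor, and the 2 hour drive are assumptions for illustration only, since the actual bandwidth is not stated in the thread; sizes use decimal terabytes.

```python
def transfer_hours(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Hours to move size_tb terabytes (10**12 bytes each) over a link.

    efficiency is an assumed fraction of the nominal link rate that is
    actually usable for the transfer.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


def station_wagon_mbps(size_tb: float, drive_hours: float) -> float:
    """Effective bandwidth, in Mbps, of physically carrying size_tb for drive_hours."""
    if drive_hours <= 0:
        raise ValueError("drive_hours must be > 0")
    bits = size_tb * 1e12 * 8
    return bits / (drive_hours * 3600) / 1e6


if __name__ == "__main__":
    size_tb = 24.0       # the ~24TB month of full backups from the thread
    link_mbps = 100.0    # hypothetical link speed
    drive_hours = 2.0    # hypothetical drive time to a colo
    days = transfer_hours(size_tb, link_mbps) / 24
    print(f"download over {link_mbps:.0f} Mbps: about {days:.1f} days")
    print(f"driving the disks: about {station_wagon_mbps(size_tb, drive_hours):,.0f} Mbps effective")
```

Swapping in the real link speed and the size of a typical restore (rather than the full 24TB) gives a fairer comparison for day-to-day recoveries.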



  • @coliver Yes, thank you for the reminder.


  • @coliver said:

    You also need to remember the time it takes to retrieve data from the internet. "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."

    If it was really important you could get ethernet to the colo and have your internet go out of the colo instead.



  • What's your bandwidth? Moving your entire infrastructure to an enterprise colocation site would probably be less expensive than building out a new server room with HVAC, generators, etc. @scottalanmiller beat me to it.



  • @coliver said:

    What's your bandwidth? Moving your entire infrastructure to an enterprise colocation site would probably be less expensive than building out a new server room with HVAC, generators, etc. @scottalanmiller beat me to it.

    Scott may have said something, but your suggestion puts it in black and white. If your boss isn't willing to put this in a DC, it's probably not worth spending the money on HA.



  • I like the potential to not use a backup to get it offsite. I have not done the replication myself, but know another group that has. Very little data replicates with each change.

    They brought in the server locally, seeded the initial replicas, moved it to the colocation facility, and then let it catch back up.



  • @JaredBusch said:

    I like the potential to not use a backup to get it offsite. I have not done the replication myself, but know another group that has. Very little data replicates with each change.

    They brought in the server locally, seeded the initial replicas, moved it to the colocation facility, and then let it catch back up.

    I've never done replication over a high latency wire. Not sure how it would work.



  • @DustinB3403 said:

    Next is the choice of storage: consumer or enterprise grade SSDs. Our weekly delta for the shares is much lower than the daily write rating of the consumer grade SSDs (20GB per day), so we have to ask: "Are enterprise drives really worth over double the cost of each SSD?"

    Depends if they are part of the server support package or not, and if they will work with your controller or not. Often enterprise SSDs have special firmware to go with your hardware controller. The decision is holistic, not separated out to just consumer vs. enterprise.


  • @JaredBusch said:

    I like the potential to not use a backup to get it offsite. I have not done the replication myself, but know another group that has. Very little data replicates with each change.

    They brought in the server locally, seeded the initial replicas, moved it to the colocation facility, and then let it catch back up.

    We replicate ours every 5-10min depending on the server, so we see little traffic from this, most of the time.



  • @Dashrender said:

    Scott may have said something, but your suggestion puts it in black and white. If your boss isn't willing to put this in a DC, it's probably not worth spending the money on HA.

    Or another way.... if your boss refuses to do HA, you can't do HA even if he requests HA 😉



  • @coliver said:

    @JaredBusch said:

    I like the potential to not use a backup to get it offsite. I have not done the replication myself, but know another group that has. Very little data replicates with each change.

    They brought in the server locally, seeded the initial replicas, moved it to the colocation facility, and then let it catch back up.

    I've never done replication over a high latency wire. Not sure how it would work.

    Latency does not affect replication. That is the benefit of it.



  • @JaredBusch said:

    @coliver said:

    @JaredBusch said:

    I like the potential to not use a backup to get it offsite. I have not done the replication myself, but know another group that has. Very little data replicates with each change.

    They brought in the server locally, seeded the initial replicas, moved it to the colocation facility, and then let it catch back up.

    I've never done replication over a high latency wire. Not sure how it would work.

    Latency does not affect replication. That is the benefit of it.

    Good to know.



  • @coliver said:

    I've never done replication over a high latency wire. Not sure how it would work.

    If it is async, latency hardly affects it at all, as long as you are replicating in the "minutes" range and not in the "seconds" range. A high latency wire is normally no more than 300ms, and 2,000ms tops.

    Full Sync is super latency sensitive because every write has to be confirmed before anything continues.
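The difference between the two modes can be sketched with a toy model. It assumes writes are issued one after another, that a synchronous write costs one local write plus one full round trip, and that transfer time is negligible; none of that comes from a specific product. The 300ms round trip and the 5 minute interval are figures mentioned in the thread, while the 1ms local write and the 1,000 writes are made-up inputs, and the function names are hypothetical.

```python
def sync_total_ms(n_writes: int, local_write_ms: float, rtt_ms: float) -> float:
    """Synchronous replication: every write waits for the remote confirmation."""
    return n_writes * (local_write_ms + rtt_ms)


def async_total_ms(n_writes: int, local_write_ms: float) -> float:
    """Asynchronous replication: writes complete locally, shipping happens later."""
    return n_writes * local_write_ms


def async_worst_case_staleness_s(interval_s: float, rtt_ms: float) -> float:
    """Rough upper bound on how far behind the replica is, ignoring transfer time."""
    return interval_s + rtt_ms / 1000.0


if __name__ == "__main__":
    n_writes = 1000        # hypothetical burst of writes
    local_write_ms = 1.0   # hypothetical local write time
    rtt_ms = 300.0         # "high latency wire" figure from the thread
    print(f"sync:  {sync_total_ms(n_writes, local_write_ms, rtt_ms) / 1000:.1f} s")
    print(f"async: {async_total_ms(n_writes, local_write_ms) / 1000:.1f} s")
    print(f"replica at most ~{async_worst_case_staleness_s(300, rtt_ms):.1f} s behind")
```

In this model the round trip is paid once per write in sync mode, but only once per replication interval in async mode, which is why a few hundred milliseconds barely registers when replicating every few minutes.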



  • @coliver said:

    Good to know.

    It is basically just like transaction logging in SQL Server. It writes the changes to a log file and then ships the log file. There is not a concern for latency. Obviously, you still need enough bandwidth for these changes, or you will always be getting farther behind. But because it is replicating, there is never a problem like a new full backup.
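The "always getting farther behind" condition can be shown with a small simulation. This is a simplified model, not how any particular replication product works: each interval generates a fixed amount of changed data and the link can ship a fixed amount, and whatever is left over carries into the next interval. All numbers and names are hypothetical. The optional starting backlog stands in for the catch-up period after a locally seeded replica is moved to the colo.

```python
from typing import List


def simulate_backlog(change_mb: float, capacity_mb: float, intervals: int,
                     initial_backlog_mb: float = 0.0) -> List[float]:
    """Return the unshipped backlog (MB) at the end of each replication interval.

    change_mb:   data changed per interval
    capacity_mb: data the link can ship per interval
    """
    backlog = float(initial_backlog_mb)
    history = []
    for _ in range(intervals):
        # New changes are added, then the link drains as much as it can.
        backlog = max(0.0, backlog + change_mb - capacity_mb)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Link slower than the change rate: the backlog grows every interval.
    print(simulate_backlog(change_mb=150, capacity_mb=100, intervals=5))
    # Link faster than the change rate: the replica keeps up.
    print(simulate_backlog(change_mb=100, capacity_mb=150, intervals=3))
    # Seeded replica moved offsite with a backlog: it catches back up.
    print(simulate_backlog(change_mb=100, capacity_mb=200, intervals=4,
                           initial_backlog_mb=300))
```

The check that matters is simply whether the link capacity per interval exceeds the change rate per interval, with some margin left over for catching up after an outage.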



  • Another point I made is that if we really need HA between the hosts, we could simply expand our existing XenServer (which also answers several of the above questions) to support these future virtual servers and configure a single new Dell R720xd for failover between the two.

    This idea was declined with "I'd rather leave that server for development VMs."

    So there are still some critical things that need to be thought out.



  • @DustinB3403 said:

    The next choice that must be made is which hypervisor we are going to use: XenServer, Hyper-V, or lastly ESXi. Deciding this really narrows our backup choices.

    KVM would come in long before ESXi. ESXi would be like installing OpenVMS today. Just makes no sense on a new install. Costly and without benefits. Your budget doesn't allow it anyway.



  • Dustin, what does your company do?



  • @DustinB3403 said:

    Specifically to try and find some pricing for Windstream, Amazon and BackBlaze as the top 3 contenders.

    Windstream? Seriously? Why not just set the data on fire? That's not a business class company. They are infamous scammers and can't support their own links. Never do business with them, ever. They are so bad that they had to change their name to hide their bad reputation. As they are based around the corner from you, I'm shocked that anyone there would even allow their name to come up.


  • @DustinB3403 said:

    Another point I made is that if we really need HA between the hosts, we could simply expand our existing XenServer (which also answers several of the above questions) to support these future virtual servers and configure a single new Dell R720xd for failover between the two.

    This idea was declined with "I'd rather leave that server for development VMs."

    So there are still some critical things that need to be thought out.

    Ask him why he wants a newer server for development stuff. You usually use your old crap for your labs.



  • @DustinB3403 said:

    Even though we couldn't possibly push a full month's backup (~24TB [this would comprise 4 weeks of full backups]) offsite, it might be viable for the incremental backups, which is what I now need to look into. Our weekly delta is low enough that we need to weigh taking tapes / disks home weekly against the cost to restore from an online storage provider.

    Most backup products support a direct connection to cloud hosted storage, so you do a one-time full and then it can do incrementals forever, and the traffic and total storage are not that outrageous. But you need to work that out through the backup product and not as a separate decision.
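As a rough illustration of why one full plus incrementals is so much lighter than shipping weekly fulls, the sketch below compares the two storage totals. The 6TB full is derived from the ~24TB for 4 weeks of fulls mentioned earlier, assuming equal-sized fulls; the 50GB weekly delta is a made-up placeholder, and the model ignores compression, deduplication, and retention pruning. Function names are hypothetical.

```python
def weekly_fulls_tb(full_tb: float, weeks: int) -> float:
    """Total storage when a complete full backup is kept for every week."""
    return full_tb * weeks


def incremental_forever_tb(full_tb: float, weekly_delta_tb: float, weeks: int) -> float:
    """Total storage for one initial full plus one incremental per week."""
    return full_tb + weekly_delta_tb * weeks


if __name__ == "__main__":
    full_tb = 24.0 / 4       # ~24TB over 4 weekly fulls, assuming equal sizes
    weekly_delta_tb = 0.05   # hypothetical 50GB weekly delta
    weeks = 4
    print(f"weekly fulls:        {weekly_fulls_tb(full_tb, weeks):.1f} TB")
    print(f"incremental forever: {incremental_forever_tb(full_tb, weekly_delta_tb, weeks):.1f} TB")
```

After the one-time full, the ongoing upload in this model is just the weekly delta, which is the number to compare against the available bandwidth.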



  • One of the biggest things here is that this decision, the plan, needs to be holistic. Which drives to use, which server(s) to buy, where to put them, having HA, the backup strategy, the fault tolerance strategy.... all of it is a single plan. It can't be pieced out as a bunch of separate pieces and then put together.

