Backups in the DevOps World



  • Traditionally, we've thought of disaster recovery as taking an image, or a near image, of the complete filesystem of the server we need to restore and putting it back, bit by bit, onto a new system. This could be done the very "old fashioned" way: install a stock operating system, then restore the "differences" from the backup system - often quite a lot, since every patch and change made since the system was first built is likely to be in the backup. This is generally a "file based" restore. The more recent approach is to take a full system image of a virtual machine (or, less commonly, of a physical server) and restore the entire server - a "block based" restore.

    In both cases we have two major underlying issues. One is that a restore is not part of normal operations, so it must be intentionally scheduled for testing to have any hope of ever having been tested - nothing in day-to-day operations will exercise restores by default. The second is that the amount of data to be restored is very large, so getting files back into place takes time.

    In the DevOps world, with software defined infrastructure, backups have an opportunity to be treated very differently. Building a new server for a task and restoring an existing one become two ways of describing the same operation. Building a server becomes a normal operational task that can easily happen all of the time. Suddenly the base system, including the entire operating system, applications, patches and so forth, becomes moot because it is automatically created, as needed, via automation.

    This, in fact, creates two classes of servers: stateless and stateful. Normally a database or file server would be stateful - there is data on it that needs to be protected. Application servers would be stateless - nothing stored on them is needed.

    Restoring a stateless system requires no "restore" whatsoever - just a capacity expansion. A stateless system can be recreated, from scratch, via the build automation system. This might require manual intervention, or a capacity detection system could automatically decide that capacity is too low and "restore" it in that manner. The idea of backups for stateless systems can go away completely.
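    The idea that a stateless "restore" is just a capacity expansion can be sketched in a few lines. Everything below is hypothetical - the function names and hostname scheme are made up for illustration; in a real environment each new hostname would be handed off to the build automation (Ansible, Terraform, etc.) rather than returned as a string:

```python
# Minimal sketch: "restoring" a stateless system is a capacity expansion.
# All names here are illustrative, not any real tool's API.

def servers_to_build(healthy_count: int, desired_count: int) -> int:
    """How many fresh servers the build automation should create."""
    return max(desired_count - healthy_count, 0)

def restore_capacity(healthy_count: int, desired_count: int) -> list[str]:
    """'Restore' by building new stateless nodes from scratch.

    In a real environment each hostname would trigger a build-automation
    run; here we just return the names that would be created.
    """
    needed = servers_to_build(healthy_count, desired_count)
    return [f"app-{n:02d}" for n in range(healthy_count, healthy_count + needed)]
```

    A capacity detection system would call something like `restore_capacity(3, 5)` when only three of five desired nodes are healthy, and no backup media would ever be touched.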

    Restoring a stateful system does require a backup of the data, but the majority of the system is still stateless. The OS and applications can still be built in an automated way, and then only the data stored there - the data in a database or the files on a file server, for example - needs to be pulled from a backup system.
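    As a rough sketch of this split, the restore plan for any server is the same automated build, with one extra data step only when the server is stateful. The step names below are purely illustrative:

```python
# Hypothetical sketch: a stateful restore is the normal automated build
# plus one extra, data-only restore step. Step names are illustrative.

def plan_restore(role: str, stateful: bool) -> list[str]:
    """Ordered steps to bring a server of the given role back."""
    steps = [
        f"provision base OS for {role}",    # stateless: rebuilt, not restored
        f"apply configuration for {role}",  # stateless: comes from automation
    ]
    if stateful:
        # Only the volatile data (database, file shares) comes from backup.
        steps.append(f"restore latest data backup for {role}")
    return steps
```

    For a web node the plan is pure rebuild; for a database node only the final step touches the backup system.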

    Consider how small a typical database file is, especially when compressed, compared to the size of the full OS and applications. In the SMB we'd typically see data sizes much smaller than the OS, though not always, of course. A backup that, done the traditional way, might require 40GB or more for a small system might be only a few GB for the database and zero for the application(s) that use it. An extremely busy web application made up of application servers, database servers, proxies, load balancers and more, which today might easily take several hundred GB to image, might require only 800MB for a full backup in a DevOps model.

    DevOps thinking changes backups more than most aspects of systems planning. Look around at your own environment. What if all of your systems could be restored to a running state at the press of a button, and only their volatile data needed to go through a restoration process? How small might your backups be? How much faster, how much less costly and how much more reliable might they be?



  • Definitely an interesting concept.

    Especially in the license-free world of Linux.

    Much easier to separate data and applications into multiple servers when you can just stand up a new server when you want it.



  • @BRRABill said in Backups in the DevOps World:

    Much easier to separate data and applications into multiple servers when you can just stand up a new server when you want it.

    None, really. It's good practice to separate workloads, but not separating them doesn't cause storage to sprinkle throughout the OS in a way it does not when the workloads are separate. Dividing workloads by VM has no direct impact on DevOps backups.

    So this applies absolutely equally regardless of OS or licensing.



  • @scottalanmiller said

    So this applies absolutely equally regardless of OS or licensing.

    I don't agree with that.

    If that was the case, you'd probably never see a DC with anything else on it.



  • @BRRABill said in Backups in the DevOps World:

    @scottalanmiller said

    So this applies absolutely equally regardless of OS or licensing.

    I don't agree with that.

    I must be missing something. I can think of no factor that would apply.



  • @BRRABill said in Backups in the DevOps World:

    If that was the case, you'd probably never see a DC with anything else on it.

    I'm unclear what this means.



  • @scottalanmiller said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    If that was the case, you'd probably never see a DC with anything else on it.

    I'm unclear what this means.

    My point is that we often see multiple things on a Windows DC. Best case is to have it by itself. Much easier to restore. AKA your point here. I'd venture to say a lot of this is a license issue.

    No?



  • @BRRABill said in Backups in the DevOps World:

    My point is that we often see multiple things on a Windows DC. Best case is to have it by itself. Much easier to restore. AKA your point here. I'd venture to say a lot of this is a license issue.

    No?

    But my point was that seeing multiple things on one machine isn't a factor in the point at hand, and that having them separated doesn't change the discussion. You gave the example of a DC to counter that, but I don't understand what aspect of a DC you feel changes how the storage and system files are separated.

    And given that I can't find any reason why systems being together in one VM image makes a difference, that in turn means that the licensing makes no difference at all. So unless I can understand the former, I have no idea why the latter comes into play.



  • @scottalanmiller said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    My point is that we often see multiple things on a Windows DC. Best case is to have it by itself. Much easier to restore. AKA your point here. I'd venture to say a lot of this is a license issue.

    No?

    But my point was that seeing multiple things on one machine isn't a factor in the point at hand, and that having them separated doesn't change the discussion. You gave the example of a DC to counter that, but I don't understand what aspect of a DC you feel changes how the storage and system files are separated.

    And given that I can't find any reason why systems being together in one VM image makes a difference, that in turn means that the licensing makes no difference at all. So unless I can understand the former, I have no idea why the latter comes into play.

    I am arguing that not having to worry about licensing makes it easier to follow best practice in separating data and application loads when needed.

    Want to blow away your malfunctioning Unifi controller and make a new one? Backup the config, make a new VM, reinstall, import config. Done. Not so easy with all sorts of stuff on a server.

    The DC example is that many times people back up the whole thing because there are applications on there, and DHCP, and data, and everything. How many times have we seen this on ML?

    Best case is just having a DC. If it goes haywire? Don't restore. Set up a new one. Isn't that best practice?



  • @BRRABill said in Backups in the DevOps World:

    I am arguing that not having to worry about licensing makes it easier to follow best practice in separating data and application loads when needed.

    Right, and I pointed out that this could not be the case. Does Windows licensing encourage bad practices as regards separating workloads? Yes. Does that have anything to do with the situation being discussed? No.

    Why you are mentioning it here is where I am confused.



  • @BRRABill said in Backups in the DevOps World:

    Want to blow away your malfunctioning Unifi controller and make a new one? Backup the config, make a new VM, reinstall, import config. Done. Not so easy with all sorts of stuff on a server.

    Now you are talking about the granularity of the restore, not the separation of the system and the data. That's a great point and very valid, but it's not what we are discussing and doesn't change the volume or type of backups.

    The one thing it would do, heavily, is destroy the idea of image based or "agentless" backups and make file backups and DevOps style backups far, far more important, since they can restore "by service" rather than "by system."



  • @BRRABill said in Backups in the DevOps World:

    The DC example is many times people back up the whole thing because there are applications on there, and DHCP, and data, and everything. How many times have we seen this on ML?

    I'm missing the point. Lots of people don't use DevOps style backups today, of course. But they could, and that's the point.



  • @BRRABill said in Backups in the DevOps World:

    Best case is just having a DC. If it goes haywire? Don't restore. Set up a new one. Isn't that best practice?

    No, not a best practice. It's a good practice under certain conditions - conditions under which you would not be restoring from backup, because the system is not down. You go to backups when the system is down. Your way only works when the cluster is degraded but still functional.



  • DevOps style backups matter quite a bit because, as we move to a world that wants offsite backups more and more, the difference between trying to back up - or, more importantly, restore - 1TB of data versus 10GB is huge. Not just in time, but in cost. Storing 10GB on Amazon S3 is trivial; a TB is far worse. And needing to download large traditional images means huge delays that might easily make restoring systems impractical, when pulling down a small, compressed database file might take only a few minutes.
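    A quick back-of-the-envelope calculation shows how much the backup size drives offsite restore time. The link speed here is an arbitrary assumption for illustration, and real-world throughput will vary:

```python
# Rough transfer-time math for offsite restores. 1 GB is treated as
# 8,000 megabits (decimal units); actual throughput will vary.

def transfer_minutes(size_gb: float, link_mbps: float) -> float:
    """Minutes to move size_gb of data over a link_mbps connection."""
    megabits = size_gb * 8_000
    return megabits / link_mbps / 60

# On an assumed 100 Mbps link, a 10GB DevOps-style backup moves in
# under 15 minutes, while a 1TB traditional image takes nearly a day.
```

    The time (and, with per-GB transfer pricing, the cost) scales linearly with size, which is why shrinking the backup from an image to just the volatile data changes what is practical to restore offsite.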



  • @BRRABill said in Backups in the DevOps World:

    @scottalanmiller said

    So this applies absolutely equally regardless of OS or licensing.

    I don't agree with that.

    If that was the case, you'd probably never see a DC with anything else on it.

    You should literally never see a DC with anything else on it... that goes against best practices and Microsoft recommendations. If you're running Microsoft you are already buying into the costs. You know that it is going to cost money to run it well and correctly.



  • @coliver said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    @scottalanmiller said

    So this applies absolutely equally regardless of OS or licensing.

    I don't agree with that.

    If that was the case, you'd probably never see a DC with anything else on it.

    You should literally never see a DC with anything else on it... that goes against best practices and Microsoft recommendations. If you're running Microsoft you are already buying into the costs. You know that it is going to cost money to run it well and correctly.

    That was my point.

    I spoke to @scottalanmiller offline about this yesterday. I think we were just arguing the wrong point.

    My point was that it's much easier to just stand up a VM with individual stuff on it. It makes it easier to get it back up and running in the case of issues. His point was that has nothing to do with the data backup.

    So, I think we were both right.

    I agree on the DC, but how many times (I myself am guilty of this) do we see a DC with a bunch of other stuff on it? All the time. If Windows Server was free, that probably would be less of the case.



  • @BRRABill said in Backups in the DevOps World:

    @coliver said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    @scottalanmiller said

    So this applies absolutely equally regardless of OS or licensing.

    I don't agree with that.

    If that was the case, you'd probably never see a DC with anything else on it.

    You should literally never see a DC with anything else on it... that goes against best practices and Microsoft recommendations. If you're running Microsoft you are already buying into the costs. You know that it is going to cost money to run it well and correctly.

    That was my point.

    I spoke to @scottalanmiller offline about this yesterday. I think we were just arguing the wrong point.

    My point was that it's much easier to just stand up a VM with individual stuff on it. It makes it easier to get it back up and running in the case of issues. His point was that has nothing to do with the data backup.

    So, I think we were both right.

    I agree on the DC, but how many times (I myself am guilty of this) do we see a DC with a bunch of other stuff on it? All the time. If Windows Server was free, that probably would be less of the case.

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.



  • @coliver said

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.

    If you are going in that direction, there is probably FOSS for everything Windows provides in most circumstances, no?



  • @coliver said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    @scottalanmiller said

    So this applies absolutely equally regardless of OS or licensing.

    I don't agree with that.

    If that was the case, you'd probably never see a DC with anything else on it.

    You should literally never see a DC with anything else on it... that goes against best practices and Microsoft recommendations. If you're running Microsoft you are already buying into the costs. You know that it is going to cost money to run it well and correctly.

    That's a very important thing to remember. Windows is an awesome product, but it is a cost premium and if you are considering it, then licensing should be factored into the consideration. If you can't afford to run it, you shouldn't run it, it's that easy.



  • @BRRABill said in Backups in the DevOps World:

    @coliver said

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.

    If you are going in that direction, there is probably FOSS for everything Windows provides in most circumstances, no?

    Yes, and that brings up the point of the other topic: what value does MS actually bring to the table, other than making it easier for IT Pros to mess up best practices. 🙂



  • @BRRABill said in Backups in the DevOps World:

    @coliver said

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.

    If you are going in that direction, there is probably FOSS for everything Windows provides in most circumstances, no?

    Correct. AD, DNS, DHCP, relational databases, email, you name it. The only thing generally lacking is support for specific applications that people want to use and that will only run on Windows.



  • @coliver said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    @coliver said

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.

    If you are going in that direction, there is probably FOSS for everything Windows provides in most circumstances, no?

    Yes, and that brings up the point of the other topic: what value does MS actually bring to the table, other than making it easier for IT Pros to mess up best practices. 🙂

    I mean in theory most SMBs, who are now probably heavily MS shops, could start over, and basically accomplish the same things with FOSS. Unfortunately (IMO) the knowledge is just not there.



  • @BRRABill said in Backups in the DevOps World:

    @coliver said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    @coliver said

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.

    If you are going in that direction, there is probably FOSS for everything Windows provides in most circumstances, no?

    Yes, and that brings up the point of the other topic: what value does MS actually bring to the table, other than making it easier for IT Pros to mess up best practices. 🙂

    I mean in theory most SMBs, who are now probably heavily MS shops, could start over, and basically accomplish the same things with FOSS. Unfortunately (IMO) the knowledge is just not there.

    Which is why most SMBs probably shouldn't have IT people onsite or really any onsite infrastructure. But that's another topic entirely.



  • @BRRABill said in Backups in the DevOps World:

    @coliver said in Backups in the DevOps World:

    @BRRABill said in Backups in the DevOps World:

    @coliver said

    That's true, but we already have a FOSS option for a majority of the tools Windows provides. If people were going to follow best practices then the licensing part would never come into it.

    If you are going in that direction, there is probably FOSS for everything Windows provides in most circumstances, no?

    Yes, and that brings up the point of the other topic: what value does MS actually bring to the table, other than making it easier for IT Pros to mess up best practices. 🙂

    I mean in theory most SMBs, who are now probably heavily MS shops, could start over, and basically accomplish the same things with FOSS. Unfortunately (IMO) the knowledge is just not there.

    The knowledge is there if they went to better support models 🙂 Lacking the necessary skills is another mistake that lots of SMBs make. It's layer upon layer of mistakes (some business, some tech) that lead to these situations.

    That's not to say that Windows is never right, it's an excellent product with good use cases. But it's overused to an astounding degree because 99% of the time, no one is evaluating needs at all. And that means there is no IT being done, only "buying".



  • mmm... I think DevOps is not really about backing up a stateless or stateful OS. I mean: any OS is stateless, and apps can run with a bunch of config files set up. Having a proper data backup solution, decoupled from the configs and the OS, makes things simple.

    This is what I'm used to doing:

    1- OS (okay, the entire VM): back it up now and then, like once a week, just in case something destroys the functionality
    2- app config: track changes however you want, from a txt file to a proper DevOps playbook (something, admittedly, I don't do)
    3- data: back up as fast as possible. <- THIS IS ACTUALLY THE STATEFUL PART.

    I'm really a fan of application level backup. I mean that if you have a DB, let the DB back up its own data; do not do cool differential hourly VM image backups with huge retention windows. No, never - I dislike this. Do data backups instead: they are simply more efficient.

    In case of disaster recovery, what I try to do is:
    1- restart from the latest OS point,
    2- if nothing in the config has changed since the last OS backup (track your config changes), fine; otherwise, patch the config,
    3- then restore the latest dataset - no differential stuff, simply erase the data and restore it.

    What I see in DevOps is that you should script points 1 and 2 without using the latest OS point: create a plain VM image and do all the config programmatically, so that you do not need to restore anything in the VM/app setup - simply create the VM from scratch every time, like you would with Vagrant!

    Then you still need to add the "stateful" bit if you have it; that is, restore the dataset via the app's native procedures.

    Unfortunately it seems that this solution (app level backup) is not perceived as a good practice, as everyone wants huge differential VM backups at the block level. I think that's because they are easier to manage in a hurry: just apply some automatic procedure to restore the blocks as one huge "file" and don't worry about the internals.
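    The "let the DB back up its own data" approach can be sketched as building the database's native dump and restore commands instead of imaging the VM. The `pg_dump`/`pg_restore` flags used here (`-Fc` for the compressed custom format, `--clean --if-exists` to drop objects before restoring) are real PostgreSQL options, but the paths and database name are made up for illustration:

```python
# Sketch of application-level backup for PostgreSQL: build the native
# dump/restore command lines rather than taking block-level VM images.
# Paths and database names are hypothetical.

from datetime import date

def pg_dump_command(dbname: str, out_dir: str = "/var/backups") -> list[str]:
    """pg_dump in custom format (-Fc), which is compressed by default."""
    outfile = f"{out_dir}/{dbname}-{date.today().isoformat()}.dump"
    return ["pg_dump", "-Fc", "-f", outfile, dbname]

def pg_restore_command(dump_file: str, dbname: str) -> list[str]:
    """Erase-and-restore, matching 'simply erase the data and restore it'."""
    return ["pg_restore", "--clean", "--if-exists", "-d", dbname, dump_file]
```

    These lists could be passed straight to `subprocess.run`; the point is that the artifact to keep (and ship offsite) is the small dump file, not the whole VM image.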



  • @matteo-nunziati said in Backups in the DevOps World:

    mmm... I think DevOps is not really about backing up a stateless or stateful OS. I mean: any OS is stateless, and apps can run with a bunch of config files set up. Having a proper data backup solution, decoupled from the configs and the OS, makes things simple.

    Of course, but didn't you just define DevOps? 😉



  • @matteo-nunziati said in Backups in the DevOps World:

    I'm really a fan of application level backup. I mean that if you have a DB, let the DB back up its own data; do not do cool differential hourly VM image backups with huge retention windows. No, never - I dislike this. Do data backups instead: they are simply more efficient.

    Same here. No one knows how to get data out of your database like the database 🙂



  • @matteo-nunziati said in Backups in the DevOps World:

    Unfortunately it seems that this solution (app level backup) is not perceived as a good practice....

    It's the only thing I've ever really seen in the enterprise. The idea of anything else is much more an SMB thing - at least in my experience.



  • @scottalanmiller What I wanted to say is that DevOps (as a technique; never mind the philosophy) is mostly about scripting everything from the ground up.

    Stateless/stateful is mostly a mindset in backup: restoring a VM is always stateless; the stateful bit is whether you have to restore datasets into it, which is, in my opinion, a second and decoupled step.

    Even security patching is mostly a nightmare on Windows but just an apt/yum away on Linux.

    Very often people call VMs/containers stateless when they can simply respawn in a cloud environment when they die, but provisioning (e.g. setting up the right config files) is always there, even in a Docker config file.

    Let me state this differently. My way of "old" disaster recovery:
    1- create a new VM
    2- install the OS on it
    3- set up all the required config/services

    Optionally, for stateful servers/VMs/containers:
    4- restore the latest data snapshot

    In DevOps, steps 1 to 3 are still there, but you use a deploy/provision script, be it Vagrant+Ansible or something more complex.

    Maybe step 4 can even be scripted here too. But DevOps is this: just script everything.

    The added value is that you can re-run the script whenever you want, and if a local base is available (say, a local ISO, or a local cache for apt/yum), things can be really fast. This lowers the barrier to testing disaster scenarios.

    But a lot of infrastructure must be there, and discipline/quality assurance - both of which an SMB very often lacks.

    The same can be achieved on a physical machine if you keep images with tools like Clonezilla, etc.



  • It's true that you can build stateless systems without DevOps tooling and approaches, but the nature and assumptions of those systems work against it. Just allowing arbitrary logins (even by administrators) can undermine statelessness. One of the beauties of the pure DevOps model is the lack of logins - much like functional programming.