KVM and Back Ups
-
@DustinB3403 said in KVM and Back Ups:
To answer this, what I did for my lab is set up UrBackup and just installed the agent in each of my VMs.
It works and is simple enough to restore from.
Right, that's often the better option. It's safer than hypervisor level backups, and if set up well, is just as fast and more flexible.
-
@DustinB3403 said in KVM and Back Ups:
To answer this, what I did for my lab is set up UrBackup and just installed the agent in each of my VMs.
It works and is simple enough to restore from.
I have to check that out tonight in my lab
-
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
For example, I had a developer fubar a server the other day. Completely unrecoverable. It was hosted at Vultr, and I used their backup service. I was able to completely restore the server from their snapshot backup. That’s what I am after.
That's not crash consistent. So THAT level of backup KVM can do without anything special, it's just taking a snapshot of the storage. You have that with any system because it is done at the storage layer.
What tools can I use to do that (scheduled) with KVM on Fedora?
If you want the Vultr style (or Proxmox risky style), you can do that right from the storage layer. So first determine the storage that you are going to use: ZFS, Btrfs, XFS, LVM, etc. Then you use the native tools (if you want) to snap it. Everything except the scheduling is just built in.
What is the latest recommendation for storage now? LVM?
LVM, ZFS, and Btrfs are all fine. I've not used this, but here is a script to do LVM backups...
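As a rough illustration of snapping at the storage layer with the native tools, here is a minimal LVM sketch. It is not the script referred to above. The volume group `vg0`, the logical volume `vm-disk`, the 5G snapshot size and the `/mnt/backups` destination are all hypothetical placeholders, and it defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/bash
# Sketch: snapshot an LVM volume, copy it off, then drop the snapshot.
# vg0, vm-disk, the 5G size and /mnt/backups are hypothetical placeholders.
# DRY_RUN=1 (the default) only prints the commands instead of running them.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

lvm_snapshot_backup() {
    local vg="$1" lv="$2" dest="$3" size="${4:-5G}"
    local snap="${lv}-snap"

    # Create a copy-on-write snapshot of the logical volume
    run lvcreate --snapshot --size "$size" --name "$snap" "/dev/$vg/$lv" || return 1

    # Copy the frozen snapshot contents to the backup location
    run dd if="/dev/$vg/$snap" of="$dest/$lv.img" bs=4M conv=sparse
    local rc=$?

    # Always remove the snapshot afterwards so it does not fill up
    run lvremove --yes "/dev/$vg/$snap"
    return "$rc"
}

# Dry run: prints the three commands for a hypothetical volume
lvm_snapshot_backup vg0 vm-disk /mnt/backups
```

Scheduling is the one piece that is not built in; a cron entry or a systemd timer that calls a script like this would cover it.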
-
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
For example, I had a developer fubar a server the other day. Completely unrecoverable. It was hosted at Vultr, and I used their backup service. I was able to completely restore the server from their snapshot backup. That’s what I am after.
That's not crash consistent. So THAT level of backup KVM can do without anything special, it's just taking a snapshot of the storage. You have that with any system because it is done at the storage layer.
What tools can I use to do that (scheduled) with KVM on Fedora?
If you want the Vultr style (or Proxmox risky style), you can do that right from the storage layer. So first determine the storage that you are going to use: ZFS, Btrfs, XFS, LVM, etc. Then you use the native tools (if you want) to snap it. Everything except the scheduling is just built in.
What is the latest recommendation for storage now? LVM?
LVM, ZFS, and Btrfs are all fine. I've not used this, but here is a script to do LVM backups...
Awesome. I’ve used ZFS in the past, but it was on a FreeNAS box I was testing. ZFS seemed pretty good at the time.
-
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
For example, I had a developer fubar a server the other day. Completely unrecoverable. It was hosted at Vultr, and I used their backup service. I was able to completely restore the server from their snapshot backup. That’s what I am after.
That's not crash consistent. So THAT level of backup KVM can do without anything special, it's just taking a snapshot of the storage. You have that with any system because it is done at the storage layer.
What tools can I use to do that (scheduled) with KVM on Fedora?
QEMU has both internal and external snapshots. Internal are inside of the qcow2 file, external are redirect on write snapshots. The external are the more robust since they don't do full COW like the internal ones.
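To make the distinction concrete, here is a rough sketch of how the two kinds of snapshot are typically requested through `virsh`. The domain name `vm1` and the snapshot names are hypothetical, and the function only prints the commands rather than running them.

```shell
#!/bin/bash
# Sketch: print the virsh command for each snapshot type.
# "vm1", "before-upgrade" and "backup-tmp" are hypothetical names.

snapshot_cmd() {
    local kind="$1" domain="$2" name="$3"
    case "$kind" in
        internal)
            # libvirt's default for qcow2 disks: stored inside the image file
            echo "virsh snapshot-create-as --domain $domain --name $name"
            ;;
        external)
            # Disk-only: new writes are redirected to an overlay file,
            # so the original image stops changing and can be copied
            echo "virsh snapshot-create-as --domain $domain --name $name --disk-only --atomic"
            ;;
        *)
            echo "unknown snapshot kind: $kind" >&2
            return 1
            ;;
    esac
}

snapshot_cmd internal vm1 before-upgrade
snapshot_cmd external vm1 backup-tmp
```

With the external form, the overlay has to be merged back afterwards (for example with `virsh blockcommit`), which is the extra step an external-snapshot backup script needs to handle.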
-
@stacksofplates said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
For example, I had a developer fubar a server the other day. Completely unrecoverable. It was hosted at Vultr, and I used their backup service. I was able to completely restore the server from their snapshot backup. That’s what I am after.
That's not crash consistent. So THAT level of backup KVM can do without anything special, it's just taking a snapshot of the storage. You have that with any system because it is done at the storage layer.
What tools can I use to do that (scheduled) with KVM on Fedora?
QEMU has both internal and external snapshots. Internal are inside of the qcow2 file, external are redirect on write snapshots. The external are the more robust since they don't do full COW like the internal ones.
-
@scottalanmiller said in KVM and Back Ups:
@stacksofplates said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
For example, I had a developer fubar a server the other day. Completely unrecoverable. It was hosted at Vultr, and I used their backup service. I was able to completely restore the server from their snapshot backup. That’s what I am after.
That's not crash consistent. So THAT level of backup KVM can do without anything special, it's just taking a snapshot of the storage. You have that with any system because it is done at the storage layer.
What tools can I use to do that (scheduled) with KVM on Fedora?
QEMU has both internal and external snapshots. Internal are inside of the qcow2 file, external are redirect on write snapshots. The external are the more robust since they don't do full COW like the internal ones.
I've got one in here somewhere also. You just put in the location you want the backup sent to and it copies the snapshot there.
-
@stacksofplates
Lots of testing to do later tonight.
-
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
So in this case, I’d have a PBX, a WordPress site and eventually some Windows Server workloads. All of them are individually backed up via scripts at the OS level.
That's all that you want. Just the OS level backups.
No, that is all you want. The rest of us want VM level backups.
-
@JaredBusch said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@fuznutz04 said in KVM and Back Ups:
So in this case, I’d have a PBX, a WordPress site and eventually some Windows Server workloads. All of them are individually backed up via scripts at the OS level.
That's all that you want. Just the OS level backups.
No, that is all you want. The rest of us want VM level backups.
I want what is reliable and good for restoring the environment, not what's pushed by marketing companies.
I'm focused on the goal: working backups. Or even better: Disaster protection.
I'm not being distracted by the means. When anyone in IT talks about wanting a hypervisor level backup, that's a "means" distraction caused by forgetting to stay goal focused.
-
I think it's wise to consider what kind of failure you are trying to protect yourself from and how you are going to recover.
If I want to restore a VM that doesn't work as it should, or the host crashed, I'd want a VM backup, taken in a known good state, because that is the fastest way to get something working again, perhaps on another host. To me this is an infrastructure backup. Our infrastructure is broken and we need to recover from that. Which also means we need a backup of the VM host of course, and everything else that could fail, including documentation and procedures for getting it back up.
If a user deleted some files and wants them back, I'd want a file level backup. That to me is a backup of business data, not infrastructure, and I'd want that backed up on a schedule that fits the data.
And if I have to restore a database or a table within the database, I'd want a consistent database backup, not the database files and absolutely not the VM level backup. This to me is a different kind of business data.
-
@fuznutz04 said in KVM and Back Ups:
@black3dynamite said in KVM and Back Ups:
Proxmox backups are always a full backup.
https://pve.proxmox.com/wiki/Backup_and_Restore
Do you use or have any experience using Proxmox? Does/can it just run as a VM on the host?
I haven’t used it since version 4. Since then, off and on, I’ve set up a lab just to see how it’s progressing. I’ve installed it as a VM, but that’s about it.
-
So this one is running right now. So far, looks like it is working fine. Will test restores after.
#!/bin/bash
#
# Set the language to English so virsh does its output
# in English as well
#
LANG=en_US

# Define the script name, this is used with systemd-cat to
# identify this script in the journald output
SCRIPTNAME=kvm-backup

# List domains
DOMAINS=$(virsh list | tail -n +3 | awk '{print $2}')

# Loop over the domains found above and do the
# actual backup
for DOMAIN in $DOMAINS; do

    echo "Starting backup for $DOMAIN on $(date +'%d-%m-%Y %H:%M:%S')" | systemd-cat -t $SCRIPTNAME

    # Generate the backup folder URI - this is something you should
    # change/check
    BACKUPFOLDER=/mnt/backups/$DOMAIN/$(date +%d-%m-%Y)
    mkdir -p $BACKUPFOLDER

    # Get the target disk
    TARGETS=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $3}')

    # Get the image path
    IMAGES=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $4}')

    # Create the snapshot/disk specification
    DISKSPEC=""
    for TARGET in $TARGETS; do
        DISKSPEC="$DISKSPEC --diskspec $TARGET,snapshot=external"
    done

    virsh snapshot-create-as --domain $DOMAIN --name "backup-$DOMAIN" --no-metadata --atomic --disk-only $DISKSPEC 1>/dev/null 2>&1
    if [ $? -ne 0 ]; then
        echo "Failed to create snapshot for $DOMAIN" | systemd-cat -t $SCRIPTNAME
        exit 1
    fi

    # Copy disk image
    for IMAGE in $IMAGES; do
        NAME=$(basename $IMAGE)
        # cp $IMAGE $BACKUPFOLDER/$NAME
        # pv $IMAGE > $BACKUPFOLDER/$NAME
        rsync -ah --progress $IMAGE $BACKUPFOLDER/$NAME
    done

    # Merge changes back
    BACKUPIMAGES=$(virsh domblklist $DOMAIN --details | grep disk | awk '{print $4}')
    for TARGET in $TARGETS; do
        virsh blockcommit $DOMAIN $TARGET --active --pivot 1>/dev/null 2>&1
        if [ $? -ne 0 ]; then
            echo "Could not merge changes for disk of $TARGET of $DOMAIN. VM may be in invalid state." | systemd-cat -t $SCRIPTNAME
            exit 1
        fi
    done

    # Cleanup left over backups
    for BACKUP in $BACKUPIMAGES; do
        rm -f $BACKUP
    done

    # Dump the configuration information.
    # Only stderr is discarded here; sending stdout to /dev/null as well
    # would leave the XML file empty.
    virsh dumpxml $DOMAIN > $BACKUPFOLDER/$DOMAIN.xml 2>/dev/null

    echo "Finished backup of $DOMAIN at $(date +'%d-%m-%Y %H:%M:%S')" | systemd-cat -t $SCRIPTNAME
done

exit
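For the restore side, a minimal sketch that reverses the script above might look like this. It assumes the same folder layout (`/mnt/backups/<domain>/<dd-mm-yyyy>/` holding the disk images plus `<domain>.xml`). `IMAGE_DIR` and the `restore_domain` helper are hypothetical, and `IMAGE_DIR` has to match the disk paths recorded in the saved XML. It defaults to a dry run, and it assumes the broken domain has already been shut down and removed.

```shell
#!/bin/bash
# Sketch: restore one domain from a backup folder that holds the disk
# images and <domain>.xml.
# IMAGE_DIR is an assumption: it must match the disk paths inside the XML.
# DRY_RUN=1 (the default) only prints the commands instead of running them.

DRY_RUN="${DRY_RUN:-1}"
IMAGE_DIR="${IMAGE_DIR:-/var/lib/libvirt/images}"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

restore_domain() {
    local domain="$1" backupdir="$2" file

    # Copy every backed-up file except the XML back to the image directory
    for file in "$backupdir"/*; do
        [ -e "$file" ] || continue
        case "$file" in
            *.xml) continue ;;
        esac
        run rsync -ah "$file" "$IMAGE_DIR/$(basename "$file")" || return 1
    done

    # Re-register the domain from the saved definition and boot it
    run virsh define "$backupdir/$domain.xml" || return 1
    run virsh start "$domain"
}

# Example (dry run, hypothetical names):
#   restore_domain vm1 /mnt/backups/vm1/01-01-2024
```

A real restore test would also want to confirm that the guest boots and that its services come up, not just that the files copied.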
-
@Pete-S said in KVM and Back Ups:
If a user deleted some files and wants them back, I'd want a file level backup. That to me is a backup of business data, not infrastructure, and I'd want that backed up on a schedule that fits the data.
Any kind of backup might allow for that. Doesn't require it to be file level. Veeam, for example, will take a system image, but restore just a file.
-
@Pete-S said in KVM and Back Ups:
If I want to restore a VM that doesn't work as it should, or the host crashed, I'd want a VM backup, taken in a known good state, because that is the fastest way to get something working again, perhaps on another host. To me this is an infrastructure backup. Our infrastructure is broken and we need to recover from that. Which also means we need a backup of the VM host of course, and everything else that could fail, including documentation and procedures for getting it back up.
That's one approach, but you can also often do a fresh build roughly as fast, if your system is designed well. You don't need an image of the whole thing. Also, image backups are risky and normally require you to have something else as the "real" backup. So you often take two backups (or more) instead of one, and if you restore from the image, you risk that the restore is bad and you have to do it again using another method. Compare that with one method that gives you reliable backups AND rapid recovery.
-
@Pete-S said in KVM and Back Ups:
And if I have to restore a database or a table within the database, I'd want a consistent database backup, not the database files and absolutely not the VM level backup. This to me is a different kind of business data.
Kind of all the same thing, just the chance of files being corrupted is different. It's the "risk level". If you take what I call devops style backups, you get everything covered in a single method. If you do anything else, you have to have multiple backups to address each recovery case.
-
@fuznutz04 said in KVM and Back Ups:
So this one is running right now. So far, looks like it is working fine. Will test restores after.
Remember, testing restores from "tests" is rarely similar to restoring from catastrophic failure. In a test, almost any method appears to restore reliably, even those we know are not reliable.
-
@dafyre said in KVM and Back Ups:
In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware snapshots.
How could it corrupt VMware snapshots?
-
@dafyre said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@dafyre said in KVM and Back Ups:
In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware snapshots.
How could it corrupt VMware snapshots?
I guess it's more that Btrfs doesn't detect the corruption early enough and our VMware snapshots are nothing but snapshots of corrupt data... That's about the only way I can explain it.
General risk with hypervisor level backups. This is a huge reason for either local file based or what I call devops backups. They are at a higher level, so there is way more opportunity for this.
But if the system was okay when you took the VMware snap, it should have been okay when you restored it. Regardless of corruption.
-
@scottalanmiller said in KVM and Back Ups:
@dafyre said in KVM and Back Ups:
@scottalanmiller said in KVM and Back Ups:
@dafyre said in KVM and Back Ups:
In my experience with it, it has often corrupted randomly and to the point that its own snapshots are no help, nor are VMware snapshots.
How could it corrupt VMware snapshots?
I guess it's more that Btrfs doesn't detect the corruption early enough and our VMware snapshots are nothing but snapshots of corrupt data... That's about the only way I can explain it.
General risk with hypervisor level backups. This is a huge reason for either local file based or what I call devops backups. They are at a higher level, so there is way more opportunity for this.
But if the system was okay when you took the VMware snap, it should have been okay when you restored it. Regardless of corruption.
Yeah, exactly.... and this is why Snapshots are not a backup!