Database Corruption Nightmare
-
VMware 6 along with Veeam has wreaked total havoc on our network. I wanted to share this email I received with everyone. We are experiencing corruption with Exchange, AD, Oracle, and SQL databases, all due to VMware 6 and Veeam backups. In short, our backup system, which is supposed to give us reassurance, has screwed us over. I have been working with Microsoft every single night over the past few days to try to resolve this issue.
I would highly recommend avoiding the combination of VMware 6 and Veeam.
Email below:
All, I have become aware of a very serious flaw in VMware 6.
I know you experienced an issue with your Exchange database having some corruption a couple weeks ago.
I have been dealing with another customer that recently upgraded to vSphere 6 and is experiencing Exchange and Active Directory database corruption.
We had VMware, Microsoft, and Dell EqualLogic looking into the issue. No one could explain what was going on.
I happened to stumble on a Veeam forum where all the users were complaining of database corruption (Exchange, AD, SQL, Oracle, etc.) after they recently upgraded to vSphere 6.
The corruption seemed to happen after a backup using Veeam. Apparently there is a bug in how vSphere 6 handles Changed Block Tracking when performing snapshots, such as those used in Veeam backups (and also by most other backup software providers: Symantec Backup Exec, Barracuda, etc.).
But there is an ESXi patch available to fix the issue.
So we need to get the patch installed ASAP. I am going to start working on that now. I do not want you all to experience what my other customer is going through right now. Here is the link to the forum where they talk about the issue and the resolution.
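To make the Changed Block Tracking failure mode in the email concrete, here is a toy Python sketch. It is not VMware's or Veeam's code, and every name in it (`incremental_backup`, `correct_report`, `buggy_report`) is made up for illustration. It assumes a backup tool that copies only the blocks the tracker reports as changed, and it only shows how a tracker that misses a block leaves the backup image inconsistent with the disk; it does not model the live database corruption described above.

```python
# Toy model of changed block tracking (CBT); purely illustrative.

def incremental_backup(disk, backup, changed_blocks):
    """Copy only the blocks that the tracker reports as changed."""
    for i in changed_blocks:
        backup[i] = disk[i]
    return backup

disk = ["A0", "B0", "C0", "D0"]   # live virtual disk, one entry per block
backup = list(disk)               # full backup taken first

# The guest writes to blocks 1 and 3 after the full backup.
disk[1] = "B1"
disk[3] = "D1"

correct_report = {1, 3}           # what a working tracker should return
buggy_report = {1}                # hypothetical bug: block 3 is never reported

good = incremental_backup(disk, list(backup), correct_report)
bad = incremental_backup(disk, list(backup), buggy_report)

print(good == disk)   # True: the backup matches the disk
print(bad == disk)    # False: block 3 still holds the stale "D0"
```

The backup tool in this sketch does nothing wrong in either run; the only difference is the list of changed blocks it was handed.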
-
Is there anything Veeam could have done wrong here? Sounds like VMware is simply not stable. Given that Veeam is external and is not installed on ESXi, Veeam cannot be at fault here, right? If you were to do the backups using any tool that leverages VMware's own tools, this could happen?
-
So you are saying not to use Veeam and VMware 6? When all backup applications (as the email you mentioned said) have the same issue. It seems like the proper response would be not to use VMware 6.
-
I mean... it seems like if you had taken a snapshot manually this would have happened too, right? Veeam is only guilty of trusting VMware to do its job, which is how any backup software of that type has to behave and is based on the same assumptions that make people choose VMware at all?
-
@coliver said:
So you are saying not to use Veeam and VMware 6? When all backup applications (as the email you mentioned said) have the same issue. It seems like the proper response would be not to use VMware 6.
Which I've been recommending against in general (VMware, not VMware 6) for a month or two now. VMware just doesn't have the SMB value that it used to have. Although this is an issue far and away more serious than "lacking compelling value."
-
I wonder if the same thing happens when you use the database's built-in backup tools. Or if this is simply a platform-related issue.
-
@scottalanmiller said:
@coliver said:
So you are saying not to use Veeam and VMware 6? When all backup applications (as the email you mentioned said) have the same issue. It seems like the proper response would be not to use VMware 6.
Which I've been recommending against in general (VMware, not VMware 6) for a month or two now. VMware just doesn't have the SMB value that it used to have. Although this is an issue far and away more serious than "lacking compelling value."
No argument there... although there is still a stigma against non-VMware hypervisors for some reason.
-
@coliver said:
I wonder if the same thing happens when you use the database's built-in backup tools. Or if this is simply a platform-related issue.
It would not, or at least it would not corrupt the database during the database-only backup. Database backup tools don't act on the filesystem. The issue with the VMware built-in backup is that it does something to the block device on which things run. Normally this is transparent, but obviously something is wrong.
Now if you took an Exchange level backup, for example, THAT backup should be stable, but the system level backup of the whole box would still corrupt the running databases even if you have good backups of them individually. Of course, at least then you would have backups to fall back to, which is a bonus.
-
@coliver said:
No argument there... although there is still a stigma against non-VMware hypervisors for some reason.
Marketing. It's simply mind-blowing how much marketing changes perception. Even ten years ago we had issues with VMware that Xen didn't have.
-
IMO both vendors are at fault. Someone at Veeam had to do some testing before they approved compatibility with VMware 6. At this point whose fault it is doesn't really matter. Serious issues were caused by using a combination of Veeam and VMware 6.
-
@IRJ said:
IMO both vendors are at fault. Someone at Veeam had to do some testing before they approved compatibility with VMware 6. At this point whose fault it is doesn't really matter. Serious issues were caused by using a combination of Veeam and VMware 6.
Doesn't sound like there is a compatibility problem, sounds like VMware simply doesn't work. Veeam doesn't appear to have missed anything. And you know that some versions of VMware 6 do work, right? Veeam cannot be responsible for every release that VMware puts out.
In the same vein, if Veeam is at fault, so is every backup vendor and company that deploys VMware with this issue. Veeam can't control which versions and patches customers install. If anyone isn't at fault, isn't it Veeam?
From your description, it's not the combination that caused the issue; it was just VMware failing. Any use of the platform involving the snapshot feature could have done this, right?
So Veeam, if that description is correct, cannot be accountable for this. The only option you would give Veeam is to not do business because their customers might choose hypervisors that are not reliable! You are blaming the innocent and giving them no acceptable means of not being blamed.
-
I'll admit when I am wrong. Veeam warned everyone about this in March:
http://www.veeam.com/blog/vmware-vsphere-6-support-coming-soon.html
-
@IRJ said:
Serious issues were caused by using a combination of Veeam and VMware 6.
Serious issues were caused by using VMware. That Veeam was used is a red herring. It's VMware itself that was the problem.
Yes, in this situation Veeam's use exposed the existing issue. But Veeam is not at all what is causing it. It's only incidentally related. Any backup software for VMware, or even just normal admin functions, would have caused and will cause the same issues.
-
@IRJ said:
IMO both vendors are at fault. Someone at Veeam had to do some testing before they approved compatibility with VMware 6. At this point whose fault it is doesn't really matter. Serious issues were caused by using a combination of Veeam and VMware 6.
I'll agree that Veeam has fault if they listed VMware 6 as a certified platform, but as mentioned this is totally a VMware problem, as most likely you'll have the same corruption problem if you just kick off a manual snapshot.
Assuming you, @IRJ, don't use Veeam, there's no reason for you to be mad/upset at them.
-
It's a tough position for Veeam, they have to make the products that their customers demand. They can't just not release products for VMware, they'd be out of business. They are caught making a great backup product for a hypervisor that they don't control. Think of all the vendors making software for Windows, if Windows has a flaw they are stuck with the stability of the platform that they are built on.
-
@scottalanmiller said:
It's a tough position for Veeam, they have to make the products that their customers demand. They can't just not release products for VMware, they'd be out of business. They are caught making a great backup product for a hypervisor that they don't control. Think of all the vendors making software for Windows, if Windows has a flaw they are stuck with the stability of the platform that they are built on.
I agree. I love being able to restore an entire VM in less than 15 minutes. I thought Veeam was the greatest thing since sliced bread.
-
I agree they have to make products for the platforms they exist on, but I agree with @IRJ: I can't believe they did any real testing and didn't find this problem before they 'Certified' their software to run on it.
If they didn't ever list that they support VMware 6, then never mind, because they didn't tell customers they were ready for VMware 6 - AKA stay on VMware 5.5 etc. until they do.
-
@IRJ said:
I agree. I love being able to restore an entire VM in less than 15 minutes. I thought Veeam was the greatest thing since sliced bread.
It is! But it needs a reliable platform to work with. It works with Hyper-V too... which is free!
-
I heard about a problem with VMware related to CBT (Changed Block Tracking)... I did a quick Google and found the relevant info...
The problem lies not with Veeam in this instance, but a bug in VMware's VDDK... see:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2090639
My Google Search: https://www.google.com/search?q=VMWare+CBT+Bug&gws_rd=ssl
The 1st result goes to the Veeam forums, the 2nd goes to the KB article I linked above, and the 5th goes to Symantec.
-
@Dashrender said:
I agree they have to make products for the platforms they exist on, but I agree with @IRJ: I can't believe they did any real testing and didn't find this problem before they 'Certified' their software to run on it.
Why? Here are some reasons:
- Obviously VMware and Veeam did test and this wasn't happening. Are you blaming Veeam for not testing but ignoring the testing that VMware obviously did? Clearly there was testing, it's silly to think that there was not.
- Many vendors, not just these, tested without finding these problems.
- We've already seen that VMware 6 is stable with the right patch, so none of the above even matters. VMware 6 isn't the issue but a specific patch level(s) of it.
- Veeam didn't fail, why would they not certify? That's silly. Veeam should not refuse to release products because their customers chose an unstable platform.
I think that any blaming of Veeam here is wrong. You are basically attacking Veeam for existing. Giving them literally no "out". Their product was flawless. That their customers might choose a platform that isn't working or a patch level that isn't working - what do you expect from Veeam? You are giving them no means of not having been wrong. You'd hate them just as much or more had they dropped VMware completely. Imagine if they did that, the discussion we would be having here about how could they abandon VMware users would be even worse.
This line is completely unfair and unreasonable to all the vendors caught by the situation.
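On the patch-level point above, a minimal sketch of how one might flag hosts that still need the fix, assuming you already have each host's ESXi build number on hand. The host names and build numbers below are hypothetical placeholders, not real VMware build numbers; the real first fixed build has to come from VMware's patch notes.

```python
def needs_patch(installed_build: int, first_fixed_build: int) -> bool:
    """Return True if the host is older than the first build containing the fix."""
    return installed_build < first_fixed_build

# Hypothetical placeholder; look up the real value in VMware's patch notes.
FIRST_FIXED_BUILD = 2000

# Hypothetical inventory: host name -> installed ESXi build number.
hosts = {"esxi-01": 1500, "esxi-02": 2000, "esxi-03": 2100}

to_patch = sorted(
    name for name, build in hosts.items()
    if needs_patch(build, FIRST_FIXED_BUILD)
)
print(to_patch)   # ['esxi-01']
```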