I am having VMQ issues on my 3 x Windows Server 2019 hosts. I am using a team of Intel X550 NICs (3 x 10 GbE interfaces to 3 x stacked switches). The team has been configured using Switch Independent mode.
Now, VMs are on the LAN perfectly and communicate. Rock solid.
Now to the issue. When I use Veeam with Application Aware turned on, the hosts get stuck creating checkpoints for VMs at 9%. The only option is to kill the host, restart and turn the VMs back on. It happens every time. (Strangely, checkpoints run fine natively.)
Veeam have looked at the logs and have said it's an issue they have seen before, and to turn off VMQ initially on NICs and VMs as that often solves the issue. No worries. I do that, and the backups then run fine - perfect!
However, VMs over time then drop off of the network. I can connect to them in Hyper-V, but nothing I do will bring them back on the network. Initially they are on the network, just at some point in time many drop off, whilst others stay on.
The only resolution is to turn VMQ back on and reboot. I can't really keep testing this either, as it causes a lot of downtime! Not good. Of course, when VMQ is back on... Application Aware backups then fail and kill the host, like I said at the start.
Any idea why some VMs drop off of the network with VMQ disabled on the VM and NIC? Host, fully patched. NIC, latest firmware. I thought you didn't have to use VMQ...
Is it the same VMs that drop off the network every time?
What OS are the VMs that get dropped off the network?
What kind of switches are you running?
Uncertain. I've noticed a few are the same, but I've had no time to really look into it. Too many users. I'm trying to get ideas on what the issue could be so I can do proper testing during downtime around Dec 20th, when a lot of users are on holiday.
I think 2012 R2. Possibly 2016 too. Can't recall any 2019.
Dell N4064 Stack.
Try the ArpRetryCount trick. You'll have to reboot each VM or host after you make the change.
Ok, I can try that. So:
- disable VMQ physical
- disable VMQ on all VM NIC
- change registry
Sound about right?
- change the registry on any machine that is down.
But yeah, that sounds about right.
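If it helps with planning the change window, here is a rough Python sketch that only prints the PowerShell commands for the steps above so they can be reviewed first; it does not run anything. The adapter names (NIC1 etc.), VM names (VM01 etc.) and the ArpRetryCount value of 0 are placeholders and assumptions, not values given in this thread, so swap in your own and check the value before using it.

```python
# Sketch only: build the PowerShell commands for the change window as text.
# Adapter names, VM names and the ArpRetryCount value are assumptions.

# Registry key that holds the TCP/IP parameters, including ArpRetryCount.
TCPIP_PARAMS = r"HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def host_commands(physical_nics, vm_names):
    """Commands for the Hyper-V host: VMQ off on physical NICs and VM NICs."""
    cmds = [f'Disable-NetAdapterVmq -Name "{nic}"' for nic in physical_nics]
    # A VmqWeight of 0 disables VMQ on the VM's virtual network adapter(s).
    cmds += [f'Set-VMNetworkAdapter -VMName "{vm}" -VmqWeight 0' for vm in vm_names]
    return cmds


def guest_commands(arp_retry_count=0):
    """Commands for each affected machine; a reboot is needed afterwards."""
    return [
        f'New-ItemProperty -Path "{TCPIP_PARAMS}" -Name ArpRetryCount '
        f"-PropertyType DWord -Value {arp_retry_count} -Force"
    ]


if __name__ == "__main__":
    print("# On the host:")
    for line in host_commands(["NIC1", "NIC2", "NIC3"], ["VM01", "VM02"]):
        print(line)
    print("# On each machine that dropped off the network (then reboot it):")
    for line in guest_commands():
        print(line)
```

Printing the commands instead of running them keeps the sketch safe to try anywhere, and gives you a list to paste into an elevated PowerShell session once you are happy with it.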
Do I need to change the reg on the host too?