VMQ issues/Veeam/Windows Server 2019...



  • Hi folks,

    I am having VMQ issues on my 3 x Windows Server 2019 hosts. I am using a team of Intel X550 NICs (3 x 10 GbE interfaces to 3 x stacked switches). The team is configured in Switch Independent mode.

    Now, VMs are on the LAN perfectly and communicate. Rock solid.

    Now to the issue. When I use Veeam with Application Aware turned on, the hosts get stuck creating checkpoints for VMs at 9%. The only option is to kill the host, restart it, and turn the VMs back on. It happens every time. (Strangely, checkpoints run fine natively.)

    Veeam have looked at the logs and said it's an issue they have seen before, and to turn off VMQ on the NICs and VMs as a first step, as that often solves the issue. No worries. I do that, and the backups then run fine - perfect!

    However, VMs over time then drop off of the network. I can connect to them in Hyper-V, but nothing I do will bring them back on the network. Initially they are on the network, just at some point in time many drop off, whilst others stay on.

    The only resolution is to turn VMQ back on and reboot. I can't really keep testing this either, as it causes a lot of downtime! Not good. Of course, when VMQ is back on... Application Aware backups then fail and kill the host, like I said at the start 😕

    Any idea why some VMs drop off of the network with VMQ disabled on the VM and NIC? Host, fully patched. NIC, latest firmware. I thought you didn't have to use VMQ...

    Best,
    Jim



  • Is it the same VMs that drop off the network every time?

    What OS are the VMs that get dropped off the network?

    What kind of switches are you running?

    For Windows systems (sometimes even a physical server), I usually have to do this in regedit:

    2086c18b-bd54-4d02-aeb3-4d91a5f3e7b6-image.png

    You'll have to reboot each VM or host after you make the change.

    I haven't run into anything like that with our Linux systems. You may need to do it at the host level on Hyper-V as well.



  • @dafyre said in VMQ issues/Veeam/Windows Server 2019...:

    Is it the same VMs that drop off the network every time?

    What OS are the VMs that get dropped off the network?

    What kind of switches are you running?

    1. Uncertain. I've noticed a few are the same, but I've had no time to really look into it. Too many users. I'm trying to gather ideas about what the issue could be, so I can do proper testing with downtime around Dec 20th when a lot of people are on holiday.

    2. I think 2012 R2. Possibly 2016 too. Can't recall any 2019.

    3. Dell N4064 Stack.



  • @Jimmy9008 said in VMQ issues/Veeam/Windows Server 2019...:

    1. Uncertain. I've noticed a few are the same, but I've had no time to really look into it. Too many users. I'm trying to gather ideas about what the issue could be, so I can do proper testing with downtime around Dec 20th when a lot of people are on holiday.

    2. I think 2012 R2. Possibly 2016 too. Can't recall any 2019.

    3. Dell N4064 Stack.

    Try the ArpRetryCount trick. You'll have to reboot each VM or host after you make the change.

    Edit: Basically, this stops Windows from checking whether there's a duplicate IP address in use.
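
For anyone who can't see the screenshot, here is a rough sketch of what the ArpRetryCount change might look like as a .reg file. The thread never shows the key path or the value, so both are assumptions: the path below is the commonly cited TCP/IP parameters key, and 0 is the value usually suggested to turn off the duplicate-address check. The Python only builds the file text; it does not touch the registry.

```python
# Sketch only: builds the text of a .reg file, nothing is written to the registry.
# Assumption: key path and value 0 are the commonly cited ones; the thread's
# screenshot is not available, so neither is confirmed by the original posts.

TCPIP_PARAMS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)


def build_arp_retry_reg(retry_count: int = 0) -> str:
    """Return .reg file text that sets the ArpRetryCount DWORD value."""
    if not 0 <= retry_count <= 0xFFFFFFFF:
        raise ValueError("retry_count must fit in a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TCPIP_PARAMS_KEY}]",
        # .reg files store DWORDs as eight hex digits
        f'"ArpRetryCount"=dword:{retry_count:08x}',
        "",
    ]
    return "\r\n".join(lines)


reg_text = build_arp_retry_reg(0)
print(reg_text)
```

The generated text could be saved and imported on an affected VM, followed by the reboot mentioned above.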



  • @dafyre said in VMQ issues/Veeam/Windows Server 2019...:

    For Windows systems (sometimes even a physical server), I usually have to do this in regedit:

    2086c18b-bd54-4d02-aeb3-4d91a5f3e7b6-image.png

    I haven't run into anything like that with our Linux systems. You may need to do it at the host level on Hyper-V as well.

    That's something I can try in December. So do that to the VMs and hosts?



  • @Jimmy9008 said in VMQ issues/Veeam/Windows Server 2019...:

    That's something I can try in December. So do that to the VMs and hosts?

    Yeah. Or you can try it the next time a VM goes down. That's what we did. Make the change, reboot the VM, and it's usually happy again.



  • @dafyre said in VMQ issues/Veeam/Windows Server 2019...:

    Try the ArpRetryCount trick. You'll have to reboot each VM or host after you make the change.

    Ok, I can try that. So:

    1. disable VMQ on the physical NICs
    2. disable VMQ on all VM NICs
    3. change registry
    4. reboot

    Sound about right?
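
As a rough illustration of that plan, the sketch below just prints the PowerShell commands one might run for each step. The adapter and VM names are hypothetical placeholders, and the registry path and value for ArpRetryCount are assumptions (the thread's screenshot is not available). Note that the first two commands belong on the Hyper-V host, while the registry change is made inside whichever Windows machine is affected. Nothing is executed here.

```python
# Sketch only: prints PowerShell commands for the four steps; runs nothing.
# Assumptions: adapter/VM names are hypothetical, and the ArpRetryCount
# path and value (0) are the commonly cited ones, not confirmed by the thread.
from typing import List

ARP_KEY = r"HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def build_plan(adapters: List[str], vms: List[str]) -> List[str]:
    """Return the PowerShell commands for the steps, in order."""
    cmds: List[str] = []
    # Step 1 (on the host): disable VMQ on each physical team member
    for nic in adapters:
        cmds.append(f'Disable-NetAdapterVmq -Name "{nic}"')
    # Step 2 (on the host): a VMQ weight of 0 disables VMQ on the VM's adapters
    for vm in vms:
        cmds.append(f'Set-VMNetworkAdapter -VMName "{vm}" -VmqWeight 0')
    # Step 3 (inside the affected Windows machine): the registry change
    cmds.append(
        f"Set-ItemProperty -Path '{ARP_KEY}' -Name ArpRetryCount "
        "-Value 0 -Type DWord"
    )
    # Step 4: reboot the machine where the registry was changed
    cmds.append("Restart-Computer")
    return cmds


plan = build_plan(["NIC1"], ["VM1"])
for command in plan:
    print(command)
```

Printing the commands first, rather than running them, leaves room to review each one before the maintenance window.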



  • @dafyre said in VMQ issues/Veeam/Windows Server 2019...:

    Yeah. Or you can try it the next time a VM goes down. That's what we did. Make the change, reboot the VM, and it's usually happy again.

    That's the thing. The VMs are rock solid with VMQ and Veeam, but only when no App Aware backups are run.



  • @Jimmy9008 said in VMQ issues/Veeam/Windows Server 2019...:

    Ok, I can try that. So:

    1. disable VMQ on the physical NICs
    2. disable VMQ on all VM NICs
    3. change registry
    4. reboot

    Sound about right?

    1. change the registry on any machine that is down.

    But yeah, that sounds about right.



  • @dafyre said in VMQ issues/Veeam/Windows Server 2019...:

    1. change the registry on any machine that is down.

    But yeah, that sounds about right.

    Do I need to change the reg on the host too?



  • @Jimmy9008 said in VMQ issues/Veeam/Windows Server 2019...:

    Do I need to change the reg on the host too?

    I would go ahead and do it, yeah.

