hyper-v bad physical NIC? - vswitch or NIC teaming?



  • I have a server about to go into production, but the NIC keeps failing after 12-24 hours. I thought it might be the physical network switch, so I swapped the switch, and it's still just the NIC on the server going down (all the other client network connections stay up). The server has two NICs, so I figured I would just use the other physical NIC and see if it stays up.

    The server is running Hyper-V core 2016. When the NIC goes down I lose access to the host as well.

    Given that, would it make the most sense to create a new vSwitch with the second NIC and add that to the VM and see if it stays up like that, or would NIC teaming be a better way to go?



  • If you have a known bad NIC, can you get it replaced under warranty?



  • @coliver yes. The server is brand new. The issue, however, is that the server is in a secure environment and remote control isn't allowed. I'm not even sure if a tech can be escorted in. If the workaround doesn't work, it might mean picking up a new server or adding a NIC.



  • @coliver I really want to troubleshoot to see if both NICs are bad, or what the deal is.



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @coliver yes. The server is brand new. The issue, however, is that the server is in a secure environment and remote control isn't allowed. I'm not even sure if a tech can be escorted in. If the workaround doesn't work, it might mean picking up a new server or adding a NIC.

    I don't know if I would deploy something with known bad hardware into an environment that a tech would have trouble accessing.

    That being said, I wouldn't use NIC teaming unless you had two known good NICs.



  • @coliver said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    I don't know if I would deploy something with known bad hardware into an environment that a tech would have trouble accessing.

    I don't want to either. I need to be absolutely sure what exactly is wrong. I shouldn't have said workaround, as if it would be permanent. It's more like cutting over to the second NIC to make sure it's not a software bug.



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @coliver said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    I don't know if I would deploy something with known bad hardware into an environment that a tech would have trouble accessing.

    I don't want to either. I need to be absolutely sure what exactly is wrong. I shouldn't have said workaround, as if it would be permanent. It's more like cutting over to the second NIC to make sure it's not a software bug.

    That's a good way to test. But I wouldn't deploy it like that.



  • Are the NIC drivers up to date?



  • @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    Are the NIC drivers up to date?

    First thing I did was update the firmware for everything and all the drivers.



  • The other troubleshooting I did was to crank up iperf and run a few GB through it. It didn't miss a beat.



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    Are the NIC drivers up to date?

    First thing I did was update the firmware for everything and all the drivers.

    It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?



  • The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
    Disable-NetAdapterPowerManagement -Name Ethernet
    command. The command
    Get-NetAdapterPowerManagement -Name Ethernet
    doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
    Disable-NetAdapterPowerManagement -Name Ethernet
    command. The command
    Get-NetAdapterPowerManagement -Name Ethernet
    doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?

    PS C:\Windows\system32> powercfg /list
    
    Existing Power Schemes (* Active)
    -----------------------------------
    Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
    Power Scheme GUID: 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c  (High performance) *
    Power Scheme GUID: a1841308-3541-4fab-bc81-f71556f20b4a  (Power saver)
    


  • @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?

    I'm not sure. I'll have to be onsite to find out. What are the symptoms of that? Would VMQ knock the host offline as well?



  • @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
    Disable-NetAdapterPowerManagement -Name Ethernet
    command. The command
    Get-NetAdapterPowerManagement -Name Ethernet
    doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?

    PS C:\Windows\system32> powercfg /list
    
    Existing Power Schemes (* Active)
    -----------------------------------
    Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
    Power Scheme GUID: 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c  (High performance) *
    Power Scheme GUID: a1841308-3541-4fab-bc81-f71556f20b4a  (Power saver)
    

    That command may work for some stuff on the host, but I tested it on another server and it doesn't change the power settings for the individual NICs. Cool command, though.



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    The only other thing I can think of is disabling the power management settings on the NIC, but I'm pretty sure I did that with the:
    Disable-NetAdapterPowerManagement -Name Ethernet
    command. The command
    Get-NetAdapterPowerManagement -Name Ethernet
    doesn't seem to show if it has been enabled or disabled, only the Wake on LAN. Does anyone know if there is another command to check the current status?

    PS C:\Windows\system32> powercfg /list
    
    Existing Power Schemes (* Active)
    -----------------------------------
    Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
    Power Scheme GUID: 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c  (High performance) *
    Power Scheme GUID: a1841308-3541-4fab-bc81-f71556f20b4a  (Power saver)
    

    That command may work for some stuff on the host, but I tested it on another server and it doesn't change the power settings for the individual NICs. Cool command, though.

    Which power scheme is active? I always make sure mine is set to High performance.



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?

    I'm not sure. I'll have to be onsite to find out. What are the symptoms of that? Would VMQ knock the host offline as well?

    Yes. This is a well-known issue with older firmware on some 1 GbE NICs, such as Broadcoms. Updating the firmware resolves the issue, as does turning off VMQ, which you can do via PowerShell on Hyper-V Server 2016.

    VMQ is meant for 10 GbE and faster NICs, but the buggy firmware has it on by default on some 1 GbE NICs.



  • @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    Which power scheme is active? I always make sure mine is set to High performance.

    High performance. The host I checked is a GUI install, so I went to the NIC properties, and I can still see the Power Management tab, and the box for "Allow the computer to turn off this device to save power" is still checked.

    When you do: Disable-NetAdapterPowerManagement -Name Ethernet
    it removes that tab.
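
    On a host without the GUI tab, one way to check the current state is to dump every property of the power management object instead of relying on the default table view. This is only a sketch: the adapter name `Ethernet` comes from the posts above, and treating `AllowComputerToTurnOffDevice` as the equivalent of the checkbox is an assumption worth verifying against your own driver.

    ```powershell
    # Show every power management property for the adapter,
    # not just the columns in the default table view.
    Get-NetAdapterPowerManagement -Name 'Ethernet' | Format-List *

    # Narrow the output to the property that should correspond to the
    # "Allow the computer to turn off this device to save power" checkbox
    # (assumption), plus the Wake on LAN settings already visible by default.
    Get-NetAdapterPowerManagement -Name 'Ethernet' |
        Select-Object Name, AllowComputerToTurnOffDevice, WakeOnMagicPacket, WakePattern
    ```

    Running the same query before and after `Disable-NetAdapterPowerManagement` on a lab machine should show whether the value actually changes.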



  • @tim_g said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @black3dynamite said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    It's possible Virtual Machine Queue (VMQ) could be causing the issue. Intel or Broadcom?

    I'm not sure. I'll have to be onsite to find out. What are the symptoms of that? Would VMQ knock the host offline as well?

    Yes. This is a well-known issue with older firmware on some 1 GbE NICs, such as Broadcoms. Updating the firmware resolves the issue, as does turning off VMQ, which you can do via PowerShell on Hyper-V Server 2016.

    VMQ is meant for 10 GbE and faster NICs, but the buggy firmware has it on by default on some 1 GbE NICs.

    Here's some PS to get you started:

    Get-NetAdapter
    
    Get-NetAdapterAdvancedProperty NIC1
    
    Get-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues"
    
    Set-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues" -DisplayValue Disabled
    


  • @tim_g Thanks for the commands. If I get the output below, does that mean my NICs don't have the option for VMQ?

    PS C:\> Get-NetAdapter
    
    Name                      InterfaceDescription                    ifIndex Statu
                                                                              s
    ----                      --------------------                    ------- -----
    vEthernet (vSwitch02)     Hyper-V Virtual Ethernet Adapter #3          29 Up
    vEthernet (Broadcom BC... Hyper-V Virtual Ethernet Adapter #2          18 Up
    Ethernet 2                HP NC373i Multifunction Gigabit S...#40      13 Up
    Ethernet                  HP NC373i Multifunction Gigabit S...#39      12 Di...
    
    
    PS C:\> Get-NetAdapterAdvancedProperty Ethernet
    
    Name                      DisplayName                    DisplayValue
    ----                      -----------                    ------------
    Ethernet                  Flow Control                   Auto
    Ethernet                  Interrupt Moderation           Enabled
    Ethernet                  Jumbo Packet                   1514
    Ethernet                  Large Send Offload V2 (IPv4)   Enabled
    Ethernet                  Maximum Number of RSS Queues   2
    Ethernet                  Priority & VLAN                Priority & VLAN ena...
    Ethernet                  Receive Buffers (0=Auto)       0
    Ethernet                  Receive Side Scaling           Enabled
    Ethernet                  Speed & Duplex                 Auto Negotiation
    Ethernet                  TCP Connection Offload (IPv4)  Disabled
    Ethernet                  TCP/UDP Checksum Offload (I... Rx & Tx Enabled
    Ethernet                  Transmit Buffers (0=Auto)      0
    Ethernet                  Wake On Magic Packet           Disabled
    Ethernet                  Wake On Pattern Match          Disabled
    Ethernet                  Locally Administered Address   --
    Ethernet                  VLAN ID                        0
    Ethernet                  Ethernet@WireSpeed             Enabled
    
    
    PS C:\> Get-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues"
    Get-NetAdapterAdvancedProperty : No matching
    MSFT_NetAdapterAdvancedPropertySettingData objects found by CIM query for
    instances of the ROOT/StandardCimv2/MSFT_NetAdapterAdvancedPropertySettingData
    class on the  CIM server: SELECT * FROM
    MSFT_NetAdapterAdvancedPropertySettingData  WHERE ((Name LIKE '%')) AND
    ((DisplayName LIKE 'Virtual Machine Queues')). Verify query parameters and
    retry.
    At line:1 char:1
    + Get-NetAdapterAdvancedProperty * -DisplayName "Virtual Machine Queues"
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : ObjectNotFound: (MSFT_NetAdapter...ertySettingDa
       ta:String) [Get-NetAdapterAdvancedProperty], CimJobException
        + FullyQualifiedErrorId : CmdletizationQuery_NotFound,Get-NetAdapterAdvanc
       edProperty
    
    PS C:\>
    


  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @tim_g Thanks for the commands. If I get the output below, does that mean my NICs don't have the option for VMQ?

    Yeah you seem to be fine there. No VMQ.
    You could still try updating the firmware to see if that resolves the issue before replacing the NIC.
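
    The display name of the VMQ setting can differ between drivers, so a query by display name may fail even when the NIC does support VMQ. A hedged alternative is to ask the VMQ cmdlets directly and to query by the standardized `*VMQ` registry keyword. Which adapters show up depends on the driver, so treat an empty result as a hint rather than proof.

    ```powershell
    # List the adapters the VMQ cmdlets report on, with their current state.
    Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled -AutoSize

    # Query by the standardized registry keyword instead of the
    # vendor-specific display name. -ErrorAction SilentlyContinue hides the
    # "No matching ... objects found" error when no adapter exposes the setting.
    Get-NetAdapterAdvancedProperty -Name * -RegistryKeyword '*VMQ' -ErrorAction SilentlyContinue |
        Format-Table Name, DisplayName, DisplayValue, RegistryValue -AutoSize
    ```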



  • @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    Given that, would it make the most sense to create a new vSwitch with the second NIC and add that to the VM and see if it stays up like that, or would NIC teaming be a better way to go?

    I wouldn't include a known bad NIC in a team. If that NIC is bad, disable it on the host if you can't switch it out. Create a new vSwitch from a working NIC and add that to the VM. You can use the same MAC if needed.
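
    The cut-over described above could look roughly like this. The names `Ethernet 2`, `vSwitchTest`, and `TestVM` are placeholders, and the sketch assumes the second NIC is not already bound to another vSwitch or team.

    ```powershell
    # Placeholder names; replace with the real adapter, switch, and VM names.
    $nic    = 'Ethernet 2'
    $switch = 'vSwitchTest'
    $vm     = 'TestVM'

    # Create an external vSwitch on the second NIC and keep a host management
    # vNIC on it, so the host stays reachable if the first NIC drops.
    New-VMSwitch -Name $switch -NetAdapterName $nic -AllowManagementOS $true

    # Reconnect the VM's network adapter(s) to the new switch. The virtual
    # adapter itself is unchanged, so the VM keeps the same MAC address.
    Connect-VMNetworkAdapter -VMName $vm -SwitchName $switch

    # Optional: take the suspect NIC out of service on the host.
    # Disable-NetAdapter -Name 'Ethernet' -Confirm:$false
    ```

    Disabling the suspect NIC is left commented out because doing it over a remote session that runs through that NIC will cut the session off.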



  • @tim_g said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @mike-davis said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    @tim_g Thanks for the commands. If I get the output below, does that mean my NICs don't have the option for VMQ?

    Yeah you seem to be fine there. No VMQ.
    You could still try updating the firmware to see if that resolves the issue before replacing the NIC.

    The output above was from a spare non-production server. Last night I ran the same commands on the problem server, and it did have VMQ, and it was enabled. I disabled it, and the server has now been up for 17 hours. Before, it would go down every 3-5 days, so I'll have to wait a few more days to know whether the problem has been licked.
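
    While waiting, a small loop on the host can record when the link actually drops, which makes it easier to compare failure intervals before and after the change. This is a minimal sketch: the adapter name, the log path, and the 60-second interval are all assumptions, and the log folder must already exist.

    ```powershell
    # Assumed adapter name and log path; adjust for the host.
    $adapter = 'Ethernet'
    $logPath = 'C:\Temp\nic-status.csv'

    while ($true) {
        $nic = Get-NetAdapter -Name $adapter
        # Returns nothing (and VmqEnabled stays empty) if the adapter has no VMQ support.
        $vmq = Get-NetAdapterVmq -Name $adapter -ErrorAction SilentlyContinue

        # Append one timestamped row per sample.
        [pscustomobject]@{
            Time       = (Get-Date).ToString('s')
            Status     = $nic.Status
            LinkSpeed  = $nic.LinkSpeed
            VmqEnabled = $vmq.Enabled
        } | Export-Csv -Path $logPath -Append -NoTypeInformation

        Start-Sleep -Seconds 60
    }
    ```

    Because the log is written locally, it survives the host dropping off the network and can be read once access is restored.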



  • You can also try:

    Get-NetAdapterVmq

    If you want to do a blanket disable of VMQ (which is what I usually do):

    Get-NetAdapterVmq | Disable-NetAdapterVmq

    What does the Windows event log look like? (You said this was a GUI install, right?)



  • @dafyre said in hyper-v bad physical NIC? - vswitch or NIC teaming?:

    What does the Windows event log look like? (You said this was a GUI install, right?)

    My lab server is a GUI install, but the production server is non GUI.
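
    On a non-GUI install the event log can still be read from PowerShell. A rough sketch, assuming the interesting entries are errors and warnings in the System log; the three-day window is an arbitrary choice that should be widened to cover the last outage.

    ```powershell
    # Errors and warnings from the System log over the last 3 days.
    Get-WinEvent -FilterHashtable @{
        LogName   = 'System'
        Level     = 2, 3                    # 2 = Error, 3 = Warning
        StartTime = (Get-Date).AddDays(-3)
    } -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, ProviderName, Id, LevelDisplayName, Message |
        Sort-Object TimeCreated |
        Format-Table -AutoSize -Wrap
    ```

    Entries from the NIC driver's provider around the time the link dropped are the ones to look for first.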

