Odd VMware issue



  • On Friday, one of our application servers (Optical) froze up. I rebooted the server, which took a while, but everything came back up. The same thing happened again this morning and a third time this afternoon.

    I believe I know what is causing the issue. All three times, I was trying to either create or convert a VM template. Converting or creating a VM usually takes 5-15 minutes to complete. On Friday, I let it run for about an hour, and it was stuck, slowly creeping from about 48% to 53%. I finally gave up and canceled it, mainly because our Optical server went down. Once I canceled the template task in VMware, Optical started working fine again.

    The same thing happened again this morning. After I canceled the template task, Optical started working again. I was able to reproduce this a third time to confirm that the template jobs were causing the problem.

    Troubleshooting:
    I tried both templates; both caused the issue.
    I also tried creating a VM from scratch; this also caused a hang-up.



  • I don't know enough to help (which is why I'm setting up a lab this weekend). I'd say dig through some logs and see what's causing the hang-ups. Sorry, this is a vague, stupid answer.



  • Can you see if there is an issue with IOPS or memory being exhausted?



  • @scottalanmiller It doesn't appear to be. Memory and processor usage on the Optical VM is very low. Also, the Optical VM is on a different host and datastore. It seems like it shouldn't be affected.



  • At that point in the creation process, it's writing to the storage and such.

    If you are using local storage, do you have the right storage drivers installed? If you are using shared storage, I would migrate the VM to another host, retry the creation, and see if that reproduces the issue.



  • @PSX_Defector said:

    At that point in the creation process, it's writing to the storage and such.

    If you are using local storage, do you have the right storage drivers installed? If you are using shared storage, I would migrate the VM to another host, retry the creation, and see if that reproduces the issue.

    Possibly check the firmware on the drives you're using?



  • @PSX_Defector We are using a NEW EqualLogic SAN. We had an MSP install it and create datastores for our VM environment. Is there a step that might have been missed?

    It doesn't seem to be related to a particular host. The template, the Optical VM, and the new VM I am trying to create are on three separate hosts.



  • @IRJ said:

    @PSX_Defector We are using a NEW EqualLogic SAN. We had an MSP install it and create datastores for our VM environment. Is there a step that might have been missed?

    It doesn't seem to be related to a particular host. The template, the Optical VM, and the new VM I am trying to create are on three separate hosts.

    Corrupt .ovf template?



  • @ajstringham No, I tried other templates, and it also hangs when building a new VM from scratch.



  • I agree with Scott; it sounds like an IOPS issue.



  • What plug-ins do you have installed? Check to confirm there are no issues with those. Have you tried this against different LUNs? I assume this is from vCenter, since you said you tried the template. Does this happen from each host as well, or only when installing to a particular host?



  • @IRJ said:

    @PSX_Defector We are using a NEW EqualLogic SAN. We had an MSP install it and create datastores for our VM environment. Is there a step that might have been missed?

    Most likely.

    EqualLogic usually means iSCSI, so you could be running into collisions on the network, or the SAN isn't able to keep up with the chatter.

    I would still migrate the VM over to another host, possibly the template as well. The host you are working with might be the problem, so eliminate that first. Otherwise, I would get with Dell about performance on that thing. IOPS would explain this pretty well.



  • @PSX_Defector said:

    @IRJ said:

    @PSX_Defector We are using a NEW EqualLogic SAN. We had an MSP install it and create datastores for our VM environment. Is there a step that might have been missed?

    Most likely.

    EqualLogic usually means iSCSI, so you could be running into collisions on the network, or the SAN isn't able to keep up with the chatter.

    I would still migrate the VM over to another host, possibly the template as well. The host you are working with might be the problem, so eliminate that first. Otherwise, I would get with Dell about performance on that thing. IOPS would explain this pretty well.

    OK, I finally got somewhere. When I loaded the VM template to the host's local storage, it worked as expected. There is a problem with the EqualLogic.

    Thanks everyone!
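
The local-versus-SAN comparison above can also be put into rough numbers. The following is only a minimal sketch, assuming a Linux or Windows guest (not the ESXi host itself) with one test directory on a disk backed by local storage and one on a disk backed by the SAN datastore. The function name and the two paths are hypothetical placeholders, not anything from the thread.

```python
import os
import tempfile
import time


def timed_write_mb_per_s(directory, size_mb=64, chunk_mb=1):
    """Write roughly size_mb of data into `directory`, fsync it, and return MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the data out to the backing storage
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file
    return (chunks * chunk_mb) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Hypothetical test directories; replace with real paths in your environment.
    targets = [("local", "/mnt/local_test"), ("san", "/mnt/san_test")]
    for label, path in targets:
        if os.path.isdir(path):
            print(f"{label}: {timed_write_mb_per_s(path):.1f} MB/s")
        else:
            print(f"{label}: {path} not found, skipped")
```

A large gap between the two figures would be consistent with the SAN path being the bottleneck, although a single run like this is a hint rather than proof.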



  • @IRJ If you have an EqualLogic, SAN HQ is an awesome tool for troubleshooting. You can check IOPS and latency, and see if there are any TCP retransmits going on.
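
For a quick second opinion from inside a guest, latency and an approximate IOPS figure can be estimated with small synced writes. This is a rough sketch only, not a replacement for the SAN's own monitoring, and it does not look at TCP retransmits. The function names, the default block size, and the sample count are assumptions chosen for illustration.

```python
import os
import statistics
import tempfile
import time


def sync_write_latencies_ms(directory, count=200, block_size=4096):
    """Time `count` small writes, each followed by fsync; return latencies in ms."""
    block = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(count):
                start = time.perf_counter()
                f.write(block)
                f.flush()
                os.fsync(f.fileno())  # wait until the storage acknowledges the write
                latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.remove(path)
    return latencies


def summarize(latencies_ms):
    """Return median, 95th percentile, and an approximate synced-write IOPS figure."""
    ordered = sorted(latencies_ms)
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    total_s = sum(ordered) / 1000.0
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": p95,
        "approx_iops": len(ordered) / total_s if total_s > 0 else float("inf"),
    }
```

Calling `summarize(sync_write_latencies_ms(path))` once for a directory on local storage and once for a directory on the SAN-backed disk gives two summaries to compare; a much higher p95 on the SAN side would point in the same direction as the IOPS theory.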



  • @sartre13776 said:

    @IRJ If you have an EqualLogic, SAN HQ is an awesome tool for troubleshooting. You can check IOPS and latency, and see if there are any TCP retransmits going on.

    We paid an MSP to configure it. I created a ticket with them. Being able to prove it was the EqualLogic is huge, though.