I did a routine power-down of all our servers at the weekend, as the utility company was doing some maintenance on the electricity supply.
When I restarted the hosts, one of them failed to restart. Initially it hung at the ESXi boot screen on "loading power management". I did a forced reboot and it went past there but hung on "Loading module hpsa". I did another reboot and it hung at the same place ("Loading module hpsa"). I didn't get any HP boot errors. By this stage I was getting extremely worried.
I left it for an hour, and then did another reboot. This time it booted ok. Hallelujah.
Googling, I found one forum post where someone rebooted 5 times before it worked, but the thread ended without any solution or explanation of the cause.
I've read that there is a known problem with hpsa driver version 5.x.0.58-1. When I run:
esxcli software vib list | grep -i hpsa
I get:
scsi-hpsa 5.0.0-40OEM.500.0.0.472560
Not sure if this means my driver is ok. I'm reluctant to do anything in the short term that will require another reboot as I have zero confidence in this machine booting successfully.
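For what it's worth, here is a rough sketch of how I'd check a version string against that pattern. It assumes the problem release would show up in the VIB list as a version beginning 5.x.0.58-1, which is my reading of what I found rather than something I've confirmed. The function name `is_suspect_hpsa` is just something I made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a scsi-hpsa VIB version string looks like
# the 5.x.0.58-1 release. ASSUMPTION: the affected driver appears in
# "esxcli software vib list" as 5.<something>.0.58-1<suffix>.
is_suspect_hpsa() {
  case "$1" in
    5.*.0.58-1*) return 0 ;;   # matches the 5.x.0.58-1 pattern
    *)           return 1 ;;   # anything else
  esac
}

# Version string reported by my host (copied from the output above).
ver="5.0.0-40OEM.500.0.0.472560"

if is_suspect_hpsa "$ver"; then
  echo "$ver: matches the 5.x.0.58-1 pattern"
else
  echo "$ver: does not match the 5.x.0.58-1 pattern"
fi
```

On that basis my 5.0.0-40 driver doesn't look like the affected release, but that only holds if my assumption about the version format is right.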
This is a ProLiant DL380 G6. We have two identical hosts, both with the same ESXi and hpsa version. The other host booted fine. There are no errors listed under Hardware Status in vSphere Client.
I've looked through vmkernel.log to see if I can spot anything obvious, but it only covers the last, successful boot, not the previous boots that hung. I do see the following warning, which is repeated constantly:
WARNING: NFS: 221: Got error 2 from mount call
The host only uses local storage. However, in vSphere Client, under datastores, it is listing:
VeeamBackup_VEEAM1 (inactive) (unmounted).
What is this all about? The other hosts don't have this datastore and our Veeam server is powered on and has successfully backed up the host, so I'm not seeing any operational issues. I don't think this problem is related to my boot problems, but I'd like to get it cleaned up anyway.
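Assuming this really is a stale NFS mount (the NFS warning in the log suggests so, but I haven't confirmed it), my understanding is that it could be inspected and removed from the ESXi shell along these lines. This is a sketch I have not run yet; the volume name is taken from what vSphere Client shows, and the remove line is left commented out on purpose.

```shell
#!/bin/sh
# Only meaningful in the ESXi shell; does nothing on a machine without esxcli.
if command -v esxcli >/dev/null 2>&1; then
  # List the NFS datastores this host knows about, with their mount state.
  esxcli storage nfs list

  # ASSUMPTION: the stale entry is listed under the name shown in vSphere Client.
  # Uncomment only after confirming nothing still depends on this datastore.
  # esxcli storage nfs remove --volume-name=VeeamBackup_VEEAM1
fi
```

Neither command should need a reboot, which matters given how little I trust this host to come back up.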
I'm stuck on how to proceed. Any advice, please?