Planned power outage: best practice
-
Yeah, I understood the question. I was simply basking in the humor of the detailed explanation that, in reality, equaled "I did nothing at all" but made it sound like an action plan. I'm not really confident that there is a better plan, so I don't want to weigh in on that just yet.
-
Well, "do nothing" is my default plan for everything in life
-
Risks of the current plan obviously include the possibility of a power surge, or decreased power. I'm sure your UPS setup is ample protection against that, but that means you are counting on that to work properly; and that's never as safe as avoiding it altogether, which would be powering down before the outage and powering back up manually after power is restored.
-
@Carnival-Boy said:
Well, "do nothing" is my default plan for everything in life
Then hats off to you for making it sound impressive and making it look like effort was involved. That should be enough to throw management off the scent.
-
To be fair, I've spent considerable energy thinking about this. Also, I still have to shut down all the other VMs and put the other hosts in maintenance mode, so I'm doing some work.
-
What UPS systems do you have? A number of them will attach to either the ESXi host or the vCenter VM and send the shutdown signal when the battery reaches a low power threshold.
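To illustrate the low-threshold idea, here is a minimal Python sketch of the decision such an agent makes. The `UpsStatus` fields, the function name, and the 10-minute default are all made up for this example; a real agent would read these values from the vendor's software or management card.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of what a UPS agent might report."""
    on_battery: bool        # True once mains power has dropped
    runtime_minutes: float  # estimated runtime left on battery


def should_shut_down(status: UpsStatus, threshold_minutes: float = 10.0) -> bool:
    """Return True when the hosts should begin a clean shutdown.

    Triggers only while on battery, and only once the estimated
    runtime has fallen to or below the threshold.
    """
    return status.on_battery and status.runtime_minutes <= threshold_minutes
```

The threshold should leave enough runtime for every VM and host to finish shutting down cleanly, so it is worth timing a full shutdown before picking a number.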
-
Shutting down is the main problem. Will a ProLiant server autostart when power is returned if the UPS has shut it down cleanly? I'm guessing not.
Shutting down is a bit of a problem, as I have no way of shutting down remotely as the firewall is a VM. But I also need to start up automatically.
-
@Carnival-Boy said:
Shutting down is the main problem. Will a ProLiant server autostart when power is returned if the UPS has shut it down cleanly? I'm guessing not.
Shutting down is a bit of a problem, as I have no way of shutting down remotely as the firewall is a VM. But I also need to start up automatically.
Hmmm... we have an IBM server that does exactly that. Restarts as soon as power is restored. It has been tested fairly regularly as we don't have the best power infrastructure in upstate NY.
-
Why don't you have an out of band solution like DRAC or iLO? Keep necessary networking equipment on its own UPS, small ones could last for days, then remote in and fire up. I would buy one of those IP KVMs and have a workstation I could access at all times available. Then you can fix other problems as they come up.
-
We have a whole-building UPS for the DC and offices. The offices have separate outlets that only computers and monitors plug into, and in the DC all the power goes through a 20 kVA UPS. In the DC it runs for 45 minutes and will shut down automatically before that if we do not turn on the generator (normally we won't, because of the unneeded cost of running it). Desktop users have to shut down manually.
-
@PSX_Defector said:
Why don't you have an out of band solution like DRAC or iLO? Keep necessary networking equipment on its own UPS, small ones could last for days, then remote in and fire up. I would buy one of those IP KVMs and have a workstation I could access at all times available. Then you can fix other problems as they come up.
You mean do it properly instead of trying to get away with a cheap and dirty solution? Yes, I probably need to do that. I won't have time to sort it out for this weekend, though.
-
Because his firewall is a VM, once the server is down, he has no way to bring it back up. iDRAC won't matter in this case. He'd have to go back to an external firewall (external to that VM host).
-
@Carnival-Boy said:
Whilst this plan works, I'm concerned about the forced shutdown of ESXi and whether this gives a risk of corruption. But I have no other way of remotely restarting everything as the firewall is a VM.
A forced (e.g. power loss) power down always incurs risk. If you have flash or battery backed RAID controllers the risk is much lower but there is always risk that something will fail. As long as you have flash or a battery backed cache then the design is to survive a sudden loss of power. So since those are minimum specs for a business class server since the introduction of hardware RAID, you should be fine as long as your batteries are still healthy (flash backed effectively lasts forever, more or less.) The only major risk here, as long as you have that, is that you are allowing the system to depend on those for protection. Minor, but worth mentioning.
Just set the VMs to power themselves on - at least the one VM for the firewall. The others you can bring up manually.
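A rough sketch of that split, in Python: only the VMs flagged for autostart come up with the host, in a fixed order, and everything else waits for someone to remote in. The `Vm` class, the field names, and the VM names are hypothetical and stand in for whatever the hypervisor's own autostart settings provide.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    """Hypothetical record of one VM's startup settings."""
    name: str
    autostart: bool = False  # power on with the host?
    order: int = 0           # lower numbers start first


def startup_plan(vms):
    """Split VMs into (automatic, manual) name lists.

    Automatic VMs are sorted by their start order; manual VMs
    keep the order in which they were listed.
    """
    automatic = sorted((vm for vm in vms if vm.autostart), key=lambda vm: vm.order)
    manual = [vm for vm in vms if not vm.autostart]
    return [vm.name for vm in automatic], [vm.name for vm in manual]


# Example inventory: only the firewall is set to come up on its own.
inventory = [
    Vm("fileserver"),
    Vm("firewall", autostart=True, order=1),
    Vm("mail"),
]
auto, manual = startup_plan(inventory)
```

With the firewall back automatically, remote access is restored and the remaining VMs can be started by hand.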
-
Something else to keep in mind.... a UPS is designed to supply continuous, clean power under load. It is not designed to necessarily do so when failing - the idea being that a UPS is for temporary dips or loss of power and that either planned shutdown or generator power or full power restore happen before they fail. Failing a UPS means that you are doing battery damage to the UPS itself (this is lead acid still in this day and age) and potentially sending a surge or dip down the line to the unprotected servers at the time of failure. So even if your drives are safe you are stressing the servers potentially beyond what you might intend. Again, not major, but worth mentioning that the protection of the gear is being violated here so what might feel really safe may not be as safe as it feels.
No more dangerous than having no UPS and letting the power spikes, dips and drops from the main grid hit the servers.
-
@coliver said:
Hmmm... we have an IBM server that does exactly that. Restarts as soon as power is restored. It has been tested fairly regularly as we don't have the best power infrastructure in upstate NY.
In central NY that is. Here in Frankfort where I am today the power is outstanding and in the Buffalo/Niagara grid it is really good. But outside of local generation pockets, the center of the state is the area where it is roughest.