@DustinB3403 said in A Public Post Mortem of An Outage:

Wow, that is a rather long time.

Yup, parts were very hard to get, and getting the server physically moved before diagnostics could begin ate up a huge amount of time. The cost of speeding things up would have been huge: it would have meant replacing gear instead of repairing it. And since the vendor could not diagnose the hardware issue (the error messages were ones they did not have documented), that complicated things greatly.