Windows Server 2003 Cluster Dead
-
Obviously, once the Q (Quorum) disk was missing, the node could not join the cluster.
-
Doing a controller power cycle now. Bringing down the physical cluster now, then the DAS. Then going to power on the DAS, give it time, and bring the nodes up. Expect very little, but it is a place to start.
-
DAS is powering up, lights on it do not look good.
-
10 drives in the array, believed to be RAID 10. 2 drives in RAID 1 as well.
-
One drive in the large array is flashing orange, so it looks like one drive has failed.
All other drives are green.
-
Bringing up Node 1 again now. With only one drive failed in the DAS unit, any RAID (other than RAID 0) should have survived.
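A minimal sketch of that reasoning, assuming the big array really is RAID 10 built from mirrored pairs of adjacent drives. The pairing layout and the function names are hypothetical; the actual controller layout is not known from this thread.

```python
def raid10_survives(failed, drive_count=10):
    """Return True if every mirror pair still has at least one working drive.

    Assumption: drives (0,1), (2,3), ... form the mirror pairs. A real
    controller may pair drives differently.
    """
    if drive_count % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    failed = set(failed)
    pairs = [(i, i + 1) for i in range(0, drive_count, 2)]
    # The array is lost only when both halves of some mirror are gone.
    return all(not (a in failed and b in failed) for a, b in pairs)


def raid0_survives(failed):
    """RAID 0 has no redundancy, so any failed drive loses the array."""
    return len(set(failed)) == 0


print(raid10_survives([3]))     # one failed drive
print(raid10_survives([2, 3]))  # both drives of one mirror pair
print(raid0_survives([3]))      # one failed drive, no redundancy
```

With a single failed drive, the RAID 10 check passes no matter which drive it is. It only fails when both drives of the same pair are gone.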
-
Okay, that process brought things up. Not the cluster, but the disks are back. We can see the Quorum plus other disks now.
-
Trying to bring up Node 2 now, but I'm not hopeful on that.
-
Node 1 is healthy, Node 2 is gone. Cluster won't come up, but the workloads did. So they are good for now.
-
Do they have a plan to replace this outdated tech with something current?
-
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
-
@scottalanmiller said in Windows Server 2003 Cluster Dead:
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
Love it when that happens!!
-
@FATeknollogee said in Windows Server 2003 Cluster Dead:
@scottalanmiller said in Windows Server 2003 Cluster Dead:
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
Love it when that happens!!
The system is still f***** because they have to replace it today and they have to worry about good backups today.
2003 is ancient
-
I'm guessing the thing hasn't been maintained at all. Maintenance would have brought this about sooner, but in a controlled manner.
-
@DustinB3403 said in Windows Server 2003 Cluster Dead:
@FATeknollogee said in Windows Server 2003 Cluster Dead:
@scottalanmiller said in Windows Server 2003 Cluster Dead:
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
Love it when that happens!!
The system is still f***** because they have to replace it today and they have to worry about good backups today.
2003 is ancient
Backups are running today.
-
@Obsolesce said in Windows Server 2003 Cluster Dead:
I'm guessing the thing hasn't been maintained at all. Maintenance would have brought this about sooner, but in a controlled manner.
Pretty much. We weren't even told about it. Not that we needed to be; we consult for this customer, we aren't their outsourced IT.
-
Well, that's a Fuster Cluck and a half.
-
And that's why we call them IPOD (Inverted Pyramid of Doom). Welcome them to the club, hopefully doing it correctly this time!
-
@travisdh1 said in Windows Server 2003 Cluster Dead:
And that's why we call them IPOD (Inverted Pyramid of Doom). Welcome them to the club, hopefully doing it correctly this time!
Yeah, we mentioned that on the call. But it predated the people who are there now (it even predated their CAREERS!). It's such an old system. When a system is 16 years old, it's actually not that common to find people who were actively working in IT at that time. If you assume most people don't start in IT until the age at which they would have finished college, that's 23. Add sixteen career years, that's 39. Add a year for planning the project before it was purchased, and you are at age 40. So only people likely to be 40+, who started in IT right away and didn't move from another career, could reasonably be expected to have been in the field at the time that the system was decided on! That's nuts.
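The back-of-the-envelope math above, as a tiny sketch. The function and parameter names are made up for illustration; the numbers are the ones from the post.

```python
def minimum_age_today(start_age=23, system_age_years=16, planning_years=1):
    """Youngest likely age of someone who was in IT when the system was planned.

    Assumes they started in IT straight out of college and never changed careers.
    """
    return start_age + system_age_years + planning_years


print(minimum_age_today())  # 23 + 16 + 1
```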
-
This really is a good example of why the IPOD is so bad. The "never fails" DAS failed, but at least it didn't lose the data; it just caused a large panic outage.
But there are three boxes instead of one: two nodes plus the DAS. And two of them failed, one completely (Node 2) and one partially (the DAS). Had only Node 1 been purchased, they would have had no failures and no outage at all, and made it sixteen years at about one quarter the cost.
The "just buy one server" approach here would have kicked the crap out of the IPOD on reliability! None of the redundancy on this system was ever used, but because that redundancy was there, it caused things to fail that should not have.
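To put rough shape on that, here is a very simplified availability sketch. The availability figures are invented purely for illustration, not measured from this system, and the model assumes independent failures and ignores the extra failure modes the clustering software itself adds (like a node refusing to join without the Quorum disk).

```python
def single_server_availability(node):
    """One standalone server: the workload is up whenever the server is up."""
    return node


def ipod_availability(node, das):
    """Two nodes sharing one DAS.

    The workload needs the DAS AND at least one of the two nodes,
    so the shared DAS caps the availability of the whole stack.
    """
    at_least_one_node = 1 - (1 - node) ** 2
    return das * at_least_one_node


# Hypothetical numbers: every box is 99% available on its own.
node = das = 0.99
print(single_server_availability(node))
print(ipod_availability(node, das))
```

Under these made-up numbers, the IPOD comes out slightly below the single server, because everything still hangs off the one DAS. The second node cannot make up for that.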