Windows Server 2003 Cluster Dead
-
Dealing with a Windows Server 2003 two-node cluster with an HP StorageWorks 500 G2 DAS unit (SCSI attached). The cluster died this morning.
Node 1 is up and running, but cannot start the cluster. Cluster Manager opens and doesn't even list the cluster. We assume that because of this, the DAS (RAID via SCSI) shared storage is not mounted, since the tool to mount it never fires.
Node 2 is down. It gives no output on the console (blank screen) and cannot be pinged, so the assumption is that the hardware has died.
In theory the cluster's purpose was to fail over. But now it appears that the cluster itself has caused the outage. Anyone know how to get this fixed and up and running with the remaining node?
-
Looks like the power went out on the node last night, which might have been a trigger.
-
Yesterday morning, we got a SCSI error that the RAID didn't respond in time.
-
The power down happened about 18 hours after the SCSI error.
-
The last event log entry from Node 2 was 7 hours after the SCSI event and 11 hours before the Node 1 power cycle.
-
13 hours AFTER the power cycle, the event log reports that the Quorum disk "Q" cannot be found.
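
To keep the sequence straight, here is a rough sketch that lays the events out as offsets from the SCSI error. The hour values are the ones reported in the posts above; the variable names and the `hours` helper are just for illustration, and the real clock times are not known here.

```python
from datetime import timedelta

# Offsets in hours, measured from the SCSI timeout error (T+0).
# The values come from the posts above; the names are illustrative only.
scsi_error = timedelta(hours=0)
node2_last_log = scsi_error + timedelta(hours=7)    # last Node 2 event log entry
power_cycle = scsi_error + timedelta(hours=18)      # power down / power cycle
quorum_missing = power_cycle + timedelta(hours=13)  # "Q" disk reported missing


def hours(delta):
    """Convert a timedelta to whole hours."""
    return int(delta.total_seconds() // 3600)


timeline = [
    ("SCSI error: RAID did not respond in time", scsi_error),
    ("Last event log entry from Node 2", node2_last_log),
    ("Power cycle", power_cycle),
    ("Event log: Quorum disk Q cannot be found", quorum_missing),
]

for label, offset in sorted(timeline, key=lambda entry: entry[1]):
    print(f"T+{hours(offset):02d}h  {label}")

# Sanity check: Node 2 went quiet this many hours before the power cycle.
print("Node 2 silent before power cycle:", hours(power_cycle - node2_last_log), "hours")
```

If the numbers are right, Node 2 had already gone silent well before the power event, which suggests the power cycle was not the first thing to go wrong.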
-
Obviously, once the Q disk was missing, Node 1 could not join the cluster.
-
Doing a controller power cycle now. Bringing down the physical cluster now, then the DAS. Then going to power on the DAS, give it time, and bring the nodes up. Expect very little, but it is a place to start.
-
DAS is powering up, lights on it do not look good.
-
10 drives in the array, believed to be RAID 10. 2 drives in RAID 1 as well.
-
One drive in the large array is flashing orange, so looks like one drive has failed.
All other drives are green.
-
Bringing up Node 1 again now. With only one drive failed in the DAS unit, any RAID (other than RAID 0) should have survived.
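
A quick sketch of that reasoning, as a simplified model only. It assumes RAID 10 mirrors drives in adjacent pairs (0,1), (2,3) and so on, which may not match how this controller actually lays things out, and the `survives` function is made up for illustration, not anything from the HP tooling.

```python
def survives(level, drives, failed):
    """Return True if the array still holds all its data.

    level  -- "RAID0", "RAID1", "RAID5", "RAID6" or "RAID10"
    drives -- total number of drives in the array
    failed -- set of failed drive indices (0-based)
    """
    if level == "RAID0":
        # Striping only: any failed drive loses the array.
        return len(failed) == 0
    if level == "RAID1":
        # Mirroring: fine as long as one copy is left.
        return len(failed) < drives
    if level == "RAID5":
        # Single parity: tolerates one failed drive.
        return len(failed) <= 1
    if level == "RAID6":
        # Double parity: tolerates two failed drives.
        return len(failed) <= 2
    if level == "RAID10":
        # ASSUMPTION: mirror pairs are adjacent drives (0,1), (2,3), ...
        # The array survives unless both halves of one pair are gone.
        pairs = [(i, i + 1) for i in range(0, drives, 2)]
        return all(not (a in failed and b in failed) for a, b in pairs)
    raise ValueError(f"unknown RAID level: {level}")


# One failed drive in a 10-drive array, as seen on the DAS.
for level in ("RAID0", "RAID5", "RAID6", "RAID10"):
    print(level, "survives one failed drive:", survives(level, 10, {3}))
```

In this model the only level that does not ride through a single failed drive is RAID 0. A second failure in RAID 10 is only fatal if it hits the partner of the drive that is already dead.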
-
Okay, that process brought things up. Not the cluster, but the disks are back. We can see the Quorum plus other disks now.
-
Trying to bring up Node 2 now, but I'm not hopeful on that.
-
Node 1 is healthy, Node 2 is gone. Cluster won't come up, but the workloads did. So they are good for now.
-
Do they have a plan to replace this outdated tech with something current?
-
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
-
@scottalanmiller said in Windows Server 2003 Cluster Dead:
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
Love it when that happens!!
-
@FATeknollogee said in Windows Server 2003 Cluster Dead:
@scottalanmiller said in Windows Server 2003 Cluster Dead:
@Danp said in Windows Server 2003 Cluster Dead:
Do they have a plan to replace this outdated tech with something current?
Yes, there was a six month plan in place already, but it just got moved to something like a six day plan.
Love it when that happens!!
The system is still f***** because they have to replace it today and they have to worry about good backups today.
2003 is ancient
-
I'm guessing the thing hasn't been maintained at all. Regular maintenance would have brought this about sooner, but in a controlled manner.