Replacing a Failed Drive in MD RAID 10
-
A few guides keep popping up in Google Search that give instructions on how to do this for RAID 1 mdadm arrays.
@scottalanmiller has even recommended that same guide for RAID 10, along with this one on SW. But again, it covers RAID 1.
So we'll have to work through it and confirm that the steps are still accurate for RAID 10.
-
@DustinB3403 Should be, mdadm still works the same way.
-
@travisdh1 said:
@DustinB3403 Should be, mdadm still works the same way.
Thanks, just being extra cautious to ensure this works smoothly.
To remove the disk from the array I should have to simply type
mdadm --manage /dev/md0 --fail /dev/sdc
and then
mdadm --manage /dev/md0 --remove /dev/sdc
At this point I should be able to shut down the server, remove the disk, and add its replacement with
shutdown -h now
-
Obviously at this point there is some manual labor involved since I have no hot-swap capabilities. If your server has hot-swap you can just pull the drive at this point and add the replacement disk.
-
I'm at a stand-still as I wait for my replacement disk to arrive, so this project will have to get picked up in a day or so.
-
@DustinB3403 said:
@travisdh1 said:
@DustinB3403 Should be, mdadm still works the same way.
Thanks, just being extra cautious to ensure this works smoothly.
To remove the disk from the array I should have to simply type
mdadm --manage /dev/md0 --fail /dev/sdc
and then
mdadm --manage /dev/md0 --remove /dev/sdc
At this point I should be able to shut down the server, remove the disk, and add its replacement with
shutdown -h now
Yep. After putting a replacement drive in, just add it back.
mdadm --manage /dev/md0 --add /dev/sd?
I like to keep an eye on the rebuild process with:
watch cat /proc/mdstat
The array should be back to normal.
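For reference, here is the whole sequence from this thread in one place, as a minimal dry-run sketch. The function name print_replace_steps is made up for illustration, and it only prints the commands rather than running them. It assumes the replacement disk comes back under the same device name, which may not be true on your system.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): prints the mdadm commands discussed
# in this thread for a given array and member disk. Nothing is executed.
print_replace_steps() {
    array="$1"   # e.g. /dev/md0
    disk="$2"    # e.g. /dev/sdc; the new disk may enumerate differently
    echo "mdadm --manage $array --fail $disk"
    echo "mdadm --manage $array --remove $disk"
    echo "# power down and swap the drive here if there is no hot-swap"
    echo "mdadm --manage $array --add $disk"
    echo "watch cat /proc/mdstat"
}

print_replace_steps /dev/md0 /dev/sdc
```

Review the printed lines against your own device names before typing any of them as root.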
-
How did you figure out what drive it was in the array? Or did you pull them until you saw the one with that serial number?
-
@coliver said:
How did you figure out what drive it was in the array? Or did you pull them until you saw the one with that serial number?
How do I know which disk it is?
Well, the other day I noticed that the array had a failed disk. Since I was rebuilding the system anyway, I pulled each disk and ran a check disk from Windows, scanning for bad sectors.
Only 1 disk was found with bad sectors.
Knowing which disk this was, and with Windows saying it fixed the problem, I re-added the disk and simply "remember" which disk had the bad sectors.
So this disk is the disk that has to be removed.
-
@DustinB3403 said:
@coliver said:
How did you figure out what drive it was in the array? Or did you pull them until you saw the one with that serial number?
How do I know which disk it is?
Well, the other day I noticed that the array had a failed disk. Since I was rebuilding the system anyway, I pulled each disk and ran a check disk from Windows, scanning for bad sectors.
Only 1 disk was found with bad sectors.
Knowing which disk this was, and with Windows saying it fixed the problem, I re-added the disk and simply "remember" which disk had the bad sectors.
So this disk is the disk that has to be removed.
OK, so you wouldn't be able to figure this out from the Linux CLI; you would have to have a record of all the serial numbers that are in each bay.
-
@coliver Pretty much.
Since there is no hot-swap function on my server (no indicator lights either) it's simply a matter of my knowing which disk is connected to which SATA port.
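If you do keep a record of serial numbers per bay or SATA port, the Linux side of the match can be read from smartctl. A rough sketch, assuming smartmontools is installed and that `smartctl -i` prints a "Serial Number:" line for your drives. The helper name serial_from_smartctl and the sample text are made up for illustration.

```shell
#!/bin/sh
# Reads `smartctl -i` output on stdin and prints the drive's serial number.
# serial_from_smartctl is a hypothetical helper, not part of smartmontools.
serial_from_smartctl() {
    awk -F': *' '/^Serial Number:/ { print $2 }'
}

# Demo on made-up sample text so it runs without a real disk:
printf 'Device Model:     EXAMPLE-MODEL\nSerial Number:    EXAMPLE-SERIAL-123\n' | serial_from_smartctl

# On the server itself (needs root), something like:
#   smartctl -i /dev/sdc | serial_from_smartctl
```

The serial printed for /dev/sdc can then be compared with the label on the physical drive. On most udev-based systems, `ls -l /dev/disk/by-id/` shows similar model and serial information as symlink names.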
-
So at this point I have the disk marked as failed, and removed from the array as shown below.
As you can see sdc is not a part of the array at the moment, which means nothing will be written to the disk. Obviously I'm in a dangerous point in time.
If I can't get my replacement disk soon, I risk losing the entire array.
Now, because I've already had issues with this array (specifically this disk), I have nothing running on this system that I don't have several backups of. So the drive has been ordered and will be here in a day or so.
At which point I'll shut down the server, remove the bad disk, and put the new one in.
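While the array is running short a member, /proc/mdstat shows an underscore in the status brackets (for example [UU_U]) in place of the missing disk. A small sketch for spotting that, assuming the usual mdstat layout; degraded_arrays is a made-up helper name and the sample text below is illustrative, not output from this server.

```shell
#!/bin/sh
# Reads /proc/mdstat-style text on stdin and prints the name of every
# array whose status brackets contain "_" (a missing or failed member).
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    '
}

# Demo on illustrative sample text; prints md0:
degraded_arrays <<'EOF'
Personalities : [raid10]
md0 : active raid10 sdd[3] sdb[1] sda[0]
      1953260544 blocks super 1.2 512K chunks 2 near-copies [4/3] [UU_U]
EOF

# On the server itself:
#   degraded_arrays < /proc/mdstat
```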
-
While I wait for that drive to arrive, I'm going to figure out how to configure email alerts for the mdadm array, seeing as this would be incredibly useful to have.
I can't sit here watching cat /proc/mdstat....
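A rough starting point for the alerts, not necessarily what ends up being used here: mdadm's monitor mode sends mail to the address given by a MAILADDR line in mdadm.conf. The config path and the address below are assumptions, and mail delivery also depends on a working local mailer on the host.

```shell
#!/bin/sh
# Assumed config line (the file is /etc/mdadm.conf on some systems and
# /etc/mdadm/mdadm.conf on others); the address is a placeholder:
#   MAILADDR admin@example.com

# Send one test alert for every array found, then exit. Needs root.
if command -v mdadm >/dev/null 2>&1; then
    mdadm --monitor --scan --oneshot --test
else
    echo "mdadm not found; nothing to test"
fi

# To keep monitoring in the background afterwards:
#   mdadm --monitor --scan --daemonise
```

If the test mail never arrives, check the local mail setup before blaming mdadm.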
-
@DustinB3403 said:
While I wait for that drive to arrive, I'm going to figure out how to configure email alerts for the mdadm array, seeing as this would be incredibly useful to have.
I can't sit here watching cat /proc/mdstat....
No remote ssh access?
-
@travisdh1 I do have access, but I'm still not going to sit here and watch it.
-
So now that I have the email alerts configured for my Xen Servers, I really want to work on updating SmartCTL so it supports the drives that I have in this server.
Which are pretty common drives.
Western Digital Red 1TB.
I'm really surprised how old of a database is built into XenServer 6.5.
So time to figure this part out.
-
@DustinB3403 said:
So now that I have the email alerts configured for my Xen Servers, I really want to work on updating SmartCTL so it supports the drives that I have in this server.
Which are pretty common drives.
Western Digital Red 1TD.
I'm really surprised how old of a database is built into XenServer 6.5.
So time to figure this part out.
WTF is a TD?
-
@JaredBusch said:
@DustinB3403 said:
So now that I have the email alerts configured for my Xen Servers, I really want to work on updating SmartCTL so it supports the drives that I have in this server.
Which are pretty common drives.
Western Digital Red 1TD.
I'm really surprised how old of a database is built into XenServer 6.5.
So time to figure this part out.
WTF is a TD?
That would be a typo, whoops.
1TB.
-
@JaredBusch said:
@DustinB3403 said:
So now that I have the email alerts configured for my Xen Servers, I really want to work on updating SmartCTL so it supports the drives that I have in this server.
Which are pretty common drives.
Western Digital Red 1TD.
I'm really surprised how old of a database is built into XenServer 6.5.
So time to figure this part out.
WTF is a TD?
TeraDactyl, duh.
It's the amount of storage that can be carried by an unladen teradactyl.
-
@scottalanmiller said:
@JaredBusch said:
@DustinB3403 said:
So now that I have the email alerts configured for my Xen Servers, I really want to work on updating SmartCTL so it supports the drives that I have in this server.
Which are pretty common drives.
Western Digital Red 1TD.
I'm really surprised how old of a database is built into XenServer 6.5.
So time to figure this part out.
WTF is a TD?
TeraDactyl, duh.
It's the amount of storage that can be carried by an unladen teradactyl.
A Jurassic or Triassic TeraDactyl?
-
@JaredBusch said:
@scottalanmiller said:
@JaredBusch said:
@DustinB3403 said:
So now that I have the email alerts configured for my Xen Servers, I really want to work on updating SmartCTL so it supports the drives that I have in this server.
Which are pretty common drives.
Western Digital Red 1TD.
I'm really surprised how old of a database is built into XenServer 6.5.
So time to figure this part out.
WTF is a TD?
TeraDactyl, duh.
It's the amount of storage that can be carried by an unladen teradactyl.
A Jurassic or Triassic TeraDactyl?
I... I don't know!