Dell MD1220 RAID 5 Rebuild Question



  • Hi folks,

    We have an MD1220 DAS in RAID 5 (also out of warranty). It is used by our QA team, and they maxed out the space earlier... just in time for a drive to fail as well!

    The storage is currently unavailable, which should not be the case as only one disk has failed. I can see a rebuild is in progress, but the disk groups show as 'failed'.

    Called Dell; they said that because the array was full, parity won't work and the rebuild will probably fail...

    Does that sound accurate to any of you?

    Best,
    Jim



  • It's RAID 5 with spinning rust, so failure to rebuild is very likely. Automatic rebuild just ups the ante by taking away any chance you have to run a backup before attempting a rebuild. Hope the QA team doesn't care about the data on it, and that the backups are good if the data is at all important.



  • @travisdh1 said in Dell MD1220 Rebuild Question:

    It's RAID 5 with spinning rust, so failure to rebuild is very likely. Automatic rebuild just ups the ante by taking away any chance you have to run a backup before attempting a rebuild. Hope the QA team doesn't care about the data on it, and that the backups are good if the data is at all important.

    I didn't really type my question out properly. Apologies. What I'm asking is whether anybody has heard of full arrays never rebuilding 'due to parity', as Dell just told me... not specifically about backups/RAID 5.

    I would expect that if the rebuild is successful, I would see a full drive come online, just with 0 space available. What Dell is saying is that because it's full, it just won't work...



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    I didn't really type my question out properly. Apologies. What I'm asking is whether anybody has heard of full arrays never rebuilding 'due to parity', as Dell just told me... not specifically about backups/RAID 5.

    I would expect that if the rebuild is successful, I would see a full drive come online, just with 0 space available. What Dell is saying is that because it's full, it just won't work...

    What they said is true, but they used the wrong terms and/or didn't follow the thought all the way through. It's all about whether a drive experiences a URE (unrecoverable read error), in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.

    I'm guessing the person was just reading from a script and doesn't actually know much.

    Shibboleet, anyone? https://xkcd.com/806/



  • @travisdh1 said in Dell MD1220 Rebuild Question:

    What they said is true, but they used the wrong terms and/or didn't follow the thought all the way through. It's all about whether a drive experiences a URE (unrecoverable read error), in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.

    I'm guessing the person was just reading from a script and doesn't actually know much.

    Shibboleet, anyone? https://xkcd.com/806/

    So, if we do not hit a URE, the rebuild will go fine and just show our full drive?



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    So, if we do not hit a URE, the rebuild will go fine and just show our full drive?

    So long as no other failures happen, yes. That's still a big gamble because of how much stress the drives are under while rebuilding. Can you see what percentage it's at and how long it's been running the rebuild?



  • @travisdh1 said in Dell MD1220 Rebuild Question:

    So long as no other failures happen, yes. That's still a big gamble because of how much stress the drives are under while rebuilding. Can you see what percentage it's at and how long it's been running the rebuild?

    But why isn't the array available during rebuilding? Usually it should be.



  • @travisdh1 said in Dell MD1220 Rebuild Question:

    So long as no other failures happen, yes. That's still a big gamble because of how much stress the drives are under while rebuilding. Can you see what percentage it's at and how long it's been running the rebuild?

    Capture.PNG

    I guess each of the 'pending' will be done in turn, whilst hoping for no URE?



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    I guess each of the 'pending' will be done in turn, whilst hoping for no URE?

    Essentially, yes.

    You also have to hope that the array being full doesn't throw another monkey wrench into the works.

    I'd be planning for failure now, while hoping for a successful rebuild.



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    I guess each of the 'pending' will be done in turn, whilst hoping for no URE?

    How many quorums do you need? (I know, beside the point.)



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    Called Dell; they said that because the array was full, parity won't work and the rebuild will probably fail...

    Dell seems pretty confused about how RAID works in general. A lot of their stuff is mislabeled. For example, they don't even offer RAID 10; what they sell is really RAID 100, but even Dell doesn't realize this. They aren't a group with a good understanding of storage in general.



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    Does that sound accurate to any of you?

    It can't be accurate if "RAID 5" and "one drive failed" are both true. Basically they are claiming that one of the three things is a falsehood. But which one?



  • @travisdh1 said in Dell MD1220 Rebuild Question:

    It's RAID 5 with spinning rust, so failure to rebuild is very likely. Automatic rebuild just ups the ante by taking away any chance you have to run a backup before attempting a rebuild. Hope the QA team doesn't care about the data on it, and that the backups are good if the data is at all important.

    Yes, but that's not really part of the question at this point.



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    I didn't really type my question out properly. Apologies. What I'm asking is whether anybody has heard of full arrays never rebuilding 'due to parity', as Dell just told me... not specifically about backups/RAID 5.

    No, it is a brazen lie. They figured that they could just get you off of the phone.



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    I would expect that if the rebuild is successful, I would see a full drive come online, just with 0 space available. What Dell is saying is that because it's full, it just won't work...

    At the RAID level, empty or full is identical. RAID, by definition, can't tell what is used on top of it. Whoever you talked to either knows nothing about RAID, or hoped that you didn't and didn't want to actually admit that the unit wasn't working properly.
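
    A rough illustration of that point, as a minimal Python sketch rather than anything the MD1220 or its controller actually runs: RAID 5 parity is a bytewise XOR across the blocks of a stripe, so a lost block is rebuilt the same way whether the stripe holds files or zeros. The function names and sample data are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (how RAID 5 parity is formed)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Rebuild the one lost block of a stripe from the survivors plus parity."""
    return xor_blocks(surviving_blocks)


# Hypothetical stripes: one holding "real" data, one completely unused.
stripe_full = [b"QA data!", b"more dat", b"a here.."]
stripe_empty = [bytes(8), bytes(8), bytes(8)]

for data in (stripe_full, stripe_empty):
    parity = xor_blocks(data)
    lost = data[1]                        # pretend the second disk died
    survivors = [data[0], data[2], parity]
    assert rebuild_missing(survivors) == lost
```

    The rebuild loop never asks whether a block is "in use"; it only needs every surviving block in the stripe to be readable.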



  • @travisdh1 said in Dell MD1220 Rebuild Question:

    What they said is true, but they used the wrong terms and/or didn't follow the thought all the way through. It's all about whether a drive experiences a URE (unrecoverable read error), in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.

    This is unrelated. Yes, a URE is a risk, but it is a totally different aspect from what is being discussed.



  • @Jimmy9008 said in Dell MD1220 Rebuild Question:

    So, if we do not hit a URE, the rebuild will go fine and just show our full drive?

    If it is truly RAID 5, and truly only one drive died, and you don't hit a URE, and the MD1220 actually works (it's not known for that) then you will recover just fine.
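
    To put a rough number on the URE part, here is a back-of-the-envelope sketch. It assumes read errors are independent and occur at the drive's quoted rate (commonly specified as one error per 10^14, 10^15, or 10^16 bits read, depending on drive class). The drive count and size are hypothetical, not taken from this array, and real-world behavior can differ a lot from this simple model.

```python
import math


def rebuild_ure_probability(surviving_drives, drive_tb, bits_per_ure):
    """Chance of at least one URE while reading every surviving drive in full.

    Simplified model: each bit read fails independently with
    probability 1 / bits_per_ure.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_ure))


# Hypothetical RAID 5 set: 6 drives of 1 TB each, so 5 survivors to read.
for bits_per_ure in (1e14, 1e15, 1e16):
    p = rebuild_ure_probability(5, 1.0, bits_per_ure)
    print(f"1 URE per {bits_per_ure:.0e} bits: {p:.2%} chance of hitting one")
```

    The result depends only on how many bits must be read and the drive's error rate, not on how full the file system on top happens to be.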



  • @Pete-S said in Dell MD1220 Rebuild Question:

    But why isn't the array available during rebuilding? Usually it should be.

    Right, this is where something is wrong. The drive should be slower than usual (maybe a LOT slower), but should keep on working just fine. If RAID has to go offline during a rebuild, it's a broken RAID system.



  • @Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:

    I guess each of the 'pending' will be done in turn, whilst hoping for no URE?

    You have many separate RAID arrays on one set of disks?



  • @scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:

    You have many separate RAID arrays on one set of disks?

    New job I started a few weeks ago. Not sure what's been set up where, or why, yet.

    No backups, lol. But that will get sorted after possibly rebuilding everything if the array fails.

    Looks like some disks have been put into a group as pooled resources for a cluster, but this is my first time looking at it. At least it's only QA.



  • @Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:

    New job I started a few weeks ago. Not sure what's been set up where, or why, yet.

    Been there done that.

    I was at a job 2 months as their helpdesk guy. I was "promoted" when the prior admin left.

    I barely had access to anything and the email server had a RAID 5 failure.
    At the time, I was in the delivery room with my wife an hour before my first child was born.

    It was not a good time. That place was and still is a shit hole.



  • @JaredBusch said in Dell MD1220 RAID 5 Rebuild Question:

    Been there done that.

    I was at a job 2 months as their helpdesk guy. I was "promoted" when the prior admin left.

    I barely had access to anything and the email server had a RAID 5 failure.
    At the time, I was in the delivery room with my wife an hour before my first child was born.

    It was not a good time. That place was and still is a shit hole.

    Ouch, at least this is only QA. Worst case, I rebuild the cluster and let the QA team build all their shit again.

    They have cash available for sorting the mess out, that's fine. It's just that it's typical to go wrong!

    Did you leave them?



  • @Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:

    Ouch, at least this is only QA. Worst case, I rebuild the cluster and let the QA team build all their shit again.

    They have cash available for sorting the mess out, that's fine. It's just that it's typical to go wrong!

    Did you leave them?

    Yeah, looks like people were just throwing money around, not that they did something horrible. Sucks now, but far from being a big deal.



  • Well, yeah... But no. Everything here is blades. Lol.



  • @Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:

    Well, yeah... But no. Everything here is blades. Lol.

    Oh, full on fail. Good thing that they brought you in!



  • The RAID array rebuilt successfully. But the pools still won't come online. Do I need to run a force online?



  • @Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:

    The RAID array rebuilt successfully. But the pools still won't come online. Do I need to run a force online?

    Perhaps. Worth a try. They should never have gone offline, though. Something else failed in this process. But what, who knows.



  • @scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:

    Perhaps. Worth a try. They should never have gone offline, though. Something else failed in this process. But what, who knows.

    Yep, spot on.

    I'm planning to move away from what they have pretty soon, within the next 2-3 months.

    Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.



  • @Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:

    Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.

    What kind of workload do you run? R940s are awesome for Linux, terrible for Windows.



  • @scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:

    What kind of workload do you run? R940s are awesome for Linux, terrible for Windows.

    Windows Server 2012 R2 to 2019, mostly. Around 500 VMs. Why is the R940 so bad?