Dell MD1220 RAID 5 Rebuild Question
-
@Jimmy9008 said in Dell MD1220 Rebuild Question:
I would expect that if the rebuild is successful, I would see a full drive come online, but available with 0 space. What Dell are saying is that as it's full, it just won't work...
At the RAID level, empty or full is identical. RAID, by definition, can't tell what is used on top of it. Whoever you talked to either knows nothing about RAID, or hoped that you didn't and didn't want to actually admit that the unit wasn't working properly.
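To illustrate why full versus empty makes no difference at the RAID level, here is a minimal sketch of single-parity (RAID 5 style) reconstruction for one stripe. The three-drive layout, the block contents, and the function names are made up for illustration; this is not how the MD1220 controller is implemented, only the general XOR idea.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks):
    """Reconstruct the block that lived on the failed drive for one stripe.

    With single parity, the parity block is the XOR of the data blocks,
    so any one missing block is the XOR of all the surviving blocks.
    """
    return xor_blocks(surviving_blocks)

# Hypothetical 3-drive stripe: two data blocks plus one parity block.
d0 = b"\x00\x00\x00\x00"  # "empty" space as the filesystem sees it
d1 = b"\xde\xad\xbe\xef"  # "used" space as the filesystem sees it
parity = xor_blocks([d0, d1])

# Losing the "used" block or the "empty" block triggers exactly the same work.
rebuilt_d1 = rebuild_missing([d0, parity])
rebuilt_d0 = rebuild_missing([d1, parity])
```

The rebuild reads every surviving block and XORs them, whatever the bytes mean to the filesystem above, so how full the volume is never enters into it.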
-
@travisdh1 said in Dell MD1220 Rebuild Question:
What they said is true, but they used the wrong terms and/or didn't follow completely through with the thought. It's all about whether a drive experiences a URE, in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.
This is unrelated. Yes, a URE is a risk, but it is a totally different aspect than what is being discussed.
-
@Jimmy9008 said in Dell MD1220 Rebuild Question:
So, if we do not hit URE the rebuild will go fine, and just show our full drive?
If it is truly RAID 5, and truly only one drive died, and you don't hit a URE, and the MD1220 actually works (it's not known for that) then you will recover just fine.
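For a rough feel of the "you don't hit a URE" condition, here is a back-of-the-envelope sketch. It assumes every bit read from the surviving drives fails independently at the spec-sheet URE rate, which is a simplification; real drives often do better than the published figure. The drive count, drive size, and rate below are hypothetical, not the values from this array.

```python
import math

def p_ure_during_rebuild(surviving_drives, drive_bytes, ure_per_bit):
    """Probability of at least one URE while reading every surviving drive in full.

    Assumes independent bit errors at a constant rate (a simplification).
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Hypothetical example: 5 surviving 1 TB drives, 1 error per 1e14 bits read.
example = p_ure_during_rebuild(surviving_drives=5, drive_bytes=10**12, ure_per_bit=1e-14)
print(f"Chance of at least one URE during rebuild: {example:.1%}")
```

Plugging in the real drive count, capacity, and the URE rate from the drive's data sheet gives an estimate for a specific array, with the same caveat about the independence assumption.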
-
@Pete-S said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
@Jimmy9008 said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
@Jimmy9008 said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
It's RAID 5 with spinning rust, failure to rebuild is very likely. Automatic rebuild just ups the ante by taking away any chance you have to run a backup before attempting a rebuild. Hope the QA Team doesn't care about the data on it, and the backups are good if the data is at all important.
I didn't really type my question out properly. Apologies. What I'm asking is if anybody has heard of full arrays never rebuilding 'due to parity' as Dell just told me... not specifically about backups/RAID 5.
I would expect that if the rebuild is successful, I would see a full drive come online, but available with 0 space. What Dell are saying is that as it's full, it just won't work...
What they said is true, but they used the wrong terms and/or didn't follow completely through with the thought. It's all about whether a drive experiences a URE, in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.
I'm guessing the person was just reading from a script and doesn't actually know much.
Shibboleet, anyone? https://xkcd.com/806/
So, if we do not hit URE the rebuild will go fine, and just show our full drive?
So long as no other failures happen, yes. That's still a big gamble because of how much stress the drives are under while rebuilding. Can you see what percentage it's at and how long it's been running the rebuild?
But why isn't the array available during rebuilding? Usually it should be.
Right, this is where something is wrong. The drive should be slower than usual (maybe a LOT slower), but should keep on working just fine. If RAID has to go offline during a rebuild, it's a broken RAID system.
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
@Jimmy9008 said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
@Jimmy9008 said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
It's RAID 5 with spinning rust, failure to rebuild is very likely. Automatic rebuild just ups the ante by taking away any chance you have to run a backup before attempting a rebuild. Hope the QA Team doesn't care about the data on it, and the backups are good if the data is at all important.
I didn't really type my question out properly. Apologies. What I'm asking is if anybody has heard of full arrays never rebuilding 'due to parity' as Dell just told me... not specifically about backups/RAID 5.
I would expect that if the rebuild is successful, I would see a full drive come online, but available with 0 space. What Dell are saying is that as it's full, it just won't work...
What they said is true, but they used the wrong terms and/or didn't follow completely through with the thought. It's all about whether a drive experiences a URE, in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.
I'm guessing the person was just reading from a script and doesn't actually know much.
Shibboleet, anyone? https://xkcd.com/806/
So, if we do not hit URE the rebuild will go fine, and just show our full drive?
So long as no other failures happen, yes. That's still a big gamble because of how much stress the drives are under while rebuilding. Can you see what percentage it's at and how long it's been running the rebuild?
I guess each of the 'pending' will be done in turn, whilst hoping for no URE?
You have many separate RAID arrays on one set of disks?
-
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
@Jimmy9008 said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
@Jimmy9008 said in Dell MD1220 Rebuild Question:
@travisdh1 said in Dell MD1220 Rebuild Question:
It's RAID 5 with spinning rust, failure to rebuild is very likely. Automatic rebuild just ups the ante by taking away any chance you have to run a backup before attempting a rebuild. Hope the QA Team doesn't care about the data on it, and the backups are good if the data is at all important.
I didn't really type my question out properly. Apologies. What I'm asking is if anybody has heard of full arrays never rebuilding 'due to parity' as Dell just told me... not specifically about backups/RAID 5.
I would expect that if the rebuild is successful, I would see a full drive come online, but available with 0 space. What Dell are saying is that as it's full, it just won't work...
What they said is true, but they used the wrong terms and/or didn't follow completely through with the thought. It's all about whether a drive experiences a URE, in which case parity cannot be calculated for that block. Most array controllers (apparently including the one in your MD1220) will just stop and not continue. Thus the failure 'due to parity'.
I'm guessing the person was just reading from a script and doesn't actually know much.
Shibboleet, anyone? https://xkcd.com/806/
So, if we do not hit URE the rebuild will go fine, and just show our full drive?
So long as no other failures happen, yes. That's still a big gamble because of how much stress the drives are under while rebuilding. Can you see what percentage it's at and how long it's been running the rebuild?
I guess each of the 'pending' will be done in turn, whilst hoping for no URE?
You have many separate RAID arrays on one set of disks?
New job I started a few weeks ago. Not sure what's been set up where or why yet.
No backups, lol. But that will get sorted after possibly rebuilding everything if the array fails.
Looks like some disks have been put in to a group as pooled resources for a cluster, but my first time looking at it. At least it's only QA.
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
New job I started a few weeks ago. Not sure what's been set up where or why yet.
Been there done that.
I was at a job 2 months as their helpdesk guy. I was "promoted" when the prior admin left.
I barely had access to anything and the email server had RAID5 failure.
At the time, I was in the delivery room with my wife an hour before my first child was born. It was not a good time. That place was and still is a shit hole.
-
@JaredBusch said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
New job I started a few weeks ago. Not sure what's been set up where or why yet.
Been there done that.
I was at a job 2 months as their helpdesk guy. I was "promoted" when the prior admin left.
I barely had access to anything and the email server had RAID5 failure.
At the time, I was in the delivery room with my wife an hour before my first child was born. It was not a good time. That place was and still is a shit hole.
Ouch, at least this is only QA. Worst case I rebuild the cluster and let the QA team build all their shit again.
They have cash available for sorting the mess out, that's fine. It's just that it's typical to go wrong!
Did you leave them?
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@JaredBusch said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
New job I started a few weeks ago. Not sure what's been set up where or why yet.
Been there done that.
I was at a job 2 months as their helpdesk guy. I was "promoted" when the prior admin left.
I barely had access to anything and the email server had RAID5 failure.
At the time, I was in the delivery room with my wife an hour before my first child was born. It was not a good time. That place was and still is a shit hole.
Ouch, at least this is only QA. Worst case I rebuild the cluster and let the QA team build all their shit again.
They have cash available for sorting the mess out, that's fine. It's just that it's typical to go wrong!
Did you leave them?
Yeah, looks like people were just throwing money around, not that they did something horrible. Sucks now, but far from being a big deal.
-
Well, yeah... But no. Everything here is blades. Lol.
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Well, yeah... But no. Everything here is blades. Lol.
Oh, full on fail. Good thing that they brought you in!
-
The RAID array rebuilt successfully. But the pools still won't come online. Do I need to run a force online?
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
The RAID array rebuilt successfully. But the pools still won't come online. Do I need to run a force online?
Perhaps. Worth a try. They should never have gone offline, though. Something else failed in this process. But what, who knows.
-
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
The RAID array rebuilt successfully. But the pools still won't come online. Do I need to run a force online?
Perhaps. Worth a try. They should never have gone offline, though. Something else failed in this process. But what, who knows.
Yep, spot on.
I'm planning to move away from what they have pretty soon, next 2-3 months.
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
What kind of workload do you run? R940 are awesome for Linux, terrible for Windows.
-
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
What kind of workload do you run? R940 are awesome for Linux, terrible for Windows.
Windows Server 2012 R2 to 2019, mostly. Around 500 VM. R940, why so bad?
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
What kind of workload do you run? R940 are awesome for Linux, terrible for Windows.
Windows Server 2012 R2 to 2019, mostly. Around 500 VM. R940, why so bad?
Because of MASSIVE licensing penalties. I mean staggering. It's so big that Microsoft has essentially created the one and two CPU server market and 8-core CPU market all on their own. The Windows licensing is so expensive, and so useless at scale, that basically everyone buys more small servers rather than fewer big ones because it turns out to be cheaper while giving you more power.
Servers larger than 16 cores and two sockets are almost exclusively for the Linux market. There are exceptions, but only for enterprise shops who are trapped with massive vertical workloads that only run on Windows which is basically a huge failure in and of itself, so pretty rare.
In reality, though, this reflects Microsoft's understanding of their own market. Big workloads that need huge vertical scaling would be insane to exist on Windows in the first place. So they are simply punishing foolish behavior and making money on people doing things poorly.
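As a rough illustration of how the licensing scales with box size, here is a sketch of the per-core counting used for Windows Server 2016 and 2019: every physical core needs a license, with a minimum of 8 per processor and 16 per server. The core counts below are hypothetical, not actual R940 configurations, and no prices are included because none were given here; check the current Microsoft licensing terms before relying on this.

```python
def core_licenses_required(sockets, cores_per_socket,
                           min_per_socket=8, min_per_server=16):
    """Core licenses needed to fully license one physical host.

    Applies the per-processor and per-server minimums of the per-core model.
    """
    per_socket = max(cores_per_socket, min_per_socket)
    return max(sockets * per_socket, min_per_server)

# Hypothetical hosts: a small 2-socket box versus a large 4-socket box.
small_host = core_licenses_required(sockets=2, cores_per_socket=8)
large_host = core_licenses_required(sockets=4, cores_per_socket=28)
print(f"2 x 8-core host:  {small_host} core licenses")
print(f"4 x 28-core host: {large_host} core licenses")
```

Under these assumed core counts the large host needs seven times the core licenses of the small one, which is the effect being described: the cost tracks physical cores, not the number of servers.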
-
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
What kind of workload do you run? R940 are awesome for Linux, terrible for Windows.
Windows Server 2012 R2 to 2019, mostly. Around 500 VM. R940, why so bad?
Because of MASSIVE licensing penalties. I mean staggering. It's so big that Microsoft has essentially created the one and two CPU server market and 8-core CPU market all on their own. The Windows licensing is so expensive, and so useless at scale, that basically everyone buys more small servers rather than fewer big ones because it turns out to be cheaper while giving you more power.
Servers larger than 16 cores and two sockets are almost exclusively for the Linux market. There are exceptions, but only for enterprise shops who are trapped with massive vertical workloads that only run on Windows which is basically a huge failure in and of itself, so pretty rare.
In reality, though, this reflects Microsoft's understanding of their own market. Big workloads that need huge vertical scaling would be insane to exist on Windows in the first place. So they are simply punishing foolish behavior and making money on people doing things poorly.
I've been told not to worry about that side of things re licensing, otherwise totally agree. That's the licensing team's problem. I've pretty much been told (not that I understand or need to understand) that we are at a certain partnership level with Microsoft and they allow us to use whatever we want, as long as we stay at that level. So, they have said if you want 20 x Windows Server Datacenter licenses or 50, that's fine.
That team could be wrong, but it's not my problem.
-
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
What kind of workload do you run? R940 are awesome for Linux, terrible for Windows.
Windows Server 2012 R2 to 2019, mostly. Around 500 VM. R940, why so bad?
Because of MASSIVE licensing penalties. I mean staggering. It's so big that Microsoft has essentially created the one and two CPU server market and 8-core CPU market all on their own. The Windows licensing is so expensive, and so useless at scale, that basically everyone buys more small servers rather than fewer big ones because it turns out to be cheaper while giving you more power.
Servers larger than 16 cores and two sockets are almost exclusively for the Linux market. There are exceptions, but only for enterprise shops who are trapped with massive vertical workloads that only run on Windows which is basically a huge failure in and of itself, so pretty rare.
In reality, though, this reflects Microsoft's understanding of their own market. Big workloads that need huge vertical scaling would be insane to exist on Windows in the first place. So they are simply punishing foolish behavior and making money on people doing things poorly.
I've been told not to worry about that side of things re licensing, otherwise totally agree. That's the licensing team's problem. I've pretty much been told (not that I understand or need to understand) that we are at a certain partnership level with Microsoft and they allow us to use whatever we want, as long as we stay at that level. So, they have said if you want 20 x Windows Server Datacenter licenses or 50, that's fine.
That team could be wrong, but it's not my problem.
That makes a little more sense. They've negotiated a deal with Microsoft from the sounds of it.
-
@travisdh1 said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
@scottalanmiller said in Dell MD1220 RAID 5 Rebuild Question:
@Jimmy9008 said in Dell MD1220 RAID 5 Rebuild Question:
Looking at two or three Dell R940s, couple of TB of RAM in each box and 4 processors in each. Or possibly a Starwind solution, depending on price.
What kind of workload do you run? R940 are awesome for Linux, terrible for Windows.
Windows Server 2012 R2 to 2019, mostly. Around 500 VM. R940, why so bad?
Because of MASSIVE licensing penalties. I mean staggering. It's so big that Microsoft has essentially created the one and two CPU server market and 8-core CPU market all on their own. The Windows licensing is so expensive, and so useless at scale, that basically everyone buys more small servers rather than fewer big ones because it turns out to be cheaper while giving you more power.
Servers larger than 16 cores and two sockets are almost exclusively for the Linux market. There are exceptions, but only for enterprise shops who are trapped with massive vertical workloads that only run on Windows which is basically a huge failure in and of itself, so pretty rare.
In reality, though, this reflects Microsoft's understanding of their own market. Big workloads that need huge vertical scaling would be insane to exist on Windows in the first place. So they are simply punishing foolish behavior and making money on people doing things poorly.
I've been told not to worry about that side of things re licensing, otherwise totally agree. That's the licensing team's problem. I've pretty much been told (not that I understand or need to understand) that we are at a certain partnership level with Microsoft and they allow us to use whatever we want, as long as we stay at that level. So, they have said if you want 20 x Windows Server Datacenter licenses or 50, that's fine.
That team could be wrong, but it's not my problem.
That makes a little more sense. They've negotiated a deal with Microsoft from the sounds of it.
Yeah, I'll never know that side as it'll be a specialised team here. Even going from 20 blades each with DC licensing down to 3 R940s would be less overall.
Any other reason than licensing?
I've reached out to Starwind too to see what they could do for us.