RAID 5 URE Clarity Question
-
@tim_g said in RAID 5 URE Clarity Question:
@scottalanmiller said in RAID 5 URE Clarity Question:
@tim_g said in RAID 5 URE Clarity Question:
@scottalanmiller said in RAID 5 URE Clarity Question:
So it is 2TB, from every working drive in the array (4),
To avoid confusion, do you mean (5), for 10TB total? Because there are 6 total, one went bad, and 5 working ones are left?
No, because URE risk only matters when two drives are lost in RAID 6. If you had five drives, you have no URE risk.
I'm talking about 6x 2TB drives in a RAID 5. One of those drives goes bad, so you hot-swap it out with a good one and the rebuild starts.
I'm not asking or saying anything at all about RAID 6.
Whoops.
In that case you need 500% of a single drive. So the failure domain is 10TB, not 8TB. Sorry, got confused. You need the full capacity of all five remaining drives to restore the one that has been lost.
-
Okay, that's what I thought. I wanted to make sure, or I'd be confused again.
-
Sorry about the RAID 6 confusion. Everything referencing 8TB or 400% was me thinking this was six drives in RAID 6 and losing two, instead of six disks in RAID 5 losing one.
-
@scottalanmiller said in RAID 5 URE Clarity Question:
@tim_g said in RAID 5 URE Clarity Question:
@scottalanmiller said in RAID 5 URE Clarity Question:
@tim_g said in RAID 5 URE Clarity Question:
@scottalanmiller said in RAID 5 URE Clarity Question:
@tim_g said in RAID 5 URE Clarity Question:
So isn't drive D only needed for the 400GB it contains of drive E to help rebuild it?
No, D doesn't contain ANYTHING of drive E. That's likely the root of confusion. At no point in parity RAID does any drive contain the contents of any other drive. That's mirroring, and mirroring doesn't have this risk at all.
That's not how I mean it... it contains 400GB of parity data that is used to help reconstruct the data in drive E, doesn't it?
No, it contains 2TB of parity data, every block of which is necessary for reconstructing the lost drive(s).
Oh I see... I had it wrong the whole time.
I figured that out. So it is 2TB, from every working drive in the array (4), for 8TB total. Which gives us somewhere around a 60% chance of hitting a URE. That's because 12TB is an average, not a guarantee. If it was exactly every 12TB, it would be a 67% chance of loss.
Okay yes. But a URE happens on a single drive. And the rate of a URE happening on a single drive is one per 10^14 bits read (roughly 12TB). 2TB of reads is only 16.6% of 12TB. So I still don't see where you get your 60-67% chance from.
-
Or I mean 2TB is only 20% of 10TB... so not seeing the 60-67% you come up with.
-
@tim_g said in RAID 5 URE Clarity Question:
Or I mean 2TB is only 20% of 10TB... so not seeing the 60-67% you come up with.
Because it's cumulative. It's X% per drive you're reading from.
Going back to the 6TB of data in the 6-disk array: one drive dies, the remaining 5 drives each have a 10% chance, so you have 5 * 10% = 50% total chance.
-
@tim_g said in RAID 5 URE Clarity Question:
Okay yes. But a URE happens on a single drive. And the rate of a URE happening on a single drive is one per 10^14 bits read (roughly 12TB). 2TB of reads is only 16.6% of 12TB. So I still don't see where you get your 60-67% chance from.
Why do you keep mentioning single drives when the risk is all the drives together? Yes, the actual URE might occur on any one of the drives, but each has an equal risk in any given operation. So the risk domain is 10TB, or 60%. I can't figure out why you keep mentioning a single drive and tying that to the risk domain. There is more than one drive here, and all of them are 100% necessary.
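To put rough numbers on the single-drive versus whole-array point, here is a minimal sketch. It assumes the spec-sheet rate of one URE per 10^14 bits read, decimal terabytes, and that every bit read is an independent chance of failure. Those are simplifying assumptions, not measured drive behaviour, and the function name is made up for illustration.

```python
import math

URE_RATE_PER_BIT = 1e-14  # assumed spec-sheet rate: one URE per 10^14 bits read
BITS_PER_TB = 8e12        # decimal terabytes (10^12 bytes * 8 bits)

def rebuild_ure_probability(drives_read, tb_per_drive, rate=URE_RATE_PER_BIT):
    """Chance of at least one URE while reading every surviving drive in full."""
    bits_read = drives_read * tb_per_drive * BITS_PER_TB
    # P(at least one URE) = 1 - (1 - rate)^bits_read, computed in a
    # numerically stable way because rate is tiny and bits_read is huge.
    return -math.expm1(bits_read * math.log1p(-rate))

single = rebuild_ure_probability(1, 2)  # one 2TB drive read end to end
whole = rebuild_ure_probability(5, 2)   # all five surviving 2TB drives

print(f"one 2TB drive:   {single:.1%}")
print(f"five 2TB drives: {whole:.1%}")
```

Under these assumptions a single drive comes out at about 15% and the full 10TB read at about 55%, which is in the same ballpark as the rough 60% quoted above.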
-
@tim_g said in RAID 5 URE Clarity Question:
Or I mean 2TB is only 20% of 10TB... so not seeing the 60-67% you come up with.
Right, and 20% x 5 is?
-
@dashrender said in RAID 5 URE Clarity Question:
Because it's cumulative. It's X% per drive you're reading from.
Going back to the 6TB of data in the 6-disk array: one drive dies, the remaining 5 drives each have a 10% chance, so you have 5 * 10% = 50% total chance.
That's not actually how the math works. You started from the correct 50% number, but each individual drive doesn't have a 10% chance. It's higher than 10% individually, because independent per-drive risks combine as 1 - (1 - p)^5 rather than by simple addition. Risk math is funny. The 50% / 6TB number is handy because it is the inflection point where you don't have to do fancy math.
The easy way to think of it is that even 1000TB doesn't come to 100% risk (but 99.99999%) and nothing ever hits 0%. But 50% is the magic "top of the bell curve" spot.
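A quick sketch of why the per-drive share is higher than 10%, assuming the five surviving drives fail independently and with equal probability. The helper names are hypothetical and only there to illustrate the arithmetic.

```python
def combined_probability(per_drive, drives):
    """Chance that at least one of `drives` independent drives hits a URE."""
    return 1 - (1 - per_drive) ** drives

def per_drive_probability(total, drives):
    """Per-drive chance that yields `total` combined risk across `drives` drives."""
    return 1 - (1 - total) ** (1 / drives)

# Five drives at 10% each do not add up to 50%.
print(f"5 drives at 10% each: {combined_probability(0.10, 5):.1%}")

# Working backwards from a 50% total gives a per-drive share above 10%.
print(f"per-drive share of a 50% total: {per_drive_probability(0.50, 5):.1%}")
```

With these assumptions, 5 x 10% combines to roughly 41%, and a 50% total corresponds to roughly 13% per drive.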
-
Think of it like dice. Let's say you have six dice, and one fails (lol). Now you have five dice left. You have to roll them all. If any of them rolls a 1, you lose. When you roll five dice, each with six sides, and any of them rolling a "1" causes total loss, what are the chances of hitting a 1 on that five dice roll? Pretty high. Not super high, no one would be shocked if you got lucky and didn't roll a single one, but no one would be surprised that you rolled one, either.
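The dice version can be worked out exactly. This is just the analogy above written as arithmetic, with fair six-sided dice and a loss on any 1. It is not a model of real drives.

```python
from fractions import Fraction

DICE = 5   # five surviving "drives"
SIDES = 6  # fair six-sided dice; rolling a 1 means total loss

# The only way to survive is for every die to avoid the 1.
p_no_ones = Fraction(SIDES - 1, SIDES) ** DICE
p_at_least_one = 1 - p_no_ones

print(f"chance of no 1s:         {p_no_ones} = {float(p_no_ones):.1%}")
print(f"chance of at least one 1: {p_at_least_one} = {float(p_at_least_one):.1%}")
```

That works out to a little under 60% for at least one 1, noticeably less than the 5/6 you would get by simply adding up one-in-six five times.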
-
@scottalanmiller said in RAID 5 URE Clarity Question:
Right, and 20% x 5 is?
I see. I didn't understand that it was cumulative across each individual drive's URE rate.
Thanks for helping me to clear everything up.
-
@tim_g
Instead of rolling six sided dice:
During a rebuild after a failed drive in a 6-drive RAID 5 array, you have 5 dice with 10^14 sides each. One of the sides on each of these dice has a skull and crossbones. You roll all 5 of these dice thousands of times a second until the rebuild is complete. It only takes one of those dice, on any roll during the rebuild, stopping on skull and crossbones. Skull and crossbones won't come up right away, but given enough rolls it will. The number of dice rolls needed to rebuild is a function of array size and rebuild rate. The rebuild rate is the same for the RAID 5 set in question (whatever it is, it is the same whether the array is 1TB or 1000TB). Higher array size = more chance of skull and crossbones. The relationship isn't linear, but the more rolls you have to make, the higher the chance that at least one of them lands on skull and crossbones. Now I am going to go design a DnD ruleset with 10^14 sided dice.
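Here is a rough sketch of the 10^14-sided dice idea, to show that the relationship between data read and risk is not linear. It assumes one roll per bit read, one skull face per die, and independent rolls. The function name and the sizes in the loop are made up for illustration.

```python
import math

SIDES = 10**14             # one skull-and-crossbones face per die
BITS_PER_TB = 8 * 10**12   # decimal terabytes

def chance_of_skull(tb_read):
    """Chance that at least one roll (one per bit read) lands on the skull."""
    rolls = tb_read * BITS_PER_TB
    # 1 - (1 - 1/SIDES)^rolls, written to stay accurate for tiny probabilities.
    return -math.expm1(rolls * math.log1p(-1 / SIDES))

for tb in (1, 10, 100, 1000):
    print(f"{tb:>5} TB read: {chance_of_skull(tb):.4%}")
```

Ten times the data does not mean ten times the risk: the chance climbs quickly at first and then flattens out as it approaches 100%.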
-
@momurda said in RAID 5 URE Clarity Question:
Now I am going to go design a DnD ruleset with 10^14 sided dice.
That's just mean!
@Tim_G I think you begin to understand why URE is so misunderstood.
-
Yeah I do see.
I experienced 2 of them in almost a single week. Though they were just for testing hardware and didn't contain any real data so no losses... They were very old drives too so it was expected after forcing a rebuild.
One was a 5TB raw RAID 5 (5x 1TB drives); the other was 1TB-something raw, a bunch of old 15k 300GB SAS drives.
Actually, the 5TB one got a URE. On the other one, a 2nd drive failed, which was not a URE.
-
Actually, it is unknown whether UREs go up over time. Likely they do, but the statistics are only an average rate and don't state when, or what variables contribute to, higher or lower rates.