Responding to "This BS called URE" from Synology Forums
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
Stop worrying about someone's misunderstanding of simple mathematics and just start using the devices that you have.
He ends with "don't worry about being IT professionals, don't bother to protect your data, just trust your vendors to magically provide protection that you didn't ask for, pay for, nor did they claim to give."
It's super weird to say to trust the devices here, when the device makers are the ones warning us of the risks!
-
In the second response, Charles Hooper references my paper on the same:
Roadkill401,
Is this the issue that you are describing?
http://www.smbitjournal.com/2012/05/when-no-redundancy-is-more-reliable/
"What happens that scares us during a RAID 5 resilver operation is that an unrecoverable read error (URE) can occur. When it does the resilver operation halts and the array is left in a useless state – all data on the array is lost. On common SATA drives the rate of URE is 10^14, or once every twelve terabytes of read operations. That means that a six terabyte array being resilvered has a roughly fifty percent chance of hitting a URE and failing."
I have a degree in mathematics - but I have been focused on computer technology for roughly the last 20 years (so my mathematics skills are a bit rusty).
I believe that you are correct to a degree. In the above quoted example, there is not a roughly 50 percent chance of hitting a URE and having the array fail during the rebuild (resilver). Just as it is possible to roll a six sided die 10 times and never have the number six come up on top - it is a problem of probability, not straight addition and division. Also keep in mind that a drive's actual URE statistic does not remain constant through the life of the drive - the actual URE statistic decays as the drive ages.
Let's use a simple example that I have posted on the Synology forums before. Consider a four drive RAID 5 (SHR) array composed of 2TB drives. When one drive fails, that RAID 5 array has roughly 48,000,000,000,000 data bits that must be read successfully without a URE for the array to rebuild successfully when the failed drive is replaced. Using just the URE statistic provided by drive manufacturers, drives in this RAID 5 array with a one URE in 10^14 rating have a roughly 38.1% chance of failing to successfully rebuild when the failed drive is replaced. Here is the equation:
(1 - (99,999,999,999,999 / 100,000,000,000,000) ^ 48,000,000,000,000) = 0.380979164
As you stated, drives are read a sector at a time, not a bit at a time. Most drives are now offered with 4KB sector sizes, rather than the older 512 byte sector size, so the drives with a one URE in 10^14 bit rating actually have a one URE in 3,051,757,813 4KB sector rating. In the same four drive RAID 5 array, there are roughly 1,464,843,750 4KB sectors in the non-failed drives. Again there is a roughly 38.1% chance of failing to successfully rebuild when the failed drive is replaced. Here is the equation:
(1 - (3,051,757,812 / 3,051,757,813) ^ 1,464,843,750) = 0.381216604
For comparison, a four drive RAID 10 array composed of 2TB drives has a roughly 14.8% chance of failure during the rebuild:
(1 - (99,999,999,999,999 / 100,000,000,000,000) ^ 16,000,000,000,000) = 14.77%
For arrays with a larger number of drives the difference between RAID 10 and RAID 5 (SHR) becomes even more significant because only a single other drive in a RAID 10 array must be fully read error free, while in a RAID 5 array all other drives in the array must be fully read error free.
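For anyone who wants to reproduce the figures quoted above, here is a minimal Python sketch. It assumes the manufacturer rating of one URE per 10^14 bits applies independently to every bit read, which is the same assumption the quoted equations make. The function name `rebuild_failure_probability` is made up for this illustration. It uses `log1p` and `expm1` so that a value as close to 1 as 99,999,999,999,999 / 100,000,000,000,000 is not rounded before being raised to a huge power; evaluated that way the results come out at about 38.1% and 14.8%, so the small differences in the later decimal places of the quoted numbers are possibly just rounding in whatever tool produced them.

```python
import math

def rebuild_failure_probability(bits_to_read: float, ure_rate: float = 1e-14) -> float:
    """Chance of at least one URE while reading `bits_to_read` bits.

    Computes 1 - (1 - ure_rate) ** bits_to_read in a numerically stable way.
    """
    return -math.expm1(bits_to_read * math.log1p(-ure_rate))

BITS_PER_2TB_DRIVE = 2 * 10**12 * 8  # 16,000,000,000,000 bits

# Four-drive RAID 5, one drive failed: all three surviving drives are read.
raid5 = rebuild_failure_probability(3 * BITS_PER_2TB_DRIVE)

# Four-drive RAID 10, one drive failed: only its mirror partner is read.
raid10 = rebuild_failure_probability(1 * BITS_PER_2TB_DRIVE)

print(f"RAID 5 rebuild reads 48e12 bits:  {raid5:.4f}")
print(f"RAID 10 rebuild reads 16e12 bits: {raid10:.4f}")
```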
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So this guy trying to claim he knows something about math and computers made a thread on the Synology forums seven years ago, and @Dashrender found it today. It seems like it never got any great responses, it is bad to leave this kind of misinformation out there, and doing these analyses is always good, so let's break it down. (Since moving from SW to ML, the amount of "correcting dumbassery and people trying to mislead others" has all but disappeared, so we don't get to do this much.)
Why not reply to the original post on Synology? It doesn't make sense to address it here.
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
I believe that you are correct to a degree. In the above quoted example, there is not a roughly 50 percent chance of hitting a URE and having the array fail during the rebuild (resilver). Just as it is possible to roll a six sided die 10 times and never have the number six come up on top - it is a problem of probability, not straight addition and division.
So yes and no. Let's start with the die example. There is the statistics, and there is the "chance". There is a "chance" that you will roll a single die a billion times and never get a six. Yes. Obviously. We all know that. But statistically, you will get it pretty quickly.
Quick stats math...
(5/6) * (5/6) * (5/6) * (5/6) * (5/6) * (5/6) = 15625 / 46656 = .33 chance of NOT having it happen in six rolls. So
1 - .33 = .67, or roughly a 67% chance of hitting the "die URE" of a six.
That's right, it's not lower than 50% in the dice example, it's higher, a lot higher, and that is with only six rolls. With ten rolls the chance of a six climbs to roughly 84%. Yes, there is still a chance, a decent one, that you won't hit it. But the chances are, if you roll a die ten times, that you will get a six. Very good chances.
The "roughly 50%" number was based on statistics math. It just happens to be that at around the 50% chance mark additive numbers and statistical numbers are pretty close. They diverge as you leave the top of the bell in either direction, but they are pretty much on top of each other right in the middle. So because the writers here likely don't know statistical math, all they can do is see that additive math would have gotten us into the same ballpark.
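The dice arithmetic above is easy to check exactly. This is just a small sketch using Python's `fractions` module; the helper name `chance_of_at_least_one_six` is invented for the example and a fair six-sided die is assumed.

```python
from fractions import Fraction

def chance_of_at_least_one_six(rolls: int) -> Fraction:
    """Exact probability of seeing at least one six in `rolls` fair rolls."""
    miss_every_time = Fraction(5, 6) ** rolls  # every roll lands on 1..5
    return 1 - miss_every_time

six_rolls = chance_of_at_least_one_six(6)
ten_rolls = chance_of_at_least_one_six(10)

print(f"6 rolls:  {six_rolls} = {float(six_rolls):.3f}")
print(f"10 rolls: {ten_rolls} = {float(ten_rolls):.3f}")
```

The same "one minus the chance of missing every time" shape is what the URE equations use, only with a far smaller per-trial probability and far more trials.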
-
@irj said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So this guy trying to claim he knows something about math and computers made a thread on the Synology forums seven years ago, and @Dashrender found it today. It seems like it never got any great responses, it is bad to leave this kind of misinformation out there, and doing these analyses is always good, so let's break it down. (Since moving from SW to ML, the amount of "correcting dumbassery and people trying to mislead others" has all but disappeared, so we don't get to do this much.)
Why not reply to the original post on Synology? It doesn't make sense to address it here.
I tried, they don't let you. Since they codified it in their archives, I wanted to make sure it was addressed somewhere, at least.
-
@irj said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So this guy trying to claim he knows something about math and computers made a thread on the Synology forums seven years ago, and @Dashrender found it today. It seems like it never got any great responses, it is bad to leave this kind of misinformation out there, and doing these analyses is always good, so let's break it down. (Since moving from SW to ML, the amount of "correcting dumbassery and people trying to mislead others" has all but disappeared, so we don't get to do this much.)
Why not reply to the original post on Synology? It doesn't make sense to address it here.
Trying again. I tried to put all this there but the "comment" buttons didn't do anything. Maybe they were having an issue. Let's see...
-
No luck, even when signed in the "comment" and "reply" fields appear to be disabled. Which makes sense, this is their legacy forum.
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
As you stated, drives are read a sector at a time, not a bit at a time. Most drives are now offered with 4KB sector sizes, rather than the older 512 byte sector size, so the drives with a one URE in 10^14 bit rating actually have a one URE in 3,051,757,813 4KB sector rating. In the same four drive RAID 5 array, there are roughly 1,464,843,750 4KB sectors in the non-failed drives. Again there is a roughly 38.1% chance of failing to successfully rebuild when the failed drive is replaced. Here is the equation:
(1 - (3,051,757,812 / 3,051,757,813) ^ 1,464,843,750) = 0.381216604
This is a little confusing. While bigger sectors do mean bigger potential failures, it does not change the failure rate overall because URE is measured in bit reads, not sector reads.
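To make that point concrete, here is a rough Python sketch that does the calculation both ways: once per bit, and once per 4KB sector with the per-sector rate derived from the per-bit rate. It assumes independent bit errors at one in 10^14, and the names (`p_any_error`, `per_sector`, and so on) are made up for the example.

```python
import math

URE_PER_BIT = 1e-14
BITS_PER_SECTOR = 4096 * 8  # one 4KB sector

def p_any_error(trials: int, p_each: float) -> float:
    """1 - (1 - p_each) ** trials, evaluated without losing precision."""
    return -math.expm1(trials * math.log1p(-p_each))

# A sector is bad if any one of its bits is bad.
per_sector = p_any_error(BITS_PER_SECTOR, URE_PER_BIT)

total_bits = 48 * 10**12                       # three surviving 2TB drives
total_sectors = total_bits // BITS_PER_SECTOR  # 1,464,843,750 sectors

by_bits = p_any_error(total_bits, URE_PER_BIT)
by_sectors = p_any_error(total_sectors, per_sector)

print(f"one bad sector in about {1 / per_sector:,.0f}")
print(f"counted by bits:    {by_bits:.6f}")
print(f"counted by sectors: {by_sectors:.6f}")
```

Both routes land on the same roughly 38.1%, because grouping the bits into sectors does not change how many bits have to be read.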
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@dashrender said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
Now apply this URE to your hard disk. They say that this magical 10^14 works out to 11.3 TB of information. So on a single 4tb hard drive, you would simply need to fill that drive and read the info back off it 4 times and it will give you an error.
If using single reset RR, yes. And tests of hard drives bears this out. So the math works in real world testing.
I want to confirm - you've seen situations where a 4 TB drive has been filled, then read back 4 times and it fails - regularly?
Absolutely, everyone has. It's so common even non-IT people are used to it.
I wasn't thinking.. of course, I probably have run into this on a single drive and didn't realize what the issue was - a single file failed.. not generally a huge deal.
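The "11.3 TB" and "twelve terabytes" figures that keep coming up are the same 10^14 bits expressed in different units. A quick Python sketch of the conversion, plus what filling a 4TB drive and reading it back four times works out to, assuming independent bit errors at exactly the rated one in 10^14 (variable names are illustrative only):

```python
import math

URE_RATE_BITS = 10**14  # rated: one unrecoverable read error per this many bits

bytes_per_ure = URE_RATE_BITS / 8
decimal_tb = bytes_per_ure / 10**12  # terabytes as drive vendors count them
binary_tib = bytes_per_ure / 2**40   # tebibytes as many operating systems report

# Fill a 4TB drive once, then read the whole thing back four times.
bits_read = 4 * (4 * 10**12 * 8)
expected_ures = bits_read / URE_RATE_BITS
p_at_least_one = -math.expm1(bits_read * math.log1p(-1 / URE_RATE_BITS))

print(f"10^14 bits = {decimal_tb:.1f} TB = {binary_tib:.2f} TiB")
print(f"expected UREs over four full reads: {expected_ures:.2f}")
print(f"chance of at least one URE: {p_at_least_one:.3f}")
```

Under those assumptions the expectation is a little over one URE across the four reads, which is a likely outcome rather than a guaranteed one.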
-
@dashrender said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@dashrender said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
Now apply this URE to your hard disk. They say that this magical 10^14 works out to 11.3 TB of information. So on a single 4tb hard drive, you would simply need to fill that drive and read the info back off it 4 times and it will give you an error.
If using single reset RR, yes. And tests of hard drives bears this out. So the math works in real world testing.
I want to confirm - you've seen situations where a 4 TB drive has been filled, then read back 4 times and it fails - regularly?
Absolutely, everyone has. It's so common even non-IT people are used to it.
I wasn't thinking.. of course, I probably have run into this on a single drive and didn't realize what the issue was - a single file failed.. not generally a huge deal.
Right, for most people, it happens most often in the used portion of a drive. Humans rarely read an entire drive full of data. If we image a drive, most of the space is never read back until after it has been overwritten again.
When we do, most files could be corrupt and we wouldn't care. I get this regularly with my video games because they take up so much space (many TB) and Steam just redownloads the corrupt files semi-automatically to fix the issue. Only the save file really matters, and those are generally tiny.
Image and video files are what affect normal users the most and because it is normally just seen as an artefact in the video or a smudge on the image people don't really care.
When it is a system file, we can normally repair it as it isn't a unique file. For normal users, it is amazing how little UREs actually matter.
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
For comparison, a four drive RAID 10 array composed of 2TB drives has a roughly 14.8% chance of failure during the rebuild:
(1 - (99,999,999,999,999 / 100,000,000,000,000) ^ 16,000,000,000,000) = 14.77%
This is confusing and wrong. RAID 10 doesn't rebuild. Only the underlying RAID 1 does. And RAID 1 doesn't have parity.
So the risk of hitting the URE is only of a 2TB space, not a 6TB space. The size of the overall RAID 10 isn't relevant like it is with a parity array, so all that info is a red herring. Also this assumes single mirror RAID 1, but with more mirrors we protect against URE. So we have to be specific. Not everyone with RAID 1 does only a single mirror (but most do, sure.)
Then there is the behaviour risk. Parity RAID is assumed to have to drop the array in case of an unprotected URE because it affects an unknown amount of data as the entire array is a "single corrupt file", but in a mirror, it can be sector copied as is and has the option to behave the same as a single drive does when there is a URE.
So nothing about the comparison is useful. The math shows the chances of the array hitting a protected URE during multiple reads, not an unprotected URE during a resilver like the RAID 5 example. Apples and oranges.
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@irj said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So this guy trying to claim he knows something about math and computers made a thread on the Synology forums seven years ago, and @Dashrender found it today. It seems like it never got any great responses, it is bad to leave this kind of misinformation out there, and doing these analyses is always good, so let's break it down. (Since moving from SW to ML, the amount of "correcting dumbassery and people trying to mislead others" has all but disappeared, so we don't get to do this much.)
Why not reply to the original post on Synology? It doesn't make sense to address it here.
Trying again. I tried to put all this there but the "comment" buttons didn't do anything. Maybe they were having an issue. Let's see...
Ah ok. I see
-
Roadkill401 states that he doesn't even know what the issue is that we are discussing, which exposes why he's giving us this bad advice. What is weird is he gives the advice to do something crazy reckless, THEN admits he doesn't even know what the risk is we are discussing.
But here is where I have the problem..
According to you, we are simply doomed by your calculation in reading any disks you are going to get a read error eventually and sooner than you really would hope for. Just hope that the error is in something unimportant like a jpeg file rather than something very important like your tax return software.
Based on your method of calculation, after reading 6tb of data from a hard disk, you have about a 50/50 chance that the data that you have read so far has a bit error inside of it. (1/10^14) * 4.8 * 10^13
The issue then is not that you could get an error, but how the Synology deals with these bit errors. Regardless of doing any rebuild you are going to see a read error that will not be detected by the RAID as every read of data that you take off the drive is not verified against any CRC calculation to determine if what you are reading is accurate to what the drive thought was written to it.
The premise is that when doing a RAID rebuild, that the process will stop on the occurrence of one of these read errors that WILL happen at some point in time of the first 11.3TB of data read off any of the disks. But why would this happen? Does the disk itself know that the data it just read was faulty and give an error to the Synology? Isn't that really a MTBF ?? Or is it just that when doing the CRC calculation to try and rebuild the missing block, that the calculation will result in a value that just is not possible so it will fail? But that doesn't make any sense either as all you are doing is for example, reading a bit that should say 10110000 and getting 10010000. A single bit error that will give you the wrong result but why would it actually stop anything.
So all you are really assured is that doing a rebuild, you are likely to get a bit error that will have a chance of changing some file at some point on the RAID disk. But the chances are about the same as you reading a file off the disk and getting a bit error and not knowing it, and then saving that now wrong file back to the disk.
I am perplexed then at what the issue really is ?
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
The premise is that when doing a RAID rebuild, that the process will stop on the occurrence of one of these read errors that WILL happen at some point in time of the first 11.3TB of data read off any of the disks. But why would this happen? Does the disk itself know that the data it just read was faulty and give an error to the Synology? Isn't that really a MTBF ?? Or is it just that when doing the CRC calculation to try and rebuild the missing block, that the calculation will result in a value that just is not possible so it will fail? But that doesn't make any sense either as all you are doing is for example, reading a bit that should say 10110000 and getting 10010000. A single bit error that will give you the wrong result but why would it actually stop anything.
So all you are really assured is that doing a rebuild, you are likely to get a bit error that will have a chance of changing some file at some point on the RAID disk. But the chances are about the same as you reading a file off the disk and getting a bit error and not knowing it, and then saving that now wrong file back to the disk.
I am perplexed then at what the issue really is ?
So there is "premise" and "what the issue really is."
First, it is not a premise, it is how MD RAID, and all enterprise class RAID, works. In parity RAID we don't know what the impact is because the RAID system has no knowledge of the data on top of it and the array acts like it is a file (it is actually a volume, but the difference is the same.) When an exposed URE is encountered, whatever scale the layer is that is affected, is lost. In the case of mirrored RAID or no RAID, it is a sector. One bit is bad, the sector is scrapped. In the case of parity RAID, the minimum size above it is the volume which is mapped to the array. So the entire array is lost because it is a single unit that cannot be safely calculated. This is just parity basics, it's not a premise.
So what the issue really is.... is what has been stated ad nauseam and he is ignoring... that an exposed URE on a parity array being rebuilt causes the array to be in an unsafe state and dropped. It isn't that the array makers want to lose all of that data, it is just the granularity at which it can no longer be trusted.
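To show why an exposed URE during a parity rebuild is different from one on a healthy array, here is a toy Python sketch of single-parity (XOR) reconstruction. It is only an illustration of the principle, not how MD RAID or any real controller is written; the function names and the use of `None` to stand in for a sector the drive reports as unreadable are assumptions made for the example.

```python
from functools import reduce
from typing import Optional, Sequence

def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving: Sequence[Optional[bytes]]) -> bytes:
    """Recompute the block of the failed drive from every surviving block.

    A None entry models a URE: the drive returned an error instead of data.
    With one drive already gone there is nothing left to recompute it from.
    """
    if any(block is None for block in surviving):
        raise IOError("URE on a surviving drive: stripe cannot be recomputed")
    return xor_blocks([block for block in surviving if block is not None])

# One stripe across a four-drive array: three data blocks plus parity.
d0, d1, d2 = b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"
parity = xor_blocks([d0, d1, d2])

# The drive holding d1 has failed; every other block reads back cleanly.
recovered = rebuild_missing([d0, d2, parity])
print(recovered == d1)

# Same failure, but the drive holding d2 now returns a URE for this stripe.
try:
    rebuild_missing([d0, None, parity])
except IOError as err:
    print(err)
```

On a healthy array the same URE would be "protected": the unreadable block could itself be recomputed from the others. It is only while a drive is already missing that the redundancy is used up.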
-
Charles Hooper reports the right data in the next one. We aren't doomed, no one ever said we were. Just learn your storage basics, learn basic math, don't be emotional, pick the RAID level that provides the appropriate level of protection. Basically repeating every RAID thread that ever discussed this, which RoadKill apparently missed before posting a rant.
According to me, we are not doomed. Pick the correct RAID level (RAID 10 or RAID 6) and enterprise class hard drives when required. I suggest avoiding RAID 5 when the total array size will be larger than roughly 2TB, maybe less than that. Hopefully, my response is a bit more clear this time.
-
Roadkill401 follows up with this total nonsense...
Put the whole part of RAID and enterprise class drives to one side. Those can be dealt with later. The side of the URE is that EVERY DRIVE REGARDLESS will at some point in time experience a ure error.
Then you must consider your point:
"In kernels prior to about 2.6.15, a read error would cause the same effect as a write error. In later kernels, a read-error will instead cause md to attempt a recovery by overwriting the bad block. i.e. it will find the correct data from elsewhere, write it over the block that failed, and then try to read it back again. If either the write or the re-read fail, md will treat the error the same way that a write error is treated, and will fail the whole device."
So in effect, the issue is not the drive that is causing the issue, but rather a defective programming of Linux. if the OS rather than simply flagging that file on the drive as being corrupt, would rather flag the whole drive, it comes across as a rather short sighted screwup.
But Linux is an OS that can be re-developed, so why don't they just fix the error to stop the carnage?? Sure, you can move to better enterprise drives if you want to minimize the chances of losing a file or block of files affected by the loss of that sector, but the approach they seem to have taken is to simply kill the whole lot.
BTW: according to the current pdf from WD, the Red Pro is < 1 in 10^15 http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800022.pdf
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So in effect, the issue is not the drive that is causing the issue, but rather a defective programming of Linux. if the OS rather than simply flagging that file on the drive as being corrupt, would rather flag the whole drive, it comes across as a rather short sighted screwup.
Um, what? First of all, why is Linux the issue and not every RAID implementation ever made? Nice to attack Linux but ignore hardware RAID, third party RAID, Sun, Oracle, Windows, etc. But that's not even the bad part.
How would a RAID controller flag a file? The RAID level doesn't even know what LVM is put on top of it, let alone what filesystem, let alone what file, let alone have a way to talk to the OS! This really shows that conceptually, he's not even aware of hard drive basics. He thinks it is all magic.
At the RAID level, this is just a bunch of bits, all zeros and ones. They don't have any meaning, at all. In theory you could write a system with that level of vertical intelligence through the storage stack but, if you did, it would be insanely limited and would not make any sense to look anything like what we use for storage today and none of our normal storage processes would make sense any longer. It's not a horrible idea, but it is so many worlds removed from where we are here and if it had real world utilization on any scale, it is safe to assume that the big players would have implemented it.
So the RAID array has neither a mechanism to alert the layers above it of specific issues, nor does the RAID array know what file is bad. Except, it does....
So in parity RAID, there is a single "file" or virtual file on top of the disks... it's the parity disk itself. It's a single file, for all intents and purposes, that consists of the entirety of the array. And that is the "file" that the array has to flag as corrupt. To the RAID array it is truly failing only a single file. To the OS, it is an array / block device that is lost.
So even going by what Roadkill wants, he still gets exactly what we have already.
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
if the OS rather than simply flagging that file on the drive as being corrupt, would rather flag the whole drive, it comes across as a rather short sighted screwup.
This is actually what it does. Except not the OS. The RAID controller (hardware or software) flags the file (array) as being corrupt, not any drive. Any drive(s) with a URE are flagged as being healthy.
If you were to divide up the drives into many arrays, and you hit an exposed URE, only the single array (file) in which the URE was found would be corrupt. The drives, and other arrays (files) on them would be just as healthy as ever.
It only comes across as a short sighted screw up if you don't realize that the "fix" takes us right back to where we already are.