Responding to "This BS called URE" from Synology Forums
-
@dashrender said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@dashrender said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
Now apply this URE to your hard disk. They say that this magical 10^14 works out to 11.3 TB of information. So on a single 4tb hard drive, you would simply need to fill that drive and read the info back off it 4 times and it will give you an error.
If using single reset RR, yes. And tests of hard drives bear this out. So the math works in real-world testing.
I want to confirm - you've seen situations where a 4 TB drive has been filled, then read back 4 times and it fails - regularly?
Absolutely, everyone has. It's so common even non-IT people are used to it.
I wasn't thinking.. of course, I probably have run into this on a single drive and didn't realize what the issue was - a single file failed.. not generally a huge deal.
Right, for most people, it happens most often in the used portion of a drive. Humans rarely read an entire drive full of data. If we image a drive, most of the space is never read back until after it has been overwritten again.
When we do, most files could be corrupt and we wouldn't care. I get this regularly with my video games because they take up so much space (many TB) and Steam just redownloads the corrupt files semi-automatically to fix the issue. Only the save file really matters, and those are generally tiny.
Image and video files are what affect normal users the most, and because it normally just shows up as an artefact in the video or a smudge on the image, people don't really care.
When it is a system file, we can normally repair it as it isn't a unique file. For normal users, it is amazing how little UREs actually matter.
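For anyone who wants to sanity-check the numbers being thrown around above, here is a quick back-of-the-envelope sketch in Python. It assumes the spec-sheet figure of one URE per 10^14 bits read and treats every bit read as an independent trial, which is a simplification of how drives really fail but is the same assumption the forum math uses; the helper names are mine, not anything from the thread.

```python
# Back-of-the-envelope URE math for a "1 per 10^14 bits read" spec sheet figure.
# Simplification: treats each bit read as an independent trial.
URE_RATE = 1e-14          # probability of a URE per bit read
BITS_PER_TB = 8e12        # 1 TB (decimal) = 8 x 10^12 bits

print(1e14 / BITS_PER_TB)            # 12.5  -> TB read per expected URE (decimal TB)
print(1e14 / 8 / 2**40)              # ~11.37 -> the "11.3 TB" figure is really TiB

def p_at_least_one_ure(tb_read):
    """Chance of hitting at least one URE while reading tb_read terabytes."""
    return 1 - (1 - URE_RATE) ** (tb_read * BITS_PER_TB)

print(f"{p_at_least_one_ure(4):.0%}")      # one full read of a 4 TB drive: ~27%
print(f"{p_at_least_one_ure(4 * 4):.0%}")  # four full reads (the claim above): ~72%
```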
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
For comparison, a four drive RAID 10 array composed of 2TB drives has a roughly 14.8% chance of failure during the rebuild:
(1 - (99,999,999,999,999 / 100,000,000,000,000) ^ 16,000,000,000,000) = 14.77%
This is confusing and wrong. RAID 10 doesn't rebuild. Only the underlying RAID 1 does. And RAID 1 doesn't have parity.
So the risk of hitting the URE is only of a 2TB space, not a 6TB space. The size of the overall RAID 10 isn't relevant like it is with a parity array, so all that info is a red herring. Also, this assumes single-mirror RAID 1, but with more mirrors we protect against URE. So we have to be specific. Not everyone with RAID 1 does only a single mirror (but most do, sure.)
Then there is the behaviour risk. Parity RAID is assumed to have to drop the array in the case of an unprotected URE because it affects an unknown amount of data (the entire array is one "single corrupt file"), but a mirror can be sector-copied as-is and has the option to behave the same way a single drive does when there is a URE.
So nothing about the comparison is useful. The math shows the chances of the array hitting a protected URE during multiple reads, not an unprotected URE during a resilver like the RAID 5 example. Apples and oranges.
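To make the apples-and-oranges point concrete, here is a rough sketch of the same kind of math applied to what each rebuild actually has to read. It assumes a single-mirror RAID 1 pair under the RAID 10 and the consumer 1-in-10^14 URE rate, and it only models the chance of hitting a URE, not what the array does about it, which is the behavioural difference described above; the function name is mine.

```python
# Chance of hitting at least one URE while reading N terabytes at 1 per 10^14 bits.
def p_ure(tb_read, rate=1e-14):
    return 1 - (1 - rate) ** (tb_read * 8e12)

# RAID 10 of 4 x 2 TB: a rebuild only re-reads the surviving mirror partner (2 TB).
print(f"mirror rebuild, 2 TB read: {p_ure(2):.1%}")   # ~14.8%, the figure quoted above

# RAID 5 of 4 x 2 TB: a rebuild must read all three surviving members (6 TB).
print(f"parity rebuild, 6 TB read: {p_ure(6):.1%}")   # ~38%

# Same formula, but the outcomes differ: the mirror hit costs one sector (or is
# repaired), while the degraded parity hit is an exposed URE for the whole array.
```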
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
@irj said in Responding to "This BS called URE" from Synology Forums:
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So this guy, trying to claim he knows something about math and computers, made a thread on the Synology forums seven years ago, and @Dashrender found it today. It seems like it never got any great responses, it is bad to leave this kind of misinformation out there, and doing these analyses is always good, so let's break it down. (Since moving from SW to ML, the amount of "correcting dumbassery and people trying to mislead others" has all but disappeared, so we don't get to do this much.)
Why not reply to the original post on Synology? It doesn't make sense to address it here.
Trying again. I tried to put all this there but the "comment" buttons didn't do anything. Maybe they were having an issue. Let's see...
Ah ok. I see
-
Roadkill401 states that he doesn't even know what the issue is that we are discussing, which exposes why he's giving us this bad advice. What is weird is that he gives the advice to do something crazy reckless, THEN admits he doesn't even know what the risk is that we are discussing.
But here is where I have the problem..
According to you, we are simply doomed: by your calculation, in reading any disks you are going to get a read error eventually, and sooner than you really would hope for. Just hope that the error is in something unimportant like a jpeg file rather than something very important like your tax return software.
Based on your method of calculation, after reading 6 TB of data from a hard disk, you have about a 50/50 chance that the data that you have read so far has a bit error inside of it: (1/10^14) × 4.8×10^13 ≈ 0.48.
The issue then is not that you could get an error, but how the Synology deals with these bit errors. Regardless of doing any rebuild you are going to see a read error that will not be detected by the RAID as every read of data that you take off the drive is not verified against any CRC calculation to determine if what you are reading is accurate to what the drive thought was written to it.
The premise is that when doing a RAID rebuild, that the process will stop on the occurrence of one of these read errors that WILL happen at some point in time of the first 11.3TB of data read off any of the disks. But why would this happen? Does the disk itself know that the data it just read was faulty and give an error to the Synology? Isn't that really a MTBF ?? Or is it just that when doing the CRC calculation to try and rebuild the missing block, that the calculation will result in a value that just is not possible so it will fail? But that doesn't make any sense either as all you are doing is for example, reading a bit that should say 10110000 and getting 10010000. A single bit error that will give you the wrong result but why would it actually stop anything.
So all you are really assured is that doing a rebuild, you are likely to get a bit error that will have a chance of changing some file at some point on the RAID disk. But the chances are about the same as you reading a file off the disk and getting a bit error and not knowing it, and then saving that now wrong file back to the disk.
I am perplexed then at what the issue really is?
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
The premise is that when doing a RAID rebuild, that the process will stop on the occurrence of one of these read errors that WILL happen at some point in time of the first 11.3TB of data read off any of the disks. But why would this happen? Does the disk itself know that the data it just read was faulty and give an error to the Synology? Isn't that really a MTBF ?? Or is it just that when doing the CRC calculation to try and rebuild the missing block, that the calculation will result in a value that just is not possible so it will fail? But that doesn't make any sense either as all you are doing is for example, reading a bit that should say 10110000 and getting 10010000. A single bit error that will give you the wrong result but why would it actually stop anything.
So all you are really assured is that doing a rebuild, you are likely to get a bit error that will have a chance of changing some file at some point on the RAID disk. But the chances are about the same as you reading a file off the disk and getting a bit error and not knowing it, and then saving that now wrong file back to the disk.
I am perplexed then at what the issue really is?
So there is "premise" and "what the issue really is."
First, it is not a premise, it is how MD RAID, and all enterprise-class RAID, works. In parity RAID we don't know what the impact is because the RAID system has no knowledge of the data on top of it; the array acts like it is a file (it is actually a volume, but the distinction doesn't matter here.) When an exposed URE is encountered, whatever unit is affected at that layer is lost. In the case of mirrored RAID or no RAID, it is a sector. One bit is bad, the sector is scrapped. In the case of parity RAID, the minimum size above it is the volume, which is mapped to the array. So the entire array is lost because it is a single unit that cannot be safely calculated. This is just parity basics, it's not a premise.
So what the issue really is.... is what has been stated ad nauseam and what he is ignoring: that an exposed URE on a parity array being rebuilt causes the array to be in an unsafe state and dropped. It isn't that the array makers want to lose all of that data, it is just the granularity at which it can no longer be trusted.
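A toy illustration of why the parity case is all-or-nothing: single-parity reconstruction is just an XOR across the surviving members, so one unreadable sector on any survivor means the missing member's block for that stripe simply cannot be calculated. This is a deliberately simplified sketch with made-up block values, not how any real controller is implemented.

```python
# Toy single-parity (RAID 5 style) stripe: parity = XOR of the data blocks.
from functools import reduce
from operator import xor

d0, d1, d2 = 0b10110000, 0b01101100, 0b11000011
parity = reduce(xor, [d0, d1, d2])

# The drive holding d1 has died; rebuild its block from the survivors plus parity.
assert reduce(xor, [d0, d2, parity]) == d1   # works, as long as every survivor reads back

# Now a survivor (d2) returns a URE. The drive reports "unreadable" rather than
# silently handing back a flipped bit, so there is simply nothing to XOR with.
d2_read = None                               # simulated URE on the surviving member
if d2_read is None:
    # The missing block for this stripe cannot be computed. The RAID layer has no
    # idea what data lives there, so parity RAID distrusts the whole array, while
    # a mirror in the same situation only has to scrap that one sector.
    print("exposed URE during degraded rebuild: stripe unrecoverable")
```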
-
Charles Hooper reports the right data in the next one. We aren't doomed, no one ever said we were. Just learn your storage basics, learn basic math, don't be emotional, and pick the RAID level that provides the appropriate level of protection. Basically repeating every RAID thread that ever discussed this, but apparently Roadkill missed them before posting a rant.
According to me, we are not doomed. Pick the correct RAID level (RAID 10 or RAID 6) and enterprise class hard drives when required. I suggest avoiding RAID 5 when the total array size will be larger than roughly 2TB, maybe less than that. Hopefully, my response is a bit more clear this time.
-
Roadkill401 follows up with this total nonsense...
Put the whole part of RAID and enterprise class drives to one side. Those can be dealt with later. The side of the URE is that EVERY DRIVE REGARDLESS will at some point in time experience a URE.
Then you must consider your point:
"In kernels prior to about 2.6.15, a read error would cause the same effect as a write error. In later kernels, a read-error will instead cause md to attempt a recovery by overwriting the bad block. i.e. it will find the correct data from elsewhere, write it over the block that failed, and then try to read it back again. If either the write or the re-read fail, md will treat the error the same way that a write error is treated, and will fail the whole device."
So in effect, the issue is not the drive that is causing the issue, but rather a defective programming of Linux. if the OS rather than simply flagging that file on the drive as being corrupt, would rather flag the whole drive, it comes across as a rather short sighted screwup.
But Linux is an OS that can be re-developed, so why don't they just fix the error to stop the carnage?? Sure, you can move to better enterprise drives if you want to minimize the chances of losing a file or block of files affected by the loss of that sector, but the approach they seem to have taken is to simply kill the whole lot.
BTW: according to the current pdf from WD, the Red Pro is < 1 in 10^15 http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-800022.pdf
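Paraphrasing the md(4) man page behaviour quoted in that post as a self-contained sketch: the Member class and the function below are invented names for illustration only, this is a rough model of the described logic, not the actual kernel code.

```python
# A plain-Python paraphrase of the md(4) read-error handling quoted above
# (post-2.6.15 behaviour). Member and handle_read_error are invented for the
# sketch; this is not kernel logic.
class Member:
    def __init__(self, blocks):
        self.blocks = dict(blocks)    # block number -> value; None simulates a URE
        self.failed = False
    def read(self, block):
        return self.blocks.get(block)
    def write(self, block, value):
        self.blocks[block] = value
        return True                   # assume the rewrite itself succeeds here

def handle_read_error(members, bad_member, block):
    survivors = [m for m in members if m is not bad_member and not m.failed]
    if not survivors:
        return "exposed URE: no redundancy left, array dropped"
    good = survivors[0].read(block)   # mirror copy (or a parity calculation in RAID 5/6)
    if good is None:
        return "exposed URE: the redundant copy is unreadable too, array dropped"
    if not bad_member.write(block, good) or bad_member.read(block) != good:
        bad_member.failed = True      # treated like a write error: fail the whole device
        return "member failed, array now degraded"
    return "protected URE: block rewritten from redundancy, array stays healthy"

# Two-way mirror; block 7 hits a URE on the first member and is repaired in place.
a, b = Member({7: None}), Member({7: 0b10110000})
print(handle_read_error([a, b], a, block=7))
```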
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
So in effect, the issue is not the drive that is causing the issue, but rather a defective programming of Linux. if the OS rather than simply flagging that file on the drive as being corrupt, would rather flag the whole drive, it comes across as a rather short sighted screwup.
Um, what? First of all, why is Linux the issue and not every RAID implementation ever made? Nice to attack Linux but ignore hardware RAID, third party RAID, Sun, Oracle, Windows, etc. But that's not even the bad part.
How would a RAID controller flag a file? The RAID layer doesn't even know what LVM is put on top of it, let alone what filesystem, let alone what file, let alone have a way to talk to the OS! This really shows that conceptually, he's not even aware of hard drive basics. He thinks it is all magic.
At the RAID level, this is just a bunch of bits, all zeros and ones. They don't have any meaning at all. In theory you could write a system with that level of vertical intelligence through the storage stack, but if you did, it would be insanely limited, it would not look anything like what we use for storage today, and none of our normal storage processes would make sense any longer. It's not a horrible idea, but it is so many worlds removed from where we are that, if it had real-world utilization at any scale, it is safe to assume that the big players would have implemented it already.
So the RAID array has neither a mechanism to alert the layers above it of specific issues, nor does the RAID array know what file is bad. Except, it does....
So in parity RAID, there is a single "file" or virtual file on top of the disks... the array itself. It's a single file, for all intents and purposes, that consists of the entirety of the array. And that is the "file" that the array has to flag as corrupt. To the RAID array it is truly failing only a single file. To the OS, it is an array / block device that is lost.
So even going by what Roadkill wants, he still gets exactly what we have already.
-
@scottalanmiller said in Responding to "This BS called URE" from Synology Forums:
if the OS rather than simply flagging that file on the drive as being corrupt, would rather flag the whole drive, it comes across as a rather short sighted screwup.
This is actually what it does. Except it is not the OS. The RAID controller (hardware or software) flags the file (the array) as being corrupt, not any drive. Any drive(s) with a URE are still flagged as healthy.
If you were to divide up the drives into many arrays, and you hit an exposed URE, only the single array (file) in which the URE was found would be corrupt. The drives, and other arrays (files) on them would be just as healthy as ever.
It only comes across as a short sighted screw up if you don't realize that the "fix" takes us right back to where we already are.
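To restate that last point as a sketch (a hypothetical layout, not any particular controller's behaviour): carve the same four disks into several smaller arrays, and an exposed URE during a degraded rebuild takes out only the array it landed in, while every physical drive stays marked healthy.

```python
# Hypothetical layout: four disks, each split into three partitions, with each
# set of matching partitions assembled into its own parity array.
disks = {f"disk{i}": "healthy" for i in range(4)}
arrays = {f"array{j}": {"members": [f"disk{i}p{j}" for i in range(4)], "state": "online"}
          for j in range(3)}

# An exposed URE hits while array1 is rebuilding degraded.
arrays["array1"]["state"] = "failed (exposed URE during rebuild)"

# Only that one "file" (array) gets flagged; the other arrays and all four drives
# are as healthy as ever.
print(disks)
print({name: a["state"] for name, a in arrays.items()})
```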