Synology one bad sector crashes whole volume RAID0



  • I plugged in an external USB drive to act as a backup for the NAS. Using the app Hyper Backup it was in the middle of doing its first full backup when suddenly the Synology DS216+ crashes, beeping, blinking orange lights.

    I log in and see all these:

    [Image: bad sector.png]

    The NAS has 2 x 4TB WD Red drives. I'm running them striped since I wanted more space and perhaps speed.

    Anyway, when I go in Storage Manager to the HDD/SSD section, it says disk 2 is crashed and has a bad sector count of 6 now.

    The status light is blinking orange. The Disk 2 light is solid orange. About every 5 minutes a splash of new "Bad sector was found" messages pops up.

    Neither drive has had a bad sector up until now. SMART has always been ok.

    Of all the space available, over 7TB, I've only got about 1.2TB on it so far. No problems until doing this backup to the USB drive.

    What I don't get is why something simple like a bad sector is causing the entire volume to crash and everything to halt? Bad sectors happen, why isn't it just marked and move on? If the bad sector was found where a file is stored, why can't it just report that file as corrupt, mark the bad sector and move on? This doesn't make sense, that one little sector causes the entire volume to crash!

    Lastly, it doesn't tell me what to do. It's just like, hmm, back up everything and buy new drives. Over a bad sector!! Try to back up everything, when it's crashed and doesn't let me open any shares? I can't tell if it's doing something in the background, like trying to recover, or if it's just stopped, because it keeps popping up messages every 5 minutes, so maybe it's doing something?

    What the heck do I do then? Turn it off and on and hope it recovers? I don't know.



  • @guyinpv The big question is how many bad sectors? I'd be afraid the errors mean one of the drives ran out of spare sectors to swap out.



  • @guyinpv said in Synology one bad sector crashes whole volume RAID0:

    I'm running them striped since I wanted more space and perhaps speed.

    Disks fail, that is what they do. You're asking for it when running RAID0.

    But now you just buy new drives and restore your backup.

    WD has some test program that can verify that the disk is broken, then just send it in for warranty replacement - if it's still under warranty. WD Red had 3 years I believe but can be extended to 5 years for a small fee.



  • If you need to save what's on the disk you need to:

    • connect the first drive to a Linux computer (don't mount it) and make a dd image copy of the entire disk.
      Use options conv=noerror,sync so dd keeps reading even after errors.
      Expect the cloning to take a long time if you have many bad blocks.
    • do the same with the second disk.
    • mount the cloned disks/images and run fsck on them or use recovery software
    • recover or copy what is possible and copy the data to where you want it.

    Don't do anything else with the failed disks other than clone them. That's data recovery 101.
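
    A rough illustration of what the conv=noerror,sync options do during the clone, written as a small Python sketch rather than the real dd: keep reading past errors and pad the unreadable block with zeros so every later byte stays at the right offset in the image. The FlakyDisk class and the function name are made up for the demo, and the block size is tiny only to keep the output readable.

```python
import io


class FlakyDisk(io.BytesIO):
    """Hypothetical stand-in for a failing drive: reads at bad offsets raise OSError."""

    def __init__(self, data, bad_offsets):
        super().__init__(data)
        self.bad_offsets = set(bad_offsets)

    def read(self, size=-1):
        if self.tell() in self.bad_offsets:
            raise OSError("simulated unreadable sector")
        return super().read(size)


def clone_with_zero_fill(src, dst, total_size, block_size=4096):
    """Copy src to dst block by block; on a read error, write zeros and carry on.

    Returns the byte offsets of the blocks that could not be read.
    """
    bad = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            data = b""          # nothing usable came back from this block
            bad.append(offset)
        # Pad failed or short reads with zeros so offsets stay aligned in the image.
        dst.write(data.ljust(want, b"\x00"))
        offset += want
    return bad


disk = FlakyDisk(b"AAAA" + b"BBBB" + b"CCCC", bad_offsets={4})
image = io.BytesIO()
bad = clone_with_zero_fill(disk, image, total_size=12, block_size=4)
print("unreadable offsets:", bad)
print("image contents:", image.getvalue())
```

    The point of the zero padding is that the image stays the same size and layout as the original disk, which is what lets fsck or recovery software work on the copy afterwards.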



  • IMO one of the reasons one would have a Synology is for the support. Have you called them?



  • Failed RAID0 = no data. Simple as that..

    Always possible to get super lucky, but don't count on it.



  • After making the images as @Pete-S suggests, you can try running Spinrite on one or both drives and see if it comes back enough to finish your backup.



  • Interestingly, I force turned off the Synology, pulled the drives and did a quick canned air cleanup.

    Turned back on and it came to life. Looking at the drive screen, the count of bad sectors is now at 38.
    This makes no sense, jumping from 0 bad to 38 out of the blue.

    I know enough about drives to know they can recover from bad sectors and avoid those areas of the disk. It's weird to me that it would bring down the entire volume and crash the whole thing over a bad sector.

    Was reading this thread and it seems some people think it could be a DSM bug: https://forum.synology.com/enu/viewtopic.php?f=19&t=93339&start=30

    I did happen to do a DSM upgrade this morning but all was well until I ran that full backup onto the USB drive. It crashed somewhere in the middle of that backup.



  • Because it attempted to read every sector. This is not a surprise.



  • If a sector really can't be read, even after many tries, that will cause a cascade of issues on a RAID 0 volume, because the sectors on both drives have to work together to recreate the data.
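
    A toy model of that, assuming simple round-robin striping with a made-up chunk size (this is not how DSM or Linux md actually lays data out on disk, just the idea): every other chunk of the volume lives on the other drive, and there is no second copy to fall back on when one chunk is unreadable.

```python
def stripe(data, n_disks, chunk_size):
    """Split data into chunks and deal them round-robin across n_disks."""
    disks = [[] for _ in range(n_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        disks[chunk_index % n_disks].append(data[start:start + chunk_size])
    return disks


def read_back(disks, bad):
    """Reassemble the volume. `bad` is a set of (disk, position) chunks that cannot be read."""
    out = []
    n_chunks = sum(len(d) for d in disks)
    for chunk_index in range(n_chunks):
        disk = chunk_index % len(disks)
        pos = chunk_index // len(disks)
        if (disk, pos) in bad:
            # RAID 0 has no parity and no mirror, so there is nothing to rebuild from.
            raise OSError(f"unreadable chunk on disk {disk + 1}, no redundant copy")
        out.append(disks[disk][pos])
    return b"".join(out)


data = b"0123456789ABCDEF"
disks = stripe(data, n_disks=2, chunk_size=4)
print(disks)                      # chunks alternate between the two drives
print(read_back(disks, set()))    # healthy array: data comes back intact

try:
    read_back(disks, {(1, 0)})    # one bad chunk on disk 2
except OSError as err:
    print("read failed:", err)
```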



  • Is there some sort of jbod mode or something that is common for wanting a larger drive, giving up the performance of R0? Then, when a drive does fail, it only takes out that drive and not the whole shebang? Is that actually a thing in production use?



  • @donahue said in Synology one bad sector crashes whole volume RAID0:

    Is there some sort of jbod mode or something that is common for wanting a larger drive, giving up the performance of R0? Then, when a drive does fail, it only takes out that drive and not the whole shebang? Is that actually a thing in production use?

    RAID 0 should, like JBOD setups, really just be for ephemeral data, like caches.



  • @scottalanmiller said in Synology one bad sector crashes whole volume RAID0:

    @donahue said in Synology one bad sector crashes whole volume RAID0:

    Is there some sort of jbod mode or something that is common for wanting a larger drive, giving up the performance of R0? Then, when a drive does fail, it only takes out that drive and not the whole shebang? Is that actually a thing in production use?

    RAID 0 should, like JBOD setups, really just be for ephemeral data, like caches.

    that's not an answer to his question.



  • @dashrender said in Synology one bad sector crashes whole volume RAID0:

    @scottalanmiller said in Synology one bad sector crashes whole volume RAID0:

    @donahue said in Synology one bad sector crashes whole volume RAID0:

    Is there some sort of jbod mode or something that is common for wanting a larger drive, giving up the performance of R0? Then, when a drive does fail, it only takes out that drive and not the whole shebang? Is that actually a thing in production use?

    RAID 0 should, like JBOD setups, really just be for ephemeral data, like caches.

    that's not an answer to his question.

    It's the answer he needs, not the answer he wants.

    Individual drives are just "smaller RAID 0s". If you have to worry about the size of the failure domain, it means you can't implement the solution in production.



  • @guyinpv said in Synology one bad sector crashes whole volume RAID0:

    Interestingly, I force turned off the Synology, pulled the drives and did a quick canned air cleanup.

    Turned back on and it came to life. Looking at the drive screen, the count of bad sectors is now at 38.
    This makes no sense, jumping from 0 bad to 38 out of the blue.

    I know enough about drives to know they can recover from bad sectors and avoid those areas of the disk. It's weird to me that it would bring down the entire volume and crash the whole thing over a bad sector.

    It's gonna fail again. And no, it's not weird: it's RAID 0, it needs EVERY sector. Think of it as a password: lose one character and you've lost the entire password.



  • @guyinpv said in Synology one bad sector crashes whole volume RAID0:

    The NAS has 2 x 4TB WD Red drives. I'm running them striped since I wanted more space and perhaps speed.

    ...

    A single SATA drive is already faster than a 1 Gbps network link, so there was no speed advantage. Since you didn't need the space, all you did was add risk by running RAID 0.



  • @harry-lui said in Synology one bad sector crashes whole volume RAID0:

    @guyinpv said in Synology one bad sector crashes whole volume RAID0:

    The NAS has 2 x 4TB WD Red drives. I'm running them striped since I wanted more space and perhaps speed.

    ...

    A single SATA drive is already faster than a 1 Gbps network link, so there was no speed advantage. Since you didn't need the space, all you did was add risk by running RAID 0.

    Yes, it was a risk. The NAS was originally just going to be an external backup for the server. I only used RAID 0 for the combined space, which is close to what my server has (it uses RAID 10 and 4 drives).

    Frankly I just thought it would be more robust. I mean, I know it "can" fail, just didn't think it would be within a year. I also know my car tires can get blowouts, but I don't expect one every month or two either.

    I'll probably replace this WD Red now it's at 39 bad sectors. Redo the RAID with a mirror instead. I'll lose the space but I don't expect to use up 4TB soon anyway.



  • Aren't we discussing something like 7TB usable space? Why is this even a question? Two 8TB drives would give you 8TB usable in a RAID1.

    There is zero benefit listed with what has been described so far with regards to RAID0.
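
    The capacity trade-off being argued about, as a quick sketch. It uses nominal drive sizes and ignores formatting overhead (which is why the volume in this thread shows up as "over 7TB" rather than 8TB); the function name is just for illustration.

```python
def usable_tb(level, drive_tb, n_drives):
    """Nominal usable capacity for equal-size drives, ignoring filesystem overhead."""
    if level == "RAID0":
        return drive_tb * n_drives   # striping: every drive adds space, none adds safety
    if level == "RAID1":
        return drive_tb              # mirroring: every drive holds the same data
    raise ValueError(f"unsupported RAID level: {level}")


print("2 x 4TB RAID0:", usable_tb("RAID0", 4, 2), "TB")
print("2 x 4TB RAID1:", usable_tb("RAID1", 4, 2), "TB")
print("2 x 8TB RAID1:", usable_tb("RAID1", 8, 2), "TB")
```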



  • @guyinpv said in Synology one bad sector crashes whole volume RAID0:

    Frankly I just thought it would be more robust. I mean, I know it "can" fail, just didn't think it would be within a year. I also know my car tires can get blowouts, but I don't expect one every month or two either.

    Statistically, you'd expect it to be around a year. RAID 0 is incredibly unstable because it takes the risk of a single drive and magnifies it dramatically. So if the average failure of a single drive is, maybe, once every six years, RAID 0 with four drives would make that every 1.5 years on average. And that's just an average. So well inside the bell curve are failures at six months and three years.

    And RAID 0 has failure modes that cause total data loss where the same fault on a single drive would not. The RAIDing process makes RAID 0 astronomically more dangerous than just 4x the risk of a lone drive.
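
    The back-of-the-envelope math behind those numbers, as a sketch. It assumes drives fail independently at a constant rate (a simplification, and not a measured figure for any real drive) and takes the six-year single-drive average from the example above. With that assumption the array dies at the first drive failure, so the average time to failure divides by the number of drives.

```python
import math


def raid0_mean_years(single_drive_mean_years, n_drives):
    """Average time until the first of n drives fails, assuming independent constant-rate failures."""
    return single_drive_mean_years / n_drives


def prob_fail_within(years, single_drive_mean_years, n_drives):
    """Chance that at least one of n drives fails within `years` under the same assumption."""
    rate = n_drives / single_drive_mean_years   # combined failures per year
    return 1 - math.exp(-rate * years)


print(raid0_mean_years(6, 4))   # the four-drive example above
print(raid0_mean_years(6, 2))   # the two-drive array in this thread
print(round(prob_fail_within(1, 6, 2), 3))   # two drives, failure inside the first year
```

    This only counts whole-drive failures. It does not capture the extra RAID 0 failure modes mentioned above, so treat it as a floor on the risk rather than an estimate.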



  • Ah so you actually have two disks with 4TB each and went with the "I need more space" RAID0.

    The fix here is bigger disks and RAID1 in that case, it's going to be slow (being 5400 RPM) but at least you have the protection you were looking for.

    Granted this is backup only.



  • @dustinb3403 said in Synology one bad sector crashes whole volume RAID0:

    Ah so you actually have two disks with 4TB each and went with the "I need more space" RAID0.

    The fix here is bigger disks and RAID1 in that case, it's going to be slow (being 5400 RPM) but at least you have the protection you were looking for.

    Granted this is backup only.

    I'm guessing he didn't buy the drives, but already had them.



  • @scottalanmiller That doesn't matter in terms of the discussion. If backup space is a priority and you can't make it work with the equipment you have you need to purchase more space.

    RAID0 is a non-production setup in most cases, as was discussed above. For backups, yeah okay maybe you can skate by. But why risk it if these are the only backups you might have?



  • @scottalanmiller said in Synology one bad sector crashes whole volume RAID0:

    @dustinb3403 said in Synology one bad sector crashes whole volume RAID0:

    Ah so you actually have two disks with 4TB each and went with the "I need more space" RAID0.

    The fix here is bigger disks and RAID1 in that case, it's going to be slow (being 5400 RPM) but at least you have the protection you were looking for.

    Granted this is backup only.

    I'm guessing he didn't buy the drives, but already had them.

    I did buy them new, but the Synology case itself was expensive too. WD REDs at 4TB * 2 just hit the budget. The whole thing was like $800+.

    I can get by with RAID1 and 4TB drives for a while. As a backup, I might not get 5 full backup sets or whatever but it will do with some incrementals.



  • How critical are those backups? If you needed them and lost them, do you lose more than $800?



  • @guyinpv said in Synology one bad sector crashes whole volume RAID0:

    @scottalanmiller said in Synology one bad sector crashes whole volume RAID0:

    @dustinb3403 said in Synology one bad sector crashes whole volume RAID0:

    Ah so you actually have two disks with 4TB each and went with the "I need more space" RAID0.

    The fix here is bigger disks and RAID1 in that case, it's going to be slow (being 5400 RPM) but at least you have the protection you were looking for.

    Granted this is backup only.

    I'm guessing he didn't buy the drives, but already had them.

    I did buy them new, but the Synology case itself was expensive too. WD REDs at 4TB * 2 just hit the budget. The whole thing was like $800+.

    I can get by with RAID1 and 4TB drives for a while. As a backup, I might not get 5 full backup sets or whatever but it will do with some incrementals.

    One full on RAID 1 is better than a hundred on RAID 0.



  • The cost of 2x4TB is basically the same as a single 8TB if we are talking WD Reds. One drive at 8TB would be safer than 2x4TB in RAID 0. You could always add the second drive and make it RAID 1 later.



  • Do you have a backup or do you need the data?

    I do data recovery.



  • Yeah the drives are ~$125 vs ~$260. I'd go with fewer backups and RAID1 over RAID0.

    I have to ask why is there an $800 limit on this system if you know how many backups you need? Especially if you're forced to use RAID0 to meet the requirement of "5 full backup sets". That would seem to say "I need this much protection, what do I need to spend to get that protection in a reasonable way?"



  • @dustinb3403 said in Synology one bad sector crashes whole volume RAID0:

    I have to ask why is there an $800 limit on this system if you know how many backups you need?

    In theory, this budget is either based off of the perceived value of the data and/or it defines their value of their data.

    The "rule of thumb" for figuring this out is that you would spend around a quarter of the data value at max for protection. So if this was the sole protection, they'd see their data (that is specifically affected by this device) as having a value of no more than $3,200. Which is reasonable, it might be rather unimportant data that's just nice to not have to recreate.
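
    That rule of thumb as arithmetic, using the one-quarter ceiling and the $800 spend from this thread. The function names are only for illustration.

```python
def implied_data_value(protection_spend, max_fraction=0.25):
    """Data value implied by a protection budget, if the budget is the most you would spend."""
    return protection_spend / max_fraction


def max_protection_spend(data_value, max_fraction=0.25):
    """Most you would spend protecting data of a given value under the same rule."""
    return data_value * max_fraction


print(implied_data_value(800))      # what an $800 system says about the data
print(max_protection_spend(3200))   # and the reverse direction
```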



  • @scottalanmiller In that scenario then you'd still have to meet the following point I mentioned just above.

    What do I need to spend to protect me from a $3,200 loss? It's clearly more than the money that was spent already, I'd say by about $300.

