RAID rebuild times 16TB drive



  • Thought it might be interesting for some of you to know that it takes roughly 24 hours to rebuild a RAID array with 16TB drives - if you leave it alone.

    Modern high-capacity drives have very high-density platters, so as capacity has increased, so has the sustained transfer rate (sequential read and write speed).

    An array rebuild is basically reading from one or several drives, calculating any parity data and then writing the reconstructed data to the drive that is being rebuilt. On a modern system, reading from the other drives and calculating any parity data is as fast as or faster than writing to a single drive. So the bottleneck in the rebuild operation is the sequential write speed of the single drive. That's why it doesn't matter what kind of array it is or how many drives it has - rebuilding takes the same time with the same drive.
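    As a toy illustration of the parity step (a sketch only - real RAID 5 works on whole striped blocks, and RAID 6 adds Reed-Solomon math on top), a lost RAID 5 block is simply the XOR of the surviving data and parity blocks:

```shell
# Toy RAID 5 reconstruction on single bytes: the lost block is the
# XOR of the surviving data block and the parity block.
d1=$((0x5A)); d2=$((0x3C))       # hypothetical data bytes
parity=$((d1 ^ d2))              # computed when the stripe was written
rebuilt=$((parity ^ d2))         # recover d1 after its drive fails
printf 'rebuilt: 0x%X\n' "$rebuilt"   # prints "rebuilt: 0x5A"
```

    The same reconstruction runs for every stripe on the failed drive, which is why the whole exercise reduces to one long sequential write.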

    The drives I used were Seagate Exos X16 16TB. They have a write speed of up to 260MB/sec. So you start at that speed, and as the rebuild progresses the transfer rate drops, because the heads move toward the center of the platter where the circumference is smaller; since the disk still spins at the same rpm, the surface speed under the heads is lower. The transfer rate was around 140MB/sec at the end.
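    A back-of-the-envelope check supports the ~24 hour figure. Assuming the speed falls roughly linearly from 260 to 140MB/sec, the average is about 200MB/sec:

```shell
# Rough rebuild-time estimate for a 16TB drive (16,000,000 MB).
capacity_mb=$((16 * 1000 * 1000))
avg_speed=$(( (260 + 140) / 2 ))           # MB/sec, assumed linear falloff
hours=$(( capacity_mb / avg_speed / 3600 ))
echo "about ${hours} hours"                # prints "about 22 hours"
```

    Add a little overhead and the observed ~24 hours is right in line.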


    NOTE
    Keep in mind that there can be settings in the RAID controller that control how fast the rebuild process is allowed to go.

    In md RAID (Linux software RAID) there are minimum and maximum speed settings, given in KiB/sec per device. If you set both higher than the transfer rate of the drive, the system will rebuild the drive as fast as it possibly can.

    echo 300000 >/proc/sys/dev/raid/speed_limit_min
    echo 300000 >/proc/sys/dev/raid/speed_limit_max
    

    If you use the array during the rebuild, the sequential speed will drop significantly.

    If you need maximum performance from your array during the day but want to rebuild as fast as possible during the night, you can change the min and max values at any time to throttle or unleash the rebuild.
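    For example, a crontab along these lines (a sketch - the 10000 KiB/sec daytime cap is an arbitrary illustrative value) throttles rebuilds during office hours and lets them run flat out overnight:

```shell
# m h dom mon dow  command              (md values are KiB/sec)
0 8  * * *  echo 10000  > /proc/sys/dev/raid/speed_limit_max
0 20 * * *  echo 300000 > /proc/sys/dev/raid/speed_limit_max
```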

    On hardware RAID controllers, such as those in Dell servers and others, there might be a rebuild rate setting expressed in %. The default is usually 30%, so it might be wise to change it when you want to rebuild as fast as possible. Otherwise the rebuild will take a lot longer even if the array is idle.
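    On MegaRAID-based controllers (which includes Dell PERC) the rebuild rate can usually be inspected and raised from the command line. This is a sketch using storcli-family syntax - the binary name (storcli64, perccli64) and exact options vary by vendor, so check your controller's documentation:

```shell
# Show the current rebuild rate, then give rebuilds full priority.
storcli64 /c0 show rebuildrate
storcli64 /c0 set rebuildrate=100
```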



  • Yeah not bad if there is no load on the disks and you can dedicate all resources to a rebuild.



  • @Pete-S said in RAID rebuild times 16TB drive:

    On a modern system, reading from the other drives and calculating any parity data is as fast as or faster than writing to a single drive.

    Still, lots and lots of systems today can't do that. The rule of thumb is still that the read, not the write, is the bottleneck to measure, except with RAID 1.



  • @scottalanmiller said in RAID rebuild times 16TB drive:

    @Pete-S said in RAID rebuild times 16TB drive:

    On a modern system, reading from the other drives and calculating any parity data is as fast as or faster than writing to a single drive.

    Still, lots and lots of systems today can't do that. The rule of thumb is still that the read, not the write, is the bottleneck to measure, except with RAID 1.

    I'm not exactly sure what you mean, but if you have no other I/O going on, a RAID rebuild is a sequential streaming operation: concurrent reads from all the other drives in the array and a write to one.

    You could have some I/O contention if you have lots of drives and not enough bandwidth, but with HDDs that is unlikely because they are relatively slow. Is that what you had in mind?

    On many SSDs, performance is asymmetric, with faster reads than writes.



  • @Obsolesce said in RAID rebuild times 16TB drive:

    Yeah not bad if there is no load on the disks and you can dedicate all resources to a rebuild.

    Yes, I thought so too.

    On some of our older systems using 4TB drives, the rebuild time is 8 hours.



  • @Pete-S said in RAID rebuild times 16TB drive:

    @scottalanmiller said in RAID rebuild times 16TB drive:

    @Pete-S said in RAID rebuild times 16TB drive:

    On a modern system, reading from the other drives and calculating any parity data is as fast as or faster than writing to a single drive.

    Still, lots and lots of systems today can't do that. The rule of thumb is still that the read, not the write, is the bottleneck to measure, except with RAID 1.

    I'm not exactly sure what you mean, but if you have no other I/O going on, a RAID rebuild is a sequential streaming operation: concurrent reads from all the other drives in the array and a write to one.

    You could have some I/O contention if you have lots of drives and not enough bandwidth, but with HDDs that is unlikely because they are relatively slow. Is that what you had in mind?

    On many SSDs, performance is asymmetric, with faster reads than writes.

    It's a system bottleneck, not an I/O bottleneck, typically. Especially with RAID 6: it's parity math that runs on a single thread.



  • Just so I understand what you are saying: you are indicating that neither the array size nor the RAID type matters, and that it will simply take ~24hrs to fully write one of those 16TB drives if the system isn't doing anything other than a rebuild?

    So, for example, a RAID 1 16TB mirror would have the same rebuild time as a RAID 6 32TB array (4 x 16TB) or a RAID 10 32TB array (4 x 16TB)? I must be misunderstanding.



  • @biggen said in RAID rebuild times 16TB drive:

    Just so I understand what you are saying: you are indicating that neither the array size nor the RAID type matters, and that it will simply take ~24hrs to fully write one of those 16TB drives if the system isn't doing anything other than a rebuild?

    So, for example, a RAID 1 16TB mirror would have the same rebuild time as a RAID 6 32TB array (4 x 16TB) or a RAID 10 32TB array (4 x 16TB)? I must be misunderstanding.

    Assuming no bottlenecks, then yes, the rebuild time will be the same regardless of the size of the array.

    It's the same amount of data (one full drive's worth) that has to be written to the new empty drive regardless of the size of the array.

    rebuilding_raid6.png

    rebuilding_raid1.png



  • @biggen said in RAID rebuild times 16TB drive:

    Just so I understand what you are saying: you are indicating that neither the array size nor the RAID type matters, and that it will simply take ~24hrs to fully write one of those 16TB drives if the system isn't doing anything other than a rebuild?

    Right, as long as the system can dedicate enough resources to the process so that the only bottleneck is the write process. This has always been the case and always been known. What people almost never observe is a situation where you have a system fast enough to do this. Typically the parity reconstruction uses so much CPU and cache that there are bottlenecks. That's why on a typical NAS, for example, we see reconstruction taking weeks or months. Both because the system is bottlenecked heavily, and because it is still in use a little.



  • @biggen said in RAID rebuild times 16TB drive:

    So, for example, a RAID 1 16TB mirror would have the same rebuild time as a RAID 6 32TB array (4 x 16TB) or a RAID 10 32TB array (4 x 16TB)? I must be misunderstanding.

    Under the condition that the only bottleneck in question is the write speed of the receiving drive, you'd absolutely expect the rebuild time to be the same, because the source is taken out of the equation (by the phrase "no other bottlenecks") and the only factor in play is the write speed of that one drive. So of course it will be identical.



  • @scottalanmiller @Pete-S

    Excellent. Thanks for that explanation guys and that nifty diagram Pete!

    I guess I was skeptical that I had understood what @Pete-S said correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't idle. There was still I/O on those arrays AND a possible CPU/cache bottleneck.



  • @biggen said in RAID rebuild times 16TB drive:

    @scottalanmiller @Pete-S

    Excellent. Thanks for that explanation guys and that nifty diagram Pete!

    I guess I was skeptical that I had understood what @Pete-S said correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't idle. There was still I/O on those arrays AND a possible CPU/cache bottleneck.

    We don't see any bottlenecks on our software RAID 6 arrays, but they run bare metal on standard servers. That might be atypical, I don't know.

    But I think regular I/O has a much bigger effect than any other bottleneck. I can see how the MB/sec takes a nosedive when there is activity on the array during a rebuild.

    If you think about it, when the array is only rebuilding, it's just doing sequential reads and writes, and at that hard drives are up to 50% as fast as SATA SSDs. But when other I/O comes in, it becomes a question of IOPS, and hard drives are really bad at those, with only about 1% of the IOPS of an SSD.
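    Plugging in rough, assumed figures (an HDD at ~200 random IOPS against a SATA SSD at ~20,000) shows where that 1% comes from:

```shell
# Back-of-the-envelope IOPS comparison; both figures are assumptions.
hdd_iops=200
ssd_iops=20000
echo "HDD delivers $((100 * hdd_iops / ssd_iops))% of the SSD's IOPS"
# prints "HDD delivers 1% of the SSD's IOPS"
```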



  • @biggen said in RAID rebuild times 16TB drive:

    I guess I was skeptical that I had understood what @Pete-S said correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't idle. There was still I/O on those arrays AND a possible CPU/cache bottleneck.

    Well, you are still skipping the one key phrase: "no other bottlenecks." According to most reports there are normally extreme bottlenecks (from computational load and/or systems not being completely idle), so the information you are getting is in no way counter to the reports you've already heard.

    You are responding as if you feel that this is somehow different, but it is not.

    It's a bit like hearing that a Chevy Sonic can go 200 mph when dropped out of an airplane, then saying that most people never get theirs over 90 mph, while ignoring the obvious key fact that it's being dropped out of an airplane that lets it go so fast.



  • @Pete-S said in RAID rebuild times 16TB drive:

    We don't see any bottlenecks on our software RAID 6 arrays, but they run bare metal on standard servers. That might be atypical, I don't know.

    Even on bare metal, we normally see a lot of bottlenecks. But normally because almost no one can make their arrays go idle during a rebuild cycle. If they could, they'd not need the rebuild in the first place, typically.


  • Vendor

    @scottalanmiller said in RAID rebuild times 16TB drive:

    It's a system bottleneck, not an I/O bottleneck, typically. Especially with RAID 6: it's parity math that runs on a single thread.

    Distributed storage systems with per-object RAID FTW here. If I have every VMDK running its own rebuild process (vSAN), or every individual LUN/CPG (how Compellent or 3PAR do it), then a given drive failing becomes a giant party across all of the drives in the cluster/system. (This is also how the fancy erasure-coded array systems handle it.)


  • Vendor

    @scottalanmiller said in RAID rebuild times 16TB drive:

    Even on bare metal, we normally see a lot of bottlenecks. But normally because almost no one can make their arrays go idle during a rebuild cycle. If they could, they'd not need the rebuild in the first place, typically.

    Our engineers put in a default "reserve 80% of max throughput for production" I/O scheduler QoS system, so at saturation rebuilds only get 20% and don't murder production I/O. (Note that rebuilds can use 100% if the bandwidth is there for the taking.)


  • Vendor

    @biggen said in RAID rebuild times 16TB drive:

    I guess I was skeptical that I had understood what @Pete-S said correctly, because I've seen so many reports that it's taken days/weeks to rebuild [insert whatever size] TB RAID 6 arrays in the past. But I guess that was because those systems weren't idle. There was still I/O on those arrays AND a possible CPU/cache bottleneck.

    Was the drive full? Smarter new RAID rebuild systems don't rebuild empty LBAs. Every enterprise storage array has done this with rebuilds for the last 20 years...



  • @StorageNinja No personal experience with it. I've only ever run RAID 1 or 10. Just the reading I've done over the years from people reporting how long it took to rebuild larger RAID 6 arrays.

    BTW, are you the same person who is/was over at Spiceworks? I always enjoyed reading your posts on storage. I respect both you and @scottalanmiller in this arena immensely.



  • @biggen said in RAID rebuild times 16TB drive:

    BTW, are you the same person who is/was over at Spiceworks?

    Yes he is.



  • @biggen said in RAID rebuild times 16TB drive:

    BTW, are you the same person who is/was over at Spiceworks? I always enjoyed reading your posts on storage. I respect both you and @scottalanmiller in this arena immensely.

    Yup, he's one of the "Day Zero" founders over here.



  • @StorageNinja said in RAID rebuild times 16TB drive:

    @scottalanmiller said in RAID rebuild times 16TB drive:

    It's a system bottleneck, not an I/O bottleneck, typically. Especially with RAID 6: it's parity math that runs on a single thread.

    Distributed storage systems with per-object RAID FTW here. If I have every VMDK running its own rebuild process (vSAN), or every individual LUN/CPG (how Compellent or 3PAR do it), then a given drive failing becomes a giant party across all of the drives in the cluster/system. (This is also how the fancy erasure-coded array systems handle it.)

    Yeah, that's RAIN and that basically solves everything 🙂