SAS SSD vs SAS HDD in a RAID 10?
-
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
So more equipment = more failures. So if you can manage with fewer drives I would strive for that.
Yes, this is true. More drives means more drive failures.
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
Remember that every drive you add also increases the risk that one of the disks fails.
Let's assume the annual failure rate for HDDs is 3% on average, as some studies say. With two disks it's 6%, three disks 9%, four disks 12%, five disks 15%, etc.
It does increase, but not that quickly. With that math, you'd hit 100% with 34 drives. But you never actually get that high, even with 200 drives, you just get close.
And on the inverse, I feel like there's some sort of risk to having only a few really large drives. It's like, maybe too few massive drives are bad and too many tiny drives are bad. Somewhere in that spectrum is a statistical sweet spot, but maybe what I'm currently saying is bs..
Bit failure is related to the size of the drives (number of bits) but annual failure rate doesn't correlate to the size of the drive. Check out backblaze blog for instance on their experience using spinning rust.
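To put quick numbers on the compounding point above (a rough sketch, assuming independent drives and a flat 3% AFR):

```python
# Chance that at least one of n drives fails in a year, assuming
# independent failures and a 3% annual failure rate (AFR) per drive.
afr = 0.03

for n in (2, 5, 10, 34, 200):
    p_any = 1 - (1 - afr) ** n
    print(f"{n:>3} drives: {p_any:.1%} chance of at least one failure")

# 2 drives:   ~5.9%  (not quite the naive 6%)
# 34 drives:  ~64.5% (nowhere near 100%)
# 200 drives: ~99.8% (close to, but never, 100%)
```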
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
Remember that every drive you add also increases the risk that one of the disks fails.
Let's assume the annual failure rate for HDDs is 3% on average, as some studies say. With two disks it's 6%, three disks 9%, four disks 12%, five disks 15%, etc.
It does increase, but not that quickly. With that math, you'd hit 100% with 34 drives. But you never actually get that high, even with 200 drives, you just get close.
And on the inverse, I feel like there's some sort of risk to having only a few really large drives. It's like, maybe too few massive drives are bad and too many tiny drives are bad. Somewhere in that spectrum is a statistical sweet spot, but maybe what I'm currently saying is bs..
Well, mathematically, fewer larger drives present their greatest risk during a prolonged recovery. But the chance that they need to do a recovery at all is lower. If your drives are slow, and recovery takes a really long time, large sizes are riskier.
So RAID 5 and 6 suffer from large drive resilvers more. Fast SSDs in mirrored RAID handle even quite large drives very quickly. It's not the size per se that is an issue, but the time it takes to fill the drive with recovered data.
But the only risk from large drives is that recovery time. So if you run the math, I think you'll find that fewer, larger drives always win out over many smaller drives, because the reduced chance of drive loss overshadows the increased risk of a secondary failure during a resilver. The faster the drives, the more pronounced the overshadowing. If there is ever a tipping point, it is with parity on very slow drives (think 5400 RPM).
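To illustrate the rebuild-window point (a rough sketch with illustrative rebuild speeds, assuming independent failures at a 3% AFR spread evenly over the year), the exposure scales with how long the resilver takes, not with capacity per se:

```python
# Rough sketch: chance the surviving drive in a mirror fails while its
# partner is being rebuilt. Illustrative numbers only; assumes independent
# failures at a 3% AFR spread evenly across the year.
HOURS_PER_YEAR = 8760
afr = 0.03

def p_failure_during(hours):
    """Probability that one drive fails within the given time window."""
    return 1 - (1 - afr) ** (hours / HOURS_PER_YEAR)

# 12 TB 5400 RPM drive rebuilding at ~150 MB/s: roughly 22 hours
slow_hdd_hours = 12e12 / (150e6 * 3600)
# 3.84 TB SAS SSD rebuilding at ~1 GB/s: roughly 1 hour
fast_ssd_hours = 3.84e12 / (1e9 * 3600)

print(f"HDD mirror, ~{slow_hdd_hours:.0f} h rebuild: "
      f"{p_failure_during(slow_hdd_hours):.4%} risk of a second failure")
print(f"SSD mirror, ~{fast_ssd_hours:.1f} h rebuild: "
      f"{p_failure_during(fast_ssd_hours):.4%} risk of a second failure")
```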
-
Semi-On Topic: BackBlaze publishes reliability data for the tens of thousands of drives in their fleet.
https://www.backblaze.com/b2/hard-drive-test-data.html
EDIT: Which is contrary to the drive manufacturers' ban on publishing such statistics, AFAIR.
-
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
Remember that every drive you add also increases the risk that one of the disks fails.
Let's assume the annual failure rate for HDDs is 3% on average, as some studies say. With two disks it's 6%, three disks 9%, four disks 12%, five disks 15%, etc.
It does increase, but not that quickly. With that math, you'd hit 100% with 34 drives. But you never actually get that high, even with 200 drives, you just get close.
And on the inverse, I feel like there's some sort of risk to having only a few really large drives. It's like, maybe too few massive drives are bad and too many tiny drives are bad. Somewhere in that spectrum is a statistical sweet spot, but maybe what I'm currently saying is bs..
Bit failure is related to the size of the drives (number of bits) ....
Sort of. URE failures are a factor of the size of the array minus parity, not of individual drive size.
So an array with 10TB of usable space carries the same URE risk (within measurable margins) whether it is 3x 5TB drives, 11x 1TB drives, 22x 500GB drives, or 44x 250GB drives.
On a per drive basis, each drive is at greater risk, sure. But to a resilver operation, the risk is identical between the different topologies of the same size array.
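A rough way to see that (a sketch assuming a spec-sheet URE rate of 1 per 10^15 bits, common for enterprise drives, and a full re-read of the array's data during the resilver):

```python
import math

# Rough sketch: probability of hitting at least one unrecoverable read
# error (URE) while re-reading 10 TB of data during a resilver, assuming
# a spec-sheet URE rate of 1 error per 1e15 bits read.
ure_per_bit = 1e-15
array_bytes = 10e12              # 10 TB of data, regardless of drive layout
bits_read = array_bytes * 8

p_ure = 1 - math.exp(-ure_per_bit * bits_read)   # Poisson approximation
print(f"Reading 10 TB: {p_ure:.1%} chance of at least one URE")
# ~7.7% whether that 10 TB comes from 3x 5 TB, 11x 1 TB, or 44x 250 GB drives
```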
-
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
Remember that every drive you add also increases the risk that one of the disks fails.
Let's assume the annual failure rate for HDDs is 3% on average, as some studies say. With two disks it's 6%, three disks 9%, four disks 12%, five disks 15%, etc.
It does increase, but not that quickly. With that math, you'd hit 100% with 34 drives. But you never actually get that high, even with 200 drives, you just get close.
And on the inverse, I feel like there's some sort of risk to having only a few really large drives. It's like, maybe too few massive drives are bad and too many tiny drives are bad. Somewhere in that spectrum is a statistical sweet spot, but maybe what I'm currently saying is bs..
Well, mathematically, fewer larger drives present their greatest risk during a prolonged recovery. But the chance that they need to do a recovery at all is lower. If your drives are slow, and recovery takes a really long time, large sizes are riskier.
So RAID 5 and 6 suffer from large drive resilvers more. Fast SSDs in mirrored RAID handle even quite large drives very quickly. It's not the size per se that is an issue, but the time it takes to fill the drive with recovered data.
But the only risk from large drives is that recovery time. So if you run the math, I think you'll find that fewer, larger drives always win out over many smaller drives, because the reduced chance of drive loss overshadows the increased risk of a secondary failure during a resilver. The faster the drives, the more pronounced the overshadowing. If there is ever a tipping point, it is with parity on very slow drives (think 5400 RPM).
hmm.. well now I don't know what to do. It's either RAID 1, 5 or 6 with SSD drives. Price tag aside, I just want whatever is going to be most reliable for the server. Pure and simple, I want to minimize the chances that a volume fails and the server goes down and I have to restore from a backup.
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
Remember that every drive you add also increases the risk that one of the disks fails.
Let's assume the annual failure rate for HDDs is 3% on average, as some studies say. With two disks it's 6%, three disks 9%, four disks 12%, five disks 15%, etc.
It does increase, but not that quickly. With that math, you'd hit 100% with 34 drives. But you never actually get that high, even with 200 drives, you just get close.
And on the inverse, I feel like there's some sort of risk to having only a few really large drives. It's like, maybe too few massive drives are bad and too many tiny drives are bad. Somewhere in that spectrum is a statistical sweet spot, but maybe what I'm currently saying is bs..
Well, mathematically, fewer larger drives present their greatest risk during a prolonged recovery. But the chance that they need to do a recovery at all is lower. If your drives are slow, and recovery takes a really long time, large sizes are riskier.
So RAID 5 and 6 suffer from large drive resilvers more. Fast SSDs in mirrored RAID handle even quite large drives very quickly. It's not the size per se that is an issue, but the time it takes to fill the drive with recovered data.
But the only risk from large drives is that recovery time. So if you run the math, I think you'll find that fewer, larger drives always win out over many smaller drives, because the reduced chance of drive loss overshadows the increased risk of a secondary failure during a resilver. The faster the drives, the more pronounced the overshadowing. If there is ever a tipping point, it is with parity on very slow drives (think 5400 RPM).
hmm.. well now I don't know what to do. It's either RAID 1, 5 or 6 with SSD drives. Price tag aside, I just want whatever is going to be most reliable for the server.
Under normal conditions, RAID 1 blows everything else out of the water on reliability. It's protected mathematically by having the fewest parts that can cause a failure, and it is vastly simpler in implementation, making any given implementation significantly less likely to fail. So both the drives themselves and the RAID mechanism are at their maximum for safety.
Any variation, whether adding parity or striping, takes what RAID 1 has and adds both mechanical (disk) and logical (implementation) risks. It's impossible for any other RAID level to approach RAID 1 in safety.
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
Pure and simple, I want to minimize the chances that a volume fails and the server goes down and I have to restore from a backup.
Then RAID 1 is always your best choice, regardless of the medium it is protecting. RAID 1 is just as good as we can make it. It's the "two bricks" approach. Almost nothing to go wrong. And in real world testing, it's so stable it is impossible to measure failure rates on it. You start talking about it in atomic decay terms.
And if you want to take it to insanity levels, get a third drive and triple mirror. You get into "humanity has never witnessed a failure" level of reliability. You are more likely to be killed inside the datacenter by frozen poop falling from a passing plane than to lose those drives.
-
People don't do RAID 1 with triple drives, because at that point your drives are so safe that it's silly to spend money there. You'd be better spending it somewhere else. It's not your drives that fail, but other components.
-
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
Pure and simple, I want to minimize the chances that a volume fails and the server goes down and I have to restore from a backup.
Then RAID 1 is always your best choice, regardless of the medium it is protecting. RAID 1 is just as good as we can make it. It's the "two bricks" approach. Almost nothing to go wrong. And in real world testing, it's so stable it is impossible to measure failure rates on it. You start talking about it in atomic decay terms.
And if you want to take it to insanity levels, get a third drive and triple mirror. You get into "humanity has never witnessed a failure" level of reliability. You are more likely to be killed inside the datacenter by frozen poop falling from a passing plane than to lose those drives.
ok, well if I want to do a RAID 1 then, I've got these as options as they are almost 4TB:
- 3.84TB SSD SAS Read Intensive 12Gbps 512n 2.5in Hot-plug Drive, PX05SR,1 DWPD,7008 TBW - $4,673.84 /ea.
- 3.84TB SSD SAS Read Intensive 12Gb 512e 2.5in Hot-plug Drive, PM1633a,1 DWPD,7008 TBW - $4,391.49 /ea.
- 3.84TB SSD SATA Read Intensive 6Gbps 512n 2.5in Hot-plug Drive, PM863a - $3,262.09 /ea.
- 3.84TB SSD SATA Read Intensive 6Gbps 512e 2.5in Hot-plug Drive, S4500,1 DWPD,7008 TBW - $3,262.09 /ea.
And I could toss out the H740P and go back to the H330
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
ok, well if I want to do a RAID 1 then, I've got these as options as they are almost 4TB:
- 3.84TB SSD SAS Read Intensive 12Gbps 512n 2.5in Hot-plug Drive, PX05SR,1 DWPD,7008 TBW - $4,673.84 /ea.
- 3.84TB SSD SAS Read Intensive 12Gb 512e 2.5in Hot-plug Drive, PM1633a,1 DWPD,7008 TBW - $4,391.49 /ea.
- 3.84TB SSD SATA Read Intensive 6Gbps 512n 2.5in Hot-plug Drive, PM863a - $3,262.09 /ea.
- 3.84TB SSD SATA Read Intensive 6Gbps 512e 2.5in Hot-plug Drive, S4500,1 DWPD,7008 TBW - $3,262.09 /ea.
And I could toss out the H740P and go back to the H330
You pay a severe Dell tax on those prices.
PM863a is a Samsung drive and the real price is around $1500.
The S4500 is Intel, but it's an older, slower model; the newer one is the S4510. Real price on the newer model is around $1500. I don't have prices on the PX05SR (Toshiba) or PM1633a (Samsung), but similar drives like the HGST Ultrastar SS300 run around $2800 and the Seagate 1200.2 around $2500.
With real price I mean what you pay if you buy one drive from just about anywhere.
I wouldn't waste any money on SAS 12Gbps drives (unless you need dual port) because if you need maximum performance U.2 NVMe is what you want. Don't be fooled by "read intensive" either - 1 DWPD means you can write 3.8TB per day for 5 years.
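The arithmetic behind that endurance figure, for what it's worth:

```python
# Where the "1 DWPD, 7008 TBW" rating on a 3.84 TB drive comes from.
capacity_tb = 3.84
dwpd = 1                 # drive writes per day
warranty_years = 5

tbw = capacity_tb * dwpd * 365 * warranty_years
print(f"{tbw:.0f} TB written over the warranty period")   # 7008 TBW
```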
-
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
ok, well if I want to do a RAID 1 then, I've got these as options as they are almost 4TB:
- 3.84TB SSD SAS Read Intensive 12Gbps 512n 2.5in Hot-plug Drive, PX05SR,1 DWPD,7008 TBW - $4,673.84 /ea.
- 3.84TB SSD SAS Read Intensive 12Gb 512e 2.5in Hot-plug Drive, PM1633a,1 DWPD,7008 TBW - $4,391.49 /ea.
- 3.84TB SSD SATA Read Intensive 6Gbps 512n 2.5in Hot-plug Drive, PM863a - $3,262.09 /ea.
- 3.84TB SSD SATA Read Intensive 6Gbps 512e 2.5in Hot-plug Drive, S4500,1 DWPD,7008 TBW - $3,262.09 /ea.
And I could toss out the H740P and go back to the H330
You pay a severe Dell tax on those prices.
PM863a is a Samsung drive and the real price is around $1500.
The S4500 is Intel, but it's an older, slower model; the newer one is the S4510. Real price on the newer model is around $1500. I don't have prices on the PX05SR (Toshiba) or PM1633a (Samsung), but similar drives like the HGST Ultrastar SS300 run around $2800 and the Seagate 1200.2 around $2500.
With real price I mean what you pay if you buy one drive from just about anywhere.
I wouldn't waste any money on SAS 12Gbps drives (unless you need dual port) because if you need maximum performance U.2 NVMe is what you want. Don't be fooled by "read intensive" either - 1 DWPD means you can write 3.8TB per day for 5 years.
Damn Dell prices... They are so high. I see on xbyte, the PM863a is a lot cheaper, though I can't tell if that's a used/refurb part. What other places would you suggest I look?
-
Disctech may have good prices too.
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
I see on xbyte, the PM863a is a lot cheaper, though I can't tell if that's a used/refurb part. What other places would you suggest I look?
It's just a bit long in the tooth at this point (that drive's been around 2-3 years since the refresh). It's also low-endurance TLC with a smallish DRAM and SLC buffer. It's not going to take sustained write throughput very well.
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@dustinb3403 said in SAS SSD vs SAS HDD in a RAID 10?:
OBR5 is the standard if you are going to be using an SSD
Are there any good sources that express that as best practice? I'm looking for myself now too and by the way....
There can never be a best practice of this sort. It's standard practice to start with RAID 5 for SSD due to the risk types and levels, but not on HDs for the same reason. RAID 10 tends to saturate RAID controllers with SSD, but not with HDs.
As with all RAID, it comes down to price / risk / performance. And for most deployments, RAID 5 gives the best blend with SSDs; and RAID 10 gives the best blend for HDs. But in both cases, RAID 6 is the second most popular choice, and RAID 10 is an option with SSDs.
With SSDs, you rarely do RAID 10. If you really need the speed, you tend to do RAID 1 with giant NVMe cards instead.
Yeah, sorry, I guess I shouldn't have said "best practice". I was more or less looking for some information that would help validate what Dustin said. I wanted to look into it more and educate myself as much as possible.
Well I think if I am able to go with the SSD drives, I will do a RAID 6. I am creating a few different server builds as options that display different levels of performance and cost.
Speaking of my RAID card, I am looking at the H740P which has 8GB of NV cache memory and flash backed cache. I still need to educate myself on this stuff as well because I'm not sure if this is overkill or not. My other option was the H330, which has none of that.
EDIT: Never mind the H330; it doesn't offer RAID 6 as an option.
NV Cache is important. Having battery-backed cache means there's a maintenance item in the batteries. They wear out or outright die at some point, impacting performance because the RAID engine will flip over to Write-Through when they disappear. The performance pain would be noticeable on rust, maybe not so much on SSD, depending on throughput needs.
With today's RAID engines running dual or more processors, having more cache RAM is a good thing. More than 4GB of cache RAM? It depends on what the setup will be and which advanced features would be utilized, such as SSD cache add-ons if using a combination of SSD and rust.
Since this setup will be SQL I suggest running a Telegraf/InfluxDB/Grafana setup to baseline the current SQL server's usage patterns. That would give a really good big picture and close-up view to work from and extrapolate future performance needs as things grow.
Suffice it to say, we'd run with the maximum count of smaller-capacity SAS SSDs in RAID 6 with a 2GB-minimum NVRAM RAID controller. That should yield at least 15K IOPS per disk and more than enough MiB/second throughput.
Suggestion: Make sure the entire storage stack is set up at 64KB block sizes to maximize the balance between IOPS and throughput.
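Back-of-the-envelope on how block size trades IOPS against throughput (illustrative only, using the ~15K IOPS per disk figure above; real drives won't sustain identical IOPS at every I/O size):

```python
# Throughput is just IOPS x I/O size, so the block size picks where you
# land on the IOPS-vs-MiB/s trade-off. Illustrative figures only.
iops_per_disk = 15_000   # the per-disk figure assumed above

for block_kib in (4, 32, 64, 256):
    mib_per_sec = iops_per_disk * block_kib / 1024
    print(f"{block_kib:>3} KiB blocks: {mib_per_sec:>6.0f} MiB/s per disk at {iops_per_disk} IOPS")
```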
-
@phlipelder said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@scottalanmiller said in SAS SSD vs SAS HDD in a RAID 10?:
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
@dustinb3403 said in SAS SSD vs SAS HDD in a RAID 10?:
OBR5 is the standard if you are going to be using an SSD
Are there any good sources that express that as best practice? I'm looking for myself now too and by the way....
There can never be a best practice of this sort. It's standard practice to start with RAID 5 for SSD due to the risk types and levels, but not on HDs for the same reason. RAID 10 tends to saturate RAID controllers with SSD, but not with HDs.
As with all RAID, it comes down to price / risk / performance. And for most deployments, RAID 5 gives the best blend with SSDs; and RAID 10 gives the best blend for HDs. But in both cases, RAID 6 is the second most popular choice, and RAID 10 is an option with SSDs.
With SSDs, you rarely do RAID 10. If you really need the speed, you tend to do RAID 1 with giant NVMe cards instead.
Yeah, sorry, I guess I shouldn't have said "best practice". I was more or less looking for some information that would help validate what Dustin said. I wanted to look into it more and educate myself as much as possible.
Well I think if I am able to go with the SSD drives, I will do a RAID 6. I am creating a few different server builds as options that display different levels of performance and cost.
Speaking of my RAID card, I am looking at the H740P which has 8GB of NV cache memory and flash backed cache. I still need to educate myself on this stuff as well because I'm not sure if this is overkill or not. My other option was the H330, which has none of that.
EDIT: Never mind the H330; it doesn't offer RAID 6 as an option.
NV Cache is important. Having battery-backed cache means there's a maintenance item in the batteries. They wear out or outright die at some point, impacting performance because the RAID engine will flip over to Write-Through when they disappear. The performance pain would be noticeable on rust, maybe not so much on SSD, depending on throughput needs.
With today's RAID engines running dual or more processors, having more cache RAM is a good thing. More than 4GB of cache RAM? It depends on what the setup will be and which advanced features would be utilized, such as SSD cache add-ons if using a combination of SSD and rust.
Since this setup will be SQL I suggest running a Telegraf/InfluxDB/Grafana setup to baseline the current SQL server's usage patterns. That would give a really good big picture and close-up view to work from and extrapolate future performance needs as things grow.
Something like Grafana is only the front-end, right? Would InfluxDB be the logging component? I would be interested in gathering performance data, but I fear that setting something up would be time-consuming and end up not working, as most of this stuff seems to go that way.
Suffice it to say, we'd run with the maximum count of smaller-capacity SAS SSDs in RAID 6 with a 2GB-minimum NVRAM RAID controller. That should yield at least 15K IOPS per disk and more than enough MiB/second throughput.
Yeah, I was also considering RAID 6 with 5 SAS SSD drives, but then in the discussion on here some were saying RAID 1 or 5 would be good too. I'm still undecided though.
Suggestion: Make sure the entire storage stack is set up at 64KB block sizes to maximize the balance between IOPS and throughput.
Hmm.. I usually leave the defaults on that sort of thing until I know more about the technology. Is 64K usually the default?
-
@dave247 said in SAS SSD vs SAS HDD in a RAID 10?:
Hmm.. I usually leave the defaults on that sort of thing until I know more about the technology. Is 64K usually the default?
No. When we deploy (note that we deploy on Storage Spaces), we make sure the stack from the platters/SSD up to the OS is configured with 64KB block sizes for database-driven systems. There are exceptions to the rule, such as highly active IOPS setups with smaller write sizes, which could push that stack to 32KB to get more IOPS out.
For setups that only need fairly mundane day-to-day file work, 128KB or 256KB (the usual default) is okay.
For archival storage, or storage that hosts something like 4K video files, we'd push out to 512KB or 1024KB depending on the network fabric.
-
@phlipelder I don't understand why we keep talking about SAS/SATA SSDs and RAID performance when it's a dead technology, suitable for legacy applications only?
NVMe drives are many hundreds of percent faster, have much higher IOPS, lower latency and the software stack is much more optimized.
-
@pete-s said in SAS SSD vs SAS HDD in a RAID 10?:
@phlipelder I don't understand why we keep talking about SAS/SATA SSDs and RAID performance when it's a dead technology, suitable for legacy applications only?
NVMe drives are many hundreds of percent faster, have much higher IOPS, lower latency and the software stack is much more optimized.
NVMe is nowhere near as mature a technology as SAS is. The resilience that's built in to SAS is just not there yet with NVMe. That's why Hyper-Converged is such a big thing:
Locally attached storage, such as NVMe, shared out across nodes, with resilience built in at the node-local storage level and up.
-
I'm currently reading all about 512n vs 512e, but I'm not certain which I should go with. Any recommendations?