NVMe and RAID?
-
@PhlipElder I didn't see the option when playing with the Dell system builder online. Honestly, NVMe is so ridiculously expensive that I'm not sure it warrants much investigation at this point. Single drives are in the $3k and up range depending on the size.
@scottalanmiller said in NVMe and RAID?:
@biggen said in NVMe and RAID?:
What about 10GbE bonding with SAS SSDs? Would you remove the network bottleneck with 20GbE links?
Bonding will help with bandwidth, but will hurt latency. When you bond, the CPU has to do some work before things are sent down the pipe, so this slows things down. And you don't get 100% efficiency. A two way bond is pretty good, you get something akin to 195% the performance of a single pipe. The third bond is much less. The fourth, way less. And the fifth, so little that no one even discusses it. By the sixth, it's assumed that you are just getting slower, not faster.
What do you suggest then for removing the network bottleneck of 12Gb SAS SSDs? Bonded twin 10GbE links or a single 25GbE connection? I'm assuming the cost of switches and adapter cards goes up moving to 25GbE.
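For a rough sense of the numbers in that question, here is a back-of-the-envelope sketch. It takes the ~195% two-way bond figure quoted above at face value and compares raw line rates only. Protocol overhead and latency are ignored, and the function name is made up for illustration.

```python
# Rough line-rate comparison; ignores protocol overhead and latency.
# The 1.95 factor is the ~195% two-way bond figure from the quoted post,
# not a measured value.

def bonded_gbps(link_gbps: float, efficiency: float = 1.95) -> float:
    """Approximate usable bandwidth of a two-link bond, in Gb/s."""
    return link_gbps * efficiency

bond_2x10 = bonded_gbps(10.0)  # two bonded 10GbE links
single_25 = 25.0               # one 25GbE link
sas_lane = 12.0                # one 12Gb SAS lane

for name, gbps in [
    ("2x10GbE bond", bond_2x10),
    ("1x25GbE", single_25),
    ("12Gb SAS lane", sas_lane),
]:
    print(f"{name}: {gbps:.1f} Gb/s, roughly {gbps / 8:.2f} GB/s")
```

On these assumptions either option has more raw bandwidth than a single 12Gb SAS lane, and the single 25GbE link comes out ahead of the bond without the bonding overhead.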
-
@biggen Right now, the only place we're using NVMe in servers is for either cache in a hybrid storage setting (cache/capacity or cache/performance/capacity) or for servers with all NVMe.
Intel's VROC plug-in dongle enables RAID 1 in certain settings. That's driven by the CPU. Not sure Dell supports it.
For most applications, an R740xd with high performance NV-Cache and SATA SSD in RAID 6 will do. Intel SSD DC S4610 series (D3-S4610).
We have plenty of setups like that for virtualized SQL/database workloads as well as 4K/8K video storage.
EDIT: Forgot, in the Intel Server Systems we deploy we install a couple Intel NVMe drives, the VROC dongle for Intel only NVMe, and RAID 1 them for the host OS.
-
@scottalanmiller said in NVMe and RAID?:
@biggen said in NVMe and RAID?:
@brianlittlejohn So I guess the days of having a Jr. Admin blind swap are over then? It takes much more care and instruction to use software RAID.
That's correct. If you want that level of performance, blind swapping is kind of over. For now.
Not so much, at least for the major server vendors. Every backplane will let you flash the activity light of a failed drive. I know iDRAC and md allow you to do this (haven't had a reason to look into HP or Supermicro, but would be surprised if it were not built in as well).
-
@PhlipElder said in NVMe and RAID?:
@biggen Right now, the only place we're using NVMe in servers is for either cache in a hybrid storage setting (cache/capacity or cache/performance/capacity) or for servers with all NVMe.
Intel's VROC plug-in dongle enables RAID 1 in certain settings. That's driven by the CPU. Not sure Dell supports it.
For most applications, an R740xd with high performance NV-Cache and SATA SSD in RAID 6 will do. Intel SSD DC S4610 series (D3-S4610).
We have plenty of setups like that for virtualized SQL/database workloads as well as 4K/8K video storage.
EDIT: Forgot, in the Intel Server Systems we deploy we install a couple Intel NVMe drives, the VROC dongle for Intel only NVMe, and RAID 1 them for the host OS.
Thanks for that info. Yeah, I'm thinking NVMe is probably overkill for video editing over a network connection, especially considering the fact that he would be network bound anyway. I was thinking either 12Gb SAS SSDs in RAID 1 (2TB+ variety) or 6Gb SATA SSDs in RAID 1. This at least gives the option to go back to hot/blind swap with the appropriate PERC.
@travisdh1 said in NVMe and RAID?:
@scottalanmiller said in NVMe and RAID?:
@biggen said in NVMe and RAID?:
@brianlittlejohn So I guess the days of having a Jr. Admin blind swap are over then? It takes much more care and instruction to use software RAID.
That's correct. If you want that level of performance, blind swapping is kind of over. For now.
Not so much, at least for the major server vendors. Every backplane will let you flash the activity light of a failed drive. I know iDRAC and md allow you to do this (haven't had a reason to look into HP or Supermicro, but would be surprised if it were not built in as well).
That allows you to ID a bum drive but you still have no way to rebuild it automatically like you would in a blind swap, right?
-
@biggen said in NVMe and RAID?:
@PhlipElder said in NVMe and RAID?:
@biggen Right now, the only place we're using NVMe in servers is for either cache in a hybrid storage setting (cache/capacity or cache/performance/capacity) or for servers with all NVMe.
Intel's VROC plug-in dongle enables RAID 1 in certain settings. That's driven by the CPU. Not sure Dell supports it.
For most applications, an R740xd with high performance NV-Cache and SATA SSD in RAID 6 will do. Intel SSD DC S4610 series (D3-S4610).
We have plenty of setups like that for virtualized SQL/database workloads as well as 4K/8K video storage.
EDIT: Forgot, in the Intel Server Systems we deploy we install a couple Intel NVMe drives, the VROC dongle for Intel only NVMe, and RAID 1 them for the host OS.
Thanks for that info. Yeah, I'm thinking NVMe is probably overkill for video editing over a network connection, especially considering the fact that he would be network bound anyway. I was thinking either 12Gb SAS SSDs in RAID 1 (2TB+ variety) or 6Gb SATA SSDs in RAID 1. This at least gives the option to go back to hot/blind swap with the appropriate PERC.
We deployed an Intel Server System R2224WFTZSR 2U dual socket with a pair of Intel Xeon Gold 6240Y processors. We set up two dual-port Intel X540-T2 10GbE network adapters and a pair of LSI SAS HBAs for external SAS cable connections. Its purpose was to host two to four virtual machines for 150 to 300 1080p cameras throughout a building.
Between 5 and 15 of those camera streams would be processed by recognition software and fire e-mail flags off to management staff for various conditions.
Storage is a pair of Intel SSDs for the host OS, a pair of Intel SSD D3-S4610 series in RAID 1 for the high I/O processing, and an HGST 60-bay JBOD loaded with 12TB NearLine SAS drives.
We used Storage Spaces to set up a 3-way mirror on the drives in the JBOD yielding 33% production storage.
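As a quick check on the 33% figure, here is a minimal sketch of the capacity arithmetic. It assumes all 60 bays are populated (the post says "loaded" but gives no drive count) and ignores Storage Spaces metadata and reserve space. The function name is illustrative only.

```python
# Usable capacity of a mirrored pool: raw capacity divided by the number of copies.
# Assumes all 60 bays hold a 12 TB drive and ignores metadata/reserve overhead.

def mirror_usable_tb(drive_count: int, drive_tb: float, copies: int) -> float:
    """Usable TB when every block is stored `copies` times."""
    return drive_count * drive_tb / copies

raw_tb = 60 * 12
usable_tb = mirror_usable_tb(60, 12, 3)
print(f"raw: {raw_tb} TB, usable: {usable_tb:.0f} TB ({usable_tb / raw_tb:.0%})")
```

Under those assumptions the JBOD holds 720 TB raw and about 240 TB of production storage.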
Constant throughput is about 375MB/Second to 495MB/Second depending on how many folks are moving through the building.
We've put a number of other virtual machines on the server to utilize more CPU.
4K video editing is something we have on the radar for these folks as they've started filming their vignettes and other recordings in 4K.
-
@biggen said in NVMe and RAID?:
I was playing around on the Dell configuration website building out an Epyc 2 socket machine with an NVMe backplane. What I noticed is there is no RAID availability for this configuration. How is this handled then if I wanted to put in two identical NVMe U.2 drives and mirror them? Is hardware RAID not an option for this configuration? Is this left to the OS you choose now?
We spec'd a handful of those Epyc 2 Dells with NVMe last year for a hyperconverged cluster.
Intel has VROC which is md raid (software raid) behind the scenes but that doesn't work on AMD CPUs. And you need BIOS support etc.
But I know people who put in 8 NVMe drives and run standard md raid with massive performance numbers.
Blind swap is not a big deal. You can fix that with a simple cron job. If the array is degraded and you put in a new drive in the old slot with the same or larger capacity it will automatically start a rebuild.
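A minimal sketch of what the detection half of such a cron job could look like, written in Python rather than shell. It parses `/proc/mdstat`-style text for arrays with a missing member (an underscore in the `[UU]` status field) and only builds the `mdadm --manage ... --add` command instead of running it. The sample text and the device names are made up for illustration. A real job would also have to detect the new drive in the slot, partition it, and skip arrays that are already rebuilding.

```python
import re

def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return names of md arrays whose status field shows a missing member, e.g. [U_]."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded

def add_command(array: str, new_device: str) -> list[str]:
    """Build (but do not run) the mdadm command that adds a replacement member."""
    return ["mdadm", "--manage", f"/dev/{array}", "--add", new_device]

# Hypothetical /proc/mdstat contents: md0 has lost one of its two members.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 nvme0n1p1[0]
      976630464 blocks super 1.2 [2/1] [U_]

md1 : active raid1 nvme2n1p1[0] nvme3n1p1[1]
      976630464 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""

for array in degraded_arrays(SAMPLE):
    # /dev/nvme1n1p1 stands in for the partition on the freshly inserted drive.
    print(" ".join(add_command(array, "/dev/nvme1n1p1")))
```

In an actual cron job the text would come from reading `/proc/mdstat`, and the command would be executed only after confirming the replacement drive is present and at least as large as the old one.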
-
@PhlipElder What is the camera VMS solution you are using? Milestone? Axis?
-
@Pete-S I'll have to look again then at Intel's offerings. I figured AMD had Intel blown out of the water as far as cost-per-core offerings go nowadays.
-
@biggen said in NVMe and RAID?:
@Pete-S I'll have to look again then at Intel's offerings. I figured AMD had Intel blown out of the water as far as cost-per-core offerings go nowadays.
On a pound for pound basis the AMD EPYC Rome platforms we are working with are less expensive and vastly superior in performance.
-
I don't see any mention of VROC in the system builder for any of the configurations I've done for the Intel systems. I'm guessing that is because Dell wants you to buy a PERC instead.
-
@biggen said in NVMe and RAID?:
I don't see any mention of VROC in the system builder for any of the configurations I've done for the Intel systems. I'm guessing that is because Dell wants you to buy a PERC instead.
Probably.
So long as you're using some Linux-based platform for the host, it shouldn't be an issue. All of them support booting to some sort of software RAID.
-
@travisdh1 Yeah, it would be a Debian VM providing the SMB share (via Proxmox or xcp-ng) so MD RAID isn't an issue. Proxmox can use ZFS RAID 1 whilst xcp-ng can do standard MD RAID.
Edit: Dell even has that BOSS add-in system that allows for a RAID 1 bootable volume just for the OS. The NVMe drives could be VM storage only if I go that route.
-
@biggen NVMe storage is indeed ridiculously fast. When I say fast, think about its latency rather than throughput. In practice, its performance really shines with heavily used relational DBs. Doing RAID over the network with NVMe would require at least 25 GbE with RDMA support end-to-end and would work even better with an NVMe-oF initiator. Otherwise, network latency would be a bottleneck. However, for 4K video editing, 10 GbE end-to-end with SSD storage on the server should be sufficient.
There is a better alternative than interface bonding between a single file server and clients: SMB Multichannel, which uses multiple network interfaces for data transfers (clients need to have multiple NICs though). This way network bandwidth is aggregated with active-active paths rather than load balanced with active-passive. The downside is that SMB Multichannel only works reliably in an all-Windows environment; its Samba implementation is patchy. macOS doesn't support it at all AFAIK.
-
NVMe drives are the same price as SAS3 - with the same write endurance / manufacturer.
If you go Dell, because you want them holding your hand, you'll pay 2-3 times as much for the drives. That's just the way it is.
Consider that more than one person can access the fileserver at the same time. You can get away with 10GbE at the clients (bonding doesn't help at the client). At roughly 1 GB/sec per client, that means a 100 GB video file will take about 100 seconds to transfer.
However, you need more than that on the server, and your array needs to be able to handle more than 1 GB (gigabyte) per sec.
Most 10GbE switches have 40GbE ports as well. So a two port 40GbE NIC on the server will allow 8 streams of 1 GB/sec for a total of 8 GB/sec.
That means that your array needs to handle 8 GB/sec. You need a lot of drives if you're not going with NVMe drives to get that kind of performance.
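The sizing above can be sketched in a few lines. This uses the post's round figure of about 1 GB/sec of usable throughput per 10GbE client (the line rate is 1.25 GB/sec) and treats each client as one full-rate stream. The function names are just for illustration.

```python
# Back-of-the-envelope sizing using the post's round numbers.
# Assumes ~1 GB/s of usable throughput per 10GbE client.

CLIENT_GB_PER_S = 1.0

def transfer_seconds(file_gb: float, rate_gb_per_s: float = CLIENT_GB_PER_S) -> float:
    """Time to move one file to one client at the given rate."""
    return file_gb / rate_gb_per_s

def full_rate_clients(server_ports: int, server_port_gbps: int, client_gbps: int = 10) -> int:
    """How many client links the server uplinks can feed at full speed."""
    return (server_ports * server_port_gbps) // client_gbps

clients = full_rate_clients(2, 40)          # dual-port 40GbE NIC on the server
array_gb_per_s = clients * CLIENT_GB_PER_S  # what the array must sustain

print(f"100 GB file: about {transfer_seconds(100):.0f} s per client")
print(f"{clients} full-rate clients need about {array_gb_per_s:.0f} GB/s from the array")
```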
If you do a fileserver like this, skip the hypervisor completely and run it on bare metal. You'll lose a ton of performance otherwise.
Also, latency means nothing in your application. It's all about transfer rate.
So something like Debian on bare metal with md raid, and use 4TB or larger NVMe U.2 drives.
Go for a CPU with high base frequency. High I/O rates from NVMe drives will use quite a bit of CPU power. You don't need lots and lots of cores though. Go for drives with 1 DWPD for best value.
-
@taurex Thanks for that information. More to go over for me it seems!
@Pete-S I figure going Dell or HPE is the way to go for him. He needs to have a support contract behind something like this and it doesn't need to be me.
I hadn't considered uplinks of 40GbE+. Makes sense.
Skip the hypervisor, huh? I figured it would add a performance penalty but make backups so much easier. I don't even know how to perform bare metal backups on servers. Backing up the video files being worked on would be easy via a traditional Synology NAS (or custom built solution) but backing up the OS in the event that an update renders it broken would take some thought.
I assume Samba could keep up with 8GB/sec (assumes ~8 users all transferring at the same time) so long as the underlying storage is performant enough so Samba isn't waiting?
-
@Pete-S said in NVMe and RAID?:
If you do a fileserver like this, skip the hypervisor completely and run it on bare metal. You'll lose a ton of performance otherwise.
Agreed. This is one of those rare exceptions.
-
@biggen said in NVMe and RAID?:
I figured it would add a performance penalty but make backups so much easier.
It shouldn't. What do you need to grab.... one Samba config file and the SMB share? Hypervisor won't make backing that up any easier.
-
@scottalanmiller said in NVMe and RAID?:
@Pete-S said in NVMe and RAID?:
If you do a fileserver like this, skip the hypervisor completely and run it on bare metal. You'll lose a ton of performance otherwise.
Agreed. This is one of those rare exceptions.
I'm not sure about this claim? Maybe ten years ago.
The above solution I mentioned has the workloads virtualized. We've had no issues saturating a setup with IOPS or throughput by utilizing virtual machines.
It's all in the system configuration, OS tuning, and fabric putting it all together. Much like setting up a 6.2L boosted application, there's a lot of pieces to the puzzle.
EDIT: As a qualifier, we're an all Microsoft house. No VMware here.
-
@PhlipElder said in NVMe and RAID?:
@scottalanmiller said in NVMe and RAID?:
@Pete-S said in NVMe and RAID?:
If you do a fileserver like this, skip the hypervisor completely and run it on bare metal. You'll lose a ton of performance otherwise.
Agreed. This is one of those rare exceptions.
I'm not sure about this claim? Maybe ten years ago.
The above solution I mentioned has the workloads virtualized. We've had no issues saturating a setup with IOPS or throughput by utilizing virtual machines.
It's all in the system configuration, OS tuning, and fabric putting it all together. Much like setting up a 6.2L boosted application, there's a lot of pieces to the puzzle.
EDIT: As a qualifier, we're an all Microsoft house. No VMware here.
We're not talking about any fabric because we are talking about local NVMe storage. Data goes straight from the drive over the PCIe bus directly to the CPU.
For high performance I/O workloads the difference between virtualized and bare metal has increased, not decreased, because the amount of I/O you can generate has increased.
When everyone was running spinners and SAS, you couldn't generate enough I/O for the small overhead that virtualizing added to matter. A few percent at most.
As NVMe drives become faster and faster and CPUs have more and more PCIe lanes, it's not difficult to generate massive amounts of I/O. Then every little added overhead for each I/O operation becomes more and more noticeable. That's because the overhead becomes a larger percentage of the time, as the total time for the I/O operation becomes shorter.
That's why the bare metal cloud market has had massive growth the last three years or so. There is simply no way to compete with bare metal performance.
Typical bare metal server instances, such as the ones Oracle offers, run on all-NVMe local flash storage. They put 9 NVMe drives in each server. With high performance NVMe drives that's almost 20 gigabytes of data per second.
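To illustrate the overhead argument above: if virtualization adds a roughly fixed cost per I/O operation, its share of the total grows as the device itself gets faster. The latencies and the 10 microsecond overhead below are hypothetical round numbers chosen only to show the trend, not measurements of any real hypervisor or drive.

```python
# Share of each I/O spent on a fixed per-operation overhead.
# All latency figures are hypothetical round numbers for illustration.

def overhead_share(device_us: float, overhead_us: float) -> float:
    """Fraction of total I/O time that is overhead rather than device time."""
    return overhead_us / (device_us + overhead_us)

OVERHEAD_US = 10.0  # assumed fixed virtualization cost per I/O, in microseconds

for name, device_us in [("spinner", 5000.0), ("SAS SSD", 200.0), ("NVMe", 20.0)]:
    share = overhead_share(device_us, OVERHEAD_US)
    print(f"{name}: {share:.1%} of each I/O is overhead")
```

With these made-up inputs the same overhead goes from a fraction of a percent on a spinner to about a third of the total on NVMe, which is the shape of the effect described in the post.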
-
One of the first Dell servers with hot-swap NVMe was the R7415, so yeah:
https://www.dell.com/en-us/work/shop/povw/poweredge-r7415
Not sure what others have seen.