Cross Posting - RAID 0 with Hot Spare
Using an HPE RAID controller, like the Smart Array P841, can you assign a hot spare to a RAID 0 array? What happens when a disk fails?
I don't expect any data to survive the disk failure. What I'd like to know is whether the hot spare will immediately be added to the array and made available to the OS (Ubuntu 16.04) and application for writing again.
I plan on building a cluster of servers where the data is replicated between each node in the cluster. I say that to clarify that this use case doesn't require any redundancy at the disk level and I understand that losing a disk in RAID 0 will result in complete data loss. I just want the array to be available for writes again as quickly as possible without manual intervention.
For the curious and those that need specifics, I'm building the following:
Smart Array P841/P441 or maybe an H241 if that makes more sense for RAID 0
25 non-HPE SSDs (Micron D630 or Samsung SM863, haven't decided yet)
5x 4-disk RAID 0 with 5 hot spares, or some variation of that with fewer hot spares
Elasticsearch cluster acting as a SIEM
I appreciate your insight!
So first I'm guessing that this RAID 0 is for drive performance, and that Ubuntu 16.04 isn't installed to this drive.
Now the issue with a cluster of RAID 0s is that the data from one node is replicated to the other nodes.
So if you do lose a drive in the RAID, you lose the data everywhere.
The question that I have about this configuration is: what purpose does it serve to set up RAID 0 for data computing, with spare drives?
Why not just set up an SSD RAID 5 or RAID 10 if you really need performance?
This way you still protect the individual RAIDs and your data in the event of a failure.
His setup is not crazy, but what he wants to do isn't exactly possible. His end is possible, his means are not. Calling it a HOT spare is leading him astray.
Hot spare is 100% meaningless with RAID 0. When the disk fails on RAID 0, the array is gone. GONE. The hot spare would have nowhere to be added to.
Even if you did automate adding the hot spare to the remaining drive(s) and building a new RAID 0 array... it would be a NEW array. So your OS would need to add LVM to it, format it and mount it as a new array. So even if you could use a hot spare with RAID 0, which you can't, it would do you no good further up the stack because the array is totally lost already and you have to intervene to get something built again.
I understand your goal and the idea isn't bad, it's just not possible. Now, what IS possible, if you really want to do this and are set up as I imagine you are from the description (that is, assuming that this is a data-only array and not the OS, so the OS is always still intact), is this:
- Have a WARM spare (no such thing as a HOT spare with RAID 0, it just is nonsensical, it's a meaningless term to use in that case.) This will sit idly by.
- Have a monitoring script that looks for an array failure.
- Have the script talk to the Smart Array utility and disengage the dead drive.
- Have the script talk to the utility and add in the WARM spare to a new array group.
- Have the script build a new RAID 0 array.
- Have the script unmount the old array and remove any legacy bits of it.
- Detect the new array.
- LVM the new array.
- Format the new array.
- Mount the new array.
- Signal to the node that it is ready to proceed.
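For what it's worth, the steps above could be sketched roughly like this in Python. Treat it as a dry-run outline under assumptions, not a tested tool. The ssacli syntax (older installs call it hpssacli) and the status output format are written from memory of the HPE CLI and should be checked against your controller's documentation. The drive addresses, /dev/sdc, the esdata/data volume names, the mount point and the choice of ext4 are all made up for illustration. I also moved the unmount to the front, on the assumption that it is safer to release the dead array before touching the controller.

```python
import re
import subprocess

# Assumed line shape from `ssacli ctrl slot=N ld all show status`, e.g.
#   "logicaldrive 2 (1.7 TB, RAID 0): Failed"
# Verify this against real output from your controller before relying on it.
LD_STATUS = re.compile(r"logicaldrive\s+(\d+)\s+\(.*?\):\s*(.+)")


def failed_logical_drives(status_text):
    """Return the IDs of logical drives whose reported status is not OK."""
    failed = []
    for line in status_text.splitlines():
        match = LD_STATUS.search(line)
        if match and match.group(2).strip() != "OK":
            failed.append(int(match.group(1)))
    return failed


def build_recovery_plan(slot, failed_ld, drives, device, vg, lv, mount_point):
    """Build the ordered command list for replacing a dead RAID 0 array.

    `drives` is the surviving members plus the warm spare. `device` is the
    block device the new logical drive is expected to appear as; that is a
    guess the caller must confirm (the "detect the new array" step).
    """
    ctrl = ["ssacli", "ctrl", f"slot={slot}"]
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["umount", "-l", mount_point],                      # release the dead array
        ["vgremove", "-f", vg],                             # legacy LVM bits; may need more cleanup
        ctrl + ["ld", str(failed_ld), "delete", "forced"],  # drop the failed logical drive
        ctrl + ["create", "type=ld", "drives=" + ",".join(drives), "raid=0"],
        ["pvcreate", "-ff", "-y", device],                  # LVM the new array
        ["vgcreate", vg, device],
        ["lvcreate", "-l", "100%FREE", "-n", lv, vg],
        ["mkfs.ext4", "-F", lv_path],                       # format (ext4 is an arbitrary choice)
        ["mount", lv_path, mount_point],                    # mount; signal the node after this
    ]


def run_plan(plan, dry_run=True):
    """Print the commands, or execute them and stop at the first failure."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    sample = (
        "   logicaldrive 1 (1.7 TB, RAID 0): OK\n"
        "   logicaldrive 2 (1.7 TB, RAID 0): Failed\n"
    )
    for ld in failed_logical_drives(sample):
        plan = build_recovery_plan(
            slot=0,
            failed_ld=ld,
            drives=["1I:1:5", "1I:1:6", "1I:1:7", "1I:1:25"],  # hypothetical bays
            device="/dev/sdc",                                 # hypothetical
            vg="esdata",
            lv="data",
            mount_point="/var/lib/elasticsearch",
        )
        run_plan(plan, dry_run=True)
```

With dry_run=True it only prints the commands, so you can eyeball the plan before letting it loose on real hardware. The last step in the list (signalling the node) is left out because it depends on how the node is managed.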
So I think that this does what you want, but it is not a HOT spare and it is not the RAID controller doing the work, it is your own script. So your end result can be done, just not using the exact tool and location of tool that you were imagining. But having a spare on board and rebuilding the system is possible.
But most likely there will be a better way to skin this cat. But I can see a potential use case for this.