RAID10 - Two Drive Failure
-
@Dashrender said in RAID10 - Two Drive Failure:
@gjacobse said in RAID10 - Two Drive Failure:
@wirestyle22 said in RAID10 - Two Drive Failure:
@gjacobse said in RAID10 - Two Drive Failure:
@JaredBusch said in RAID10 - Two Drive Failure:
Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.
In my experience, you never replace more than one drive at a time... Ask me how I know.
That's very interesting. I have not really had to deal with drive failures actually.
I haven't kept up with the number of drive failures I have had over nearly 30 years. In all of my personal systems, I think I have had one. Work related, maybe all of three.
Wow, that's pretty small.
Personally, I've probably lost 3-4 drives. In businesses - well over 10.
And Scott has probably seen hundreds fail. Of course it all boils down to how many systems you see/support.
Your footprint is pretty small, though, so I think that's a pretty big number for you.
-
Predictive failure is usually a report from SMART about the drives. It means that something isn't kosher.
-
@Dashrender said in RAID10 - Two Drive Failure:
Wow, that's pretty small.
Personally, I've probably lost 3-4 drives. In businesses - well over 10.
And Scott has probably seen hundreds fail. Of course it all boils down to how many systems you see/support.
We have one or two fail every two to three months. Nothing crazy.
-
@wirestyle22 said in RAID10 - Two Drive Failure:
Your footprint is pretty small, though, so I think that's a pretty big number for you.
Remember, I used to be a consultant like JB, so I have a larger exposure than a single-man SMB person.
-
@Dashrender said in RAID10 - Two Drive Failure:
Remember, I used to be a consultant like JB, so I have a larger exposure than a single-man SMB person.
Yeah, I was thinking of where you currently work. My bad.
-
@coliver said in RAID10 - Two Drive Failure:
We have one or two fail every two to three months. Nothing crazy.
How many drives do you have?
-
@Dashrender said in RAID10 - Two Drive Failure:
How many drives do you have?
A few hundred for now. Should be under 100 at the end of summer.
-
@coliver said in RAID10 - Two Drive Failure:
A few hundred for now. Should be under 100 at the end of summer.
What are you guys changing to reduce that number by that much?
-
@wirestyle22 said in RAID10 - Two Drive Failure:
What are you guys changing to reduce that number by that much?
Higher density drives.
-
@coliver said in RAID10 - Two Drive Failure:
A few hundred for now. Should be under 100 at the end of summer.
So you're losing around 1.5% of your drives per year... That seems a bit high, but my memory of the norm as published by Google could be off. Plus, your environment might not be as good as theirs.
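For anyone who wants to check that kind of estimate, here is a rough sketch of the arithmetic. The drive count of 300 is purely an assumption standing in for "a few hundred", and the function name is made up for illustration. It just scales "one or two failures every two to three months" to a yearly rate.

```python
def annualized_failure_rate(failures, months, drive_count):
    """Return the fraction of drives expected to fail per year."""
    # Scale the observed failures to a 12-month window, then divide by fleet size.
    failures_per_year = failures * (12 / months)
    return failures_per_year / drive_count


# Assumption: "a few hundred" drives is taken as 300 for illustration only.
DRIVES = 300

# Best case from the thread: one failure every three months.
low = annualized_failure_rate(1, 3, DRIVES)
# Worst case from the thread: two failures every two months.
high = annualized_failure_rate(2, 2, DRIVES)

print(f"Estimated annual failure rate: {low:.1%} to {high:.1%}")
```

With those assumed inputs, the range runs from a bit over 1% to 4% per year, so the result depends heavily on how many drives "a few hundred" really is.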
-
@wirestyle22 said in RAID10 - Two Drive Failure:
@JaredBusch said in RAID10 - Two Drive Failure:
individual resilver.
Does this mean that the mileage is only applied to the new drive or it's just minimal in relation to the rest of the raid? Reason I ask is I always thought this put a lot of strain on the entire raid.
WTF? This is nothing more than a single mirror pair. The "strain" here is only a copy operation. The least possible work.
The point of individual is because something like this is processed 100% by the CPU on the RAID card. So don't make it do more than one thing at a time.
A parity array is different.
-
@DustinB3403 said in RAID10 - Two Drive Failure:
@aaronstuder What raid controller do you have?
Exactly this. A real SMB system should be hot-plug. But we have no idea what you bought.
-
@gjacobse said in RAID10 - Two Drive Failure:
@JaredBusch said in RAID10 - Two Drive Failure:
Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.
In my experience, you never replace more than one drive at a time... Ask me how I know.
I solemnly swear that I've pulled the wrong drive to replace before. It made a RAID 6 rebuild take a lot longer, and a RAID 10 freak out until a reboot happened. Restoring from backup was always an option, at least.
-
@aaronstuder said in RAID10 - Two Drive Failure:
Drives 1 and 3 are in "predictive failure"; I am assuming the pairs are 0+1 and 2+3.
Why?
-
@gjacobse said in RAID10 - Two Drive Failure:
@JaredBusch said in RAID10 - Two Drive Failure:
Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.
In my experience, you never replace more than one drive at a time... Ask me how I know.
In RAID 10, you always do if they are in different RAID 1 sets, always.
-
@scottalanmiller said in RAID10 - Two Drive Failure:
@aaronstuder said in RAID10 - Two Drive Failure:
Drives 1 and 3 are in "predictive failure"; I am assuming the pairs are 0+1 and 2+3.
Why?
Why what? Assuming? Because he did not document and most hardware RAID controllers are not accessible except during the boot process.
-
@scottalanmiller said in RAID10 - Two Drive Failure:
@gjacobse said in RAID10 - Two Drive Failure:
@JaredBusch said in RAID10 - Two Drive Failure:
Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.
In my experience, you never replace more than one drive at a time... Ask me how I know.
In RAID 10, you always do if they are in different RAID 1 sets, always.
I completely disagree. Reason stated above. This is a predictive failure, not a failure. You will get a faster resilver of each mirror by doing them individually.
Of course I am assuming that the unit is in use and busy with normal system read/writes.
-
@JaredBusch said in RAID10 - Two Drive Failure:
I completely disagree. Reason stated above. This is a predictive failure, not a failure. You will get a faster resilver of each mirror by doing them individually.
Doesn't this put twice the amount of mileage on the array, though? Or no?
-
@JaredBusch said in RAID10 - Two Drive Failure:
Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.
With mirrored RAID, even the slowest RAID card won't feel the load of a straight copy. As long as they are not in the same RAID set, it'll be fastest and safest to do both at once.
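The "not in the same RAID set" condition is easy to get wrong when the pairing is only assumed, so here is a minimal sketch of the check. The pair layouts below are assumptions (the thread's guessed 0+1 and 2+3, plus one alternative), and the function name is made up for illustration; the real mapping has to come from the controller. It only tests the pairing condition and says nothing about whether doing both at once is the better call.

```python
def safe_to_replace_together(mirror_pairs, drives_to_replace):
    """Return True if no mirror pair would lose more than one member."""
    to_replace = set(drives_to_replace)
    # A pair stays readable only if at least one of its members is left in place.
    return all(len(to_replace & set(pair)) <= 1 for pair in mirror_pairs)


# Assumption from the thread: pairs are 0+1 and 2+3 (not confirmed by the controller).
assumed_pairs = [(0, 1), (2, 3)]
print(safe_to_replace_together(assumed_pairs, [1, 3]))   # True: one drive per pair

# Hypothetical alternative layout where drives 1 and 3 mirror each other.
alternate_pairs = [(0, 2), (1, 3)]
print(safe_to_replace_together(alternate_pairs, [1, 3]))  # False: both halves of one mirror
```

If the second layout turned out to be the real one, pulling drives 1 and 3 together would take out an entire mirror, which is why confirming the pairing matters before touching anything.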
-
@wirestyle22 said in RAID10 - Two Drive Failure:
@JaredBusch said in RAID10 - Two Drive Failure:
individual resilver.
Does this mean that the mileage is only applied to the new drive or it's just minimal in relation to the rest of the raid? Reason I ask is I always thought this put a lot of strain on the entire raid.
RAID 10 does not strain anything during a resilver, and the resilver operation only happens to a subset of the array; the overall array doesn't even know that it is happening.