
    How To Replace a Failed Drive in Hardware RAID

    • scottalanmiller

      Assumptions before we begin:

      • You have hardware RAID (SmartArray, LSI, Adaptec, PERC, MegaRAID, etc.)
      • You have hot swap (every enterprise server supports hot swap, but in some cases it is possible to configure a server without it.)

      What we have here is called Blind Hot Swap. It is not intrinsic to hardware RAID, but because all manufacturers implement it this way, we get to treat the two as going together. This makes our lives very easy. This how-to applies to essentially all standard servers configured normally.

      Once a drive has failed, this is what we do, and this is all that we do:

      1. Identify the failed drive, normally from a light indicator on the front of the drive slot
      2. Get a replacement drive
      3. Remove the failed drive
      4. Insert the replacement drive into the same slot
      5. Wait for the lights to tell you that all is healthy
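
      The steps above can be pictured as a small state model. The sketch below is only an illustration of the sequence, not a tool that talks to any real controller: the class `BlindSwapArray`, its slot statuses, and the helper `replace_failed_drive` are all hypothetical names made up for this example. It assumes the controller starts the rebuild on its own as soon as a drive is inserted into the empty slot, which is the blind swap behavior described here.

```python
from enum import Enum


class ArrayState(Enum):
    OPTIMAL = "optimal"
    DEGRADED = "degraded"
    REBUILDING = "rebuilding"


class BlindSwapArray:
    """Toy model of a hardware RAID array with blind hot swap (hypothetical)."""

    def __init__(self, slots):
        # Each slot is "ok", "failed", "empty", or "rebuilding".
        self.slots = {n: "ok" for n in range(slots)}

    def state(self):
        statuses = set(self.slots.values())
        if statuses == {"ok"}:
            return ArrayState.OPTIMAL
        if "rebuilding" in statuses:
            return ArrayState.REBUILDING
        return ArrayState.DEGRADED

    def fail(self, slot):
        self.slots[slot] = "failed"

    def failed_slots(self):
        # Stands in for reading the fault light on each drive slot.
        return sorted(n for n, s in self.slots.items() if s == "failed")

    def remove(self, slot):
        self.slots[slot] = "empty"

    def insert(self, slot):
        # Assumption: the controller begins the rebuild with no operator input.
        if self.slots[slot] != "empty":
            raise ValueError("slot must be empty before inserting a drive")
        self.slots[slot] = "rebuilding"

    def finish_rebuild(self):
        # Stands in for waiting until the lights report a healthy array.
        for slot, status in self.slots.items():
            if status == "rebuilding":
                self.slots[slot] = "ok"


def replace_failed_drive(array):
    """Walk the five steps and return the array state after each action."""
    failed = array.failed_slots()
    if not failed:
        raise ValueError("no failed drive to replace")
    slot = failed[0]                  # step 1: identify the failed drive
    history = [array.state()]
    array.remove(slot)                # steps 2-3: fetch spare, pull failed drive
    history.append(array.state())
    array.insert(slot)                # step 4: same slot, rebuild starts itself
    history.append(array.state())
    array.finish_rebuild()            # step 5: wait for healthy lights
    history.append(array.state())
    return history
```

      Note that nothing in the model calls into an operating system or hypervisor; the only operator actions are pulling one drive and inserting another.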

      At no point should we need access to the hypervisor or operating system. We need nothing except the replacement parts and physical access to the server itself. In large enterprises this process is often performed by datacenter staff rather than systems administrators, as it is purely a hardware task and requires no IT knowledge or interaction. The system identifies what is wrong and handles all of the repair on its own.

      Absolutely do not power down a system while it has a failed drive. This puts undue stress on the RAID array and increases risk.

      A server that is repairing (resilvering) its array can be used as normal; it should operate as normal, only more slowly. However, while the array is in use it will not resilver at optimum speed. If you wish to speed up the repair, reduce the workload on the RAID array as much as possible.
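
      To get a feel for why reducing workload helps, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple linear model in which whatever share of the array's I/O the production workload consumes is unavailable to the rebuild. Real controllers do not behave this neatly, and the function name, parameters, and example numbers are all hypothetical.

```python
def estimate_rebuild_hours(drive_tb, idle_rate_mb_s, workload_io_share):
    """Rough rebuild time under an assumed linear slowdown model.

    drive_tb:           capacity of the replaced drive in TB (decimal)
    idle_rate_mb_s:     assumed rebuild rate with no other load, in MB/s
    workload_io_share:  fraction of array I/O used by normal work, 0 <= x < 1
    """
    if not 0 <= workload_io_share < 1:
        raise ValueError("workload_io_share must be in [0, 1)")
    # Assumption: the rebuild only gets the I/O the workload leaves free.
    effective_rate = idle_rate_mb_s * (1 - workload_io_share)
    drive_mb = drive_tb * 1_000_000
    return drive_mb / effective_rate / 3600


# Hypothetical figures: a 4 TB drive that rebuilds at 200 MB/s when idle.
idle_hours = estimate_rebuild_hours(4, 200, 0.0)
busy_hours = estimate_rebuild_hours(4, 200, 0.5)
print(f"idle: {idle_hours:.1f} h, half loaded: {busy_hours:.1f} h")
```

      Under this model, a workload that takes half of the array's I/O doubles the rebuild time, which is the trade-off described above.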

        • Lakshmana

        I have done this process for one server.
