    Hot Swap vs. Blind Swap

    Tags: storage, raid, hot swap, blind swap, cold swap
    • drewlanderD
      drewlander @BRRABill
      last edited by

      @BRRABill said:

      My drive failed almost immediately. I mean, whatever happened rebooted the server.

      Go right ahead. Did that drive fail after replacement while it was in a degraded state? I'd say your controller is failing if that happened.

      On a side note, I pretty much only use RAID 1 mirror w 1 hot spare (3 disks total) these days in what I do. The apps I deal with and code for are (mostly) OLTP with tons of tiny write transactions. Using a small stripe size and only two disks, this setup benchmarks 13x faster write speeds for me than a RAID 5 array with 4 disks, all day, according to AS SSD. The way we coded our software and designed the database, everything uses GUIDs for PKs. GoDaddy Premium DNS provides round-robin load balancing (I don't manage that part). In ProLiant servers (DL360 G7, for example) I like to install both backplane kits and split the RAID 1 mirror between backplanes. This is just to show, as an example, that there's really not a one-size-fits-all solution for server configurations and redundancy. The software I develop (or run) dictates what I am able to do with the hardware.

      • scottalanmillerS
        scottalanmiller @drewlander
        last edited by

        @drewlander said:

        On a side note, I pretty much only use RAID 1 mirror w 1 hot spare (3 disks total) these days in what I do.

        Never use a hot spare with RAID 1 unless your controller really lacks basic functionality. Instead go to a triple mirrored RAID 1. This is far safer than RAID 1 with a hot spare because instead of needing to rebuild while lacking mirroring the data is always hot and ready AND you get a 50% read performance boost for the life of the array. So faster and safer, no downsides.
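To make the safety argument concrete, here is a minimal sketch that counts how many in-sync copies of the data remain immediately after one drive fails. The function name and the idea of counting "live copies" are illustrative assumptions, not something a controller reports in this form.

```python
def surviving_copies(active_mirrors: int, failed: int) -> int:
    """Live, in-sync copies left right after `failed` active members die."""
    return max(active_mirrors - failed, 0)


# RAID 1 two-way mirror plus one idle hot spare: only two disks hold data.
two_way_plus_spare = surviving_copies(active_mirrors=2, failed=1)

# RAID 1 three-way mirror: all three disks hold data.
three_way = surviving_copies(active_mirrors=3, failed=1)

print(two_way_plus_spare)  # 1: unprotected until the spare finishes rebuilding
print(three_way)           # 2: still mirrored, no rebuild needed to stay protected
```

Both layouts use three disks; the difference is whether the third disk already holds the data when the failure happens.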

        • JaredBuschJ
          JaredBusch @Jason
          last edited by

          @Jason said:

          @BRRABill said:

          P.S. If anyone can read that, and it DOESN'T say good luck, please don't let me know. 🙂

          @JaredBusch might know.

          The Japanese meaning for that is spring when used by itself. Compounded with other kanji, the meaning could change.

          Chinese reads the kanji differently. No idea on that.

          • BRRABillB
            BRRABill @drewlander
            last edited by

            @drewlander said:

            Go right ahead. Did that drive fail after replacement while it was in a degraded state? I'd say your controller is failing if that happened.

            There were 4 drives.

            1 2 3 4

            2 was degraded/failed. I took it out, and put in a fresh one. The server then rebooted, and both 1 and 2 showed up as failed when it came back up.

            • scottalanmillerS
              scottalanmiller @BRRABill
              last edited by

              @BRRABill said:

              @drewlander said:

              Go right ahead. Did that drive fail after replacement while it was in a degraded state? I'd say your controller is failing if that happened.

              There were 4 drives.

              1 2 3 4

              2 was degraded/failed. I took it out, and put in a fresh one. The server then rebooted, and both 1 and 2 showed up as failed when it came back up.

              It would have rebooted because the other drive failed or else it means that the server had failed on its own.

              • BRRABillB
                BRRABill
                last edited by

                No matter. I am on a new machine now with a new drive. Neither server grade, but all temporary. Probably safer.

                All to be written up some day soon. I had to go into work today on my day off (with the two kids in tow who LOVED IT (for real)) for non-IT stuff.

                I'm now having beer and watching the Jets/Bills game.

                • drewlanderD
                  drewlander @scottalanmiller
                  last edited by

                  @scottalanmiller said:

                  @drewlander said:

                  On a side note, I pretty much only use RAID 1 mirror w 1 hot spare (3 disks total) these days in what I do.

                  Never use a hot spare with RAID 1 unless your controller really lacks basic functionality. Instead go to a triple mirrored RAID 1. This is far safer than RAID 1 with a hot spare because instead of needing to rebuild while lacking mirroring the data is always hot and ready AND you get a 50% read performance boost for the life of the array. So faster and safer, no downsides.

                  That's a pretty strong leading sentence. I want that spare inactive because the servers run SSD. Also I am not sure the gains on reads would be worth the hit on writes in an OLTP app that processes high volume micro transactions. We both know HDDs read faster than they write, and reads are not generally where people suffer with disk I/O issues (at least not in what I do). I'd be happy to try it and compare random writes on a RAID1 3-way mirror vs. RAID1 2 disk mirror, but I don't think I even need to do that to know 3x random writes takes longer than 2x random writes. Rebuild in degraded mode would be slower, but I would sooner prefer generally faster transactions with a day of slow rebuilding over a generally slower application from day to day.

                  😜

                  -d

                  • scottalanmillerS
                    scottalanmiller @drewlander
                    last edited by

                    @drewlander said:

                    That's a pretty strong leading sentence.

                    It should be. Hot spares use all of the electrical and HVAC and incur all of the cost of having a full mirror but give up the performance boost and carry the resilver time that are unnecessary. It's two different uses of the same resources. It's the same rule, more or less, as to why you always do RAID 6 rather than RAID 5 plus hot spare. Faster, safer, same cost and same capacity.
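The "same cost and same capacity" part of the RAID 6 comparison can be checked with simple arithmetic. The sketch below assumes equal-sized disks and covers only capacity and fault tolerance, not speed; the `Layout` type and both helper functions are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    total_disks: int         # every disk that is powered and paid for
    data_disks: int          # disks' worth of usable capacity
    failures_tolerated: int  # failures survivable without waiting for a rebuild


def raid5_with_spare(total_disks: int) -> Layout:
    members = total_disks - 1  # one disk sits idle as the hot spare
    return Layout("RAID 5 + hot spare", total_disks, members - 1, 1)


def raid6(total_disks: int) -> Layout:
    return Layout("RAID 6", total_disks, total_disks - 2, 2)


for layout in (raid5_with_spare(6), raid6(6)):
    print(layout)
```

With six disks, both layouts yield four disks of usable space, but only RAID 6 keeps the array protected after the first failure.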

                    • scottalanmillerS
                      scottalanmiller @drewlander
                      last edited by

                      @drewlander said:

                      Also I am not sure the gains on reads would be worth the hit on writes in an OLTP app that processes high volume micro transactions. We both know HDDs read faster than they write, and reads are not generally where people suffer with disk I/O issues (at least not in what I do).

                      Regardless of where they suffer, the reads are 50% faster. This means both that any read operation is simply that much faster (unless you are saying that you literally have no bottleneck on your storage at all, which seems very unlikely) and read operations take less time leaving more available time for write transactions so that there is less opportunity for contention.

                      It is certainly not as beneficial as if writes were faster too, but it is a non-trivial boost in read performance while getting better safety as well. Read performance also aids write performance in any case where both need to be performant, which, because of how writes work, is nearly always true to some extent.
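One rough way to picture the contention argument is a per-drive load model. It assumes reads are spread evenly across mirror members while every write lands on every member, and the workload numbers are hypothetical, not measurements from this thread.

```python
def member_utilization(read_iops: float, write_iops: float,
                       mirrors: int, member_iops: float) -> float:
    """Fraction of one mirror member's IOPS budget that is consumed.

    Assumes reads are balanced across members and each write hits all of them.
    """
    per_member_load = read_iops / mirrors + write_iops
    return per_member_load / member_iops


# Hypothetical workload: 6000 reads/s, 3000 writes/s, 10000 IOPS per drive.
two_way = member_utilization(6000, 3000, mirrors=2, member_iops=10000)
three_way = member_utilization(6000, 3000, mirrors=3, member_iops=10000)

print(two_way, three_way)
```

Under these assumptions each drive in the three-way mirror is less busy, which leaves more headroom for the same write stream.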

                      • scottalanmillerS
                        scottalanmiller @drewlander
                        last edited by

                        @drewlander said:

                        I want that spare inactive because the servers run SSD.

                        You left out why you feel this would alter the rule of thumb. Is it because you feel that your write transactions are so heavy that you are looking at killing the SSDs through writes?

                        • scottalanmillerS
                          scottalanmiller @drewlander
                          last edited by

                          @drewlander said:

                          I'd be happy to try it and compare random writes on a RAID1 3-way mirror vs. RAID1 2 disk mirror, but I don't think I even need to do that to know 3x random writes takes longer than 2x random writes. Rebuild in degraded mode would be slower, but I would sooner prefer generally faster transactions with a day of slow rebuilding over a generally slower application from day to day.

                          RAID 1 Two Way is: 2x Read Speed, 1x Write Speed
                          RAID 1 Three Way is: 3x Read Speed, 1x Write Speed

                          This is very basic RAID math, there is literally zero write penalty by adding more mirrored disks - that's why this is an "always" statement. There are no caveats, only benefits. This isn't one of those "likely tradeoff" situations. Just as RAID 1 double has zero write penalty over a single disk, triple or quadruple or whatever has no penalty over it either.

                          https://www.storagecraft.com/blog/raid-performance/
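The rule of thumb above can be written as a tiny function. This is the idealized textbook model (reads scale with member count, writes stay at single-disk speed), and the function name is an illustrative choice; real controllers may fall short of the ideal read scaling.

```python
from typing import Tuple


def raid1_multipliers(mirrors: int) -> Tuple[int, int]:
    """Return (read_multiplier, write_multiplier) relative to a single disk."""
    if mirrors < 2:
        raise ValueError("RAID 1 needs at least two members")
    return mirrors, 1


for n in (2, 3, 4):
    reads, writes = raid1_multipliers(n)
    print(f"RAID 1 {n}-way: {reads}x read, {writes}x write")

# Going from two-way to three-way: 3x / 2x = 1.5, i.e. the 50% read boost.
read_gain = raid1_multipliers(3)[0] / raid1_multipliers(2)[0]
print(read_gain)
```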

                          • drewlanderD
                            drewlander @scottalanmiller
                            last edited by drewlander

                            @scottalanmiller said:

                            literally zero write penalty

                            Sorry about the late response but I had some deadlines to meet and couldn't be distracted. In response:

                            Yes, I am concerned about disk endurance. That is why the disk exists at all.

                            I will agree that there is no write penalty if the cache does not get backlogged. The write cache is bypassing the write-through process where the disks tell the host that the write is complete. The cache still has to write the data to all the disks. Unless you can show me how an inequality of 3 < 2 is true, it is slower to write to three disks than two. I would surmise then with three disks the cache can get backlogged faster than with two disks because it has to deal with 33% more writes before dumping that data from cache, which is entirely plausible in a high volume random write environment like OLTP systems.

                            Now since you got me thinking about this, it has brought something to my attention that might be pretty important. The disk read FIFO queue and disk write FIFO queue are not necessarily in sync because the queues are not combined. This is true with or without a cache present, but the negative implications could be much more prevalent with a cache present. When I commit a write transaction that gets stuck in cache and a read request is sent immediately after to retrieve that data, then it is theoretically possible the data I am expecting might not exist on the disk yet because it's still in cache. Yikes!
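For what it's worth, the read-after-write worry goes away if the cache answers reads for blocks it has not flushed yet. The toy model below assumes exactly that behavior; the class and its methods are hypothetical, and whether a particular controller works this way is something to confirm in its documentation.

```python
from typing import Dict, Optional


class WriteBackCache:
    """Toy write-back cache in front of a slow backing store."""

    def __init__(self) -> None:
        self.disk: Dict[int, bytes] = {}     # blocks already flushed
        self.pending: Dict[int, bytes] = {}  # acknowledged but not yet on disk

    def write(self, block: int, data: bytes) -> None:
        # Acknowledge immediately; the data only lives in the cache for now.
        self.pending[block] = data

    def read(self, block: int) -> Optional[bytes]:
        # Check unflushed writes first so a read never returns stale disk data.
        if block in self.pending:
            return self.pending[block]
        return self.disk.get(block)

    def flush(self) -> None:
        # Move every pending block to the backing store.
        self.disk.update(self.pending)
        self.pending.clear()


cache = WriteBackCache()
cache.write(7, b"order-123")
print(cache.read(7))    # served from cache, before any flush
print(7 in cache.disk)  # False: not on disk yet
cache.flush()
print(cache.disk[7])    # now on disk
```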

                            I guess the point is, the configuration is entirely circumstantial. If I was serving web pages all day and not storing tons of micro data, then faster reads would be useful.

                            • MattSpellerM
                              MattSpeller @drewlander
                              last edited by MattSpeller

                              @drewlander said:

                              I will agree that there is no write penalty if the cache does not get backlogged. The write cache is bypassing the write-through process where the disks tell the host that the write is complete. The cache still has to write the data to all the disks. Unless you can show me how an inequality of 3 < 2 is true, it is slower to write to three disks than two. I would surmise then with three disks the cache can get backlogged faster than with two disks because it has to deal with 33% more writes before dumping that data from cache, which is entirely plausible in a high volume random write environment like OLTP systems.

                              Initial throat clearing: asking questions to learn more

                              Why would writing out to infinite disks in RAID1 be different than one disk?

                              Why would write cache be any different on the HDD (make it clog faster?) between 2 and 3 drives?

                              • drewlanderD
                                drewlander @MattSpeller
                                last edited by

                                @MattSpeller Based on what you just said, I finally understand where @scottalanmiller is coming from. I concede that the writes to the disks are simultaneous and therefore should not backlog the cache, with the exception that queues and seeks will be limited by your slowest disk.

                                • scottalanmillerS
                                  scottalanmiller @drewlander
                                  last edited by

                                  @drewlander said:

                                  Yes, I am concerned about disk endurance. That is why the disk exists at all.

                                  Okay, that would make sense as a concern. Do you plan to do enough writes for that to be a problem? Normally SSDs even under decent load are looking at decades of writes before writes become an issue.
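A back-of-the-envelope check on endurance is to divide a drive's rated terabytes written (TBW) by the daily write volume. The figures below are hypothetical placeholders; the real inputs would be the TBW rating from the drive's data sheet and the measured write rate of the workload.

```python
def years_until_rated_endurance(tbw_rating_tb: float,
                                writes_gb_per_day: float) -> float:
    """Years until cumulative writes reach the drive's rated TBW."""
    if writes_gb_per_day <= 0:
        raise ValueError("writes_gb_per_day must be positive")
    days = tbw_rating_tb * 1000 / writes_gb_per_day  # 1 TB = 1000 GB
    return days / 365


# Hypothetical drive rated for 1200 TBW taking 100 GB of writes per day.
print(round(years_until_rated_endurance(1200, 100), 1))  # 32.9
```

In a mirror every member receives the same writes, so the estimate applies to each active drive individually.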

                                  • scottalanmillerS
                                    scottalanmiller @drewlander
                                    last edited by

                                    @drewlander said:

                                    I will agree that there is no write penalty if the cache does not get backlogged. The write cache is bypassing the write-through process where the disks tell the host that the write is complete. The cache still has to write the data to all the disks. Unless you can show me how an inequality of 3 < 2 is true, it is slower to write to three disks than two.

                                    RAID 1 is mirrored, all disks are written simultaneously, not sequentially (unless your RAID controller is that crappy in which case you have other issues.) All disks write together, they don't sit around idle waiting for the others to finish.
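The difference between simultaneous and sequential mirror writes comes down to max versus sum. This sketch uses hypothetical per-drive latencies and a made-up function name to show why a third member adds nothing when writes are issued together, and why the slowest member sets the pace.

```python
from typing import List


def mirrored_write_latency(member_latencies_ms: List[float],
                           parallel: bool = True) -> float:
    """Time until every mirror member has completed one write."""
    if not member_latencies_ms:
        raise ValueError("need at least one member")
    if parallel:
        # All members start together, so the slowest one finishes last.
        return max(member_latencies_ms)
    # A controller that wrote one member after another would pay the sum.
    return sum(member_latencies_ms)


two_way = [0.20, 0.21]          # hypothetical per-drive write latencies in ms
three_way = [0.20, 0.21, 0.19]

print(mirrored_write_latency(two_way), mirrored_write_latency(three_way))
print(mirrored_write_latency(two_way, parallel=False),
      mirrored_write_latency(three_way, parallel=False))
```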

                                    • scottalanmillerS
                                      scottalanmiller @drewlander
                                      last edited by

                                      @drewlander said:

                                      I would surmise then with three disks the cache can get backlogged faster than with two disks because it has to deal with 33% more writes before dumping that data from cache, which is entirely plausible in a high volume random write environment like OLTP systems.

                                      Cache concerns remain the same, the cache would contain one copy of the data to be written no matter how many mirror members there are and would send a copy to each mirror member at the same time - from a write cache perspective, RAID 1 looks like a single drive - one copy stored, one copy sent. Speed looks identical to a single drive.
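The "one copy stored, one copy sent" point can be sketched as a toy cache whose buffer size does not depend on the number of mirror members. The class is hypothetical, and its flush loop runs sequentially only because this is plain Python; a real controller would issue the member writes concurrently.

```python
from typing import Dict, List


class MirrorWriteCache:
    """Toy write cache that fans each buffered block out to every member."""

    def __init__(self, mirrors: int) -> None:
        self.members: List[Dict[int, bytes]] = [{} for _ in range(mirrors)]
        self.buffer: Dict[int, bytes] = {}  # one copy per block, not one per member

    def write(self, block: int, data: bytes) -> None:
        self.buffer[block] = data

    def cached_bytes(self) -> int:
        return sum(len(data) for data in self.buffer.values())

    def flush(self) -> None:
        # Send the single cached copy to each member, then drop it from cache.
        for block, data in self.buffer.items():
            for member in self.members:
                member[block] = data
        self.buffer.clear()


two_way, three_way = MirrorWriteCache(2), MirrorWriteCache(3)
for cache in (two_way, three_way):
    cache.write(1, b"x" * 4096)

print(two_way.cached_bytes(), three_way.cached_bytes())  # same cache footprint
```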

                                      • scottalanmillerS
                                        scottalanmiller @drewlander
                                        last edited by

                                        @drewlander said:

                                        I guess the point is, the configuration is entirely circumstantial. If I was serving web pages all day and not storing tons of micro data, then faster reads would be useful.

                                        It is purely a theoretical case where reads are not useful. There are pure read systems, there are mixed use systems, and there are write heavy systems, but there really aren't any real world use cases for a pure write system. You are always reading the data sometime.

                                        • drewlanderD
                                          drewlander @scottalanmiller
                                          last edited by

                                          @scottalanmiller said:

                                          Cache concerns remain the same

                                          Any experience with LSI FastPath using SSD arrays as cache on the front end for a larger array of SAS spinning drives on the back end? LSI says it's super fast, but it is their job to tell me that. I have been debating implementing one server like this to see how it goes.

                                          • scottalanmillerS
                                            scottalanmiller
                                            last edited by

                                            Windows software solution, AFAIK it's dead now that virtualization has taken over. It was some bizarre and foolish approach using software to control the hardware. Bad idea.

                                            0_1448057461606_fastpath.png

                                            The CacheCade approach was much more sound.
