
    60k IOPS Spike

    • DonahueD
      Donahue
      last edited by

      Let me start by saying that I am rerunning the test with a longer duration to see whether my initial results are an anomaly. In the meantime, I would like some speculation. I ran Dell Live Optics a while back, twice. The first test ran for only 10 minutes and the second for 12 hours. Apparently I had only been looking at the results of the 10-minute test and didn't pay attention to one specific part of the 12-hour test.

      On the 12-hour test, there was an initial spike of IOPS from one of our datastores, an 8x250 RAID 10 array of SSDs. The spike lasted ~30 minutes and peaked at just over 60k read IOPS. It started right after the test was started. What can cause something like this? Can the test itself cause results this high?

      It may be a coincidence, but while this test was being done, our main database VM that sits on this array corrupted itself to the point that I had to restore from a backup from before this test. Could that somehow be responsible for the spike, or could the test have caused the corruption? I hit start right before leaving for the night and when I came in the next morning, the VM said it had no OS disk.

      I would upload a screenshot of the Live Optics results, but ML is giving me an upload error.

      • scottalanmillerS
        scottalanmiller
        last edited by

        If you upload to Imgur or a similar service and get a link, you can link the hosted image here for now until the plugin gets fixed.

        • scottalanmillerS
          scottalanmiller @Donahue
          last edited by

          @Donahue said in 60k IOPS Spike:

          It may be a coincidence, but while this test was being done, our main database VM that sits on this array corrupted itself to the point that I had to restore from a backup from before this test. Could that somehow be responsible for the spike, or could the test have caused the corruption? I hit start right before leaving for the night and when I came in the next morning, the VM said it had no OS disk.

          A newly loaded database can definitely cause some incredible spikes. It might have been re-indexing or loading into RAM during that time. Really intensive operations.

          The rule of thumb for a Live Optics run is two weeks.

          • DonahueD
            Donahue
            last edited by

            There is also a hard page fault spike that corresponds to the same time period; nothing else is on that graph. Maybe I will run a 2-week one after finishing this 24-hour one I started today.

            Looking into Imgur now.

            • scottalanmillerS
              scottalanmiller @Donahue
              last edited by

              @Donahue said in 60k IOPS Spike:

              There is also a hard page fault spike that corresponds to the same time period; nothing else is on that graph. Maybe I will run a 2-week one after finishing this 24-hour one I started today.

              Looking into Imgur now.

              Hard page faults must trigger IOPS, because each one is served by a read from disk; the two are linked.

              • DashrenderD
                Dashrender
                last edited by

                Isn't the test itself passive? Just monitoring the IOPS in use, not actually trying to cause the system to use high IOPS, right?

                • scottalanmillerS
                  scottalanmiller @Dashrender
                  last edited by

                  @Dashrender said in 60k IOPS Spike:

                  Isn't the test itself passive? Just monitoring the IOPS in use, not actually trying to cause the system to use high IOPS, right?

                  The theory is that the impact of the test is minimal.

                  • DonahueD
                    Donahue
                    last edited by Donahue

                    Edit: The button for URL pictures doesn't seem to work either.
                    https://imgur.com/GvyQjFR
                    https://imgur.com/5As19Pa
                    https://imgur.com/nKZDM5h

                    • scottalanmillerS
                      scottalanmiller @Donahue
                      last edited by

                      @Donahue said in 60k IOPS Spike:

                      Edit: The button for URL pictures doesn't seem to work either.
                      https://imgur.com/GvyQjFR
                      https://imgur.com/5As19Pa
                      https://imgur.com/nKZDM5h

                      Those aren't pictures, those are web pages. You have to link the image itself. When doing so, you don't need the image button; just the image link will do the trick.

                      • DonahueD
                        Donahue
                        last edited by

                        https://i.imgur.com/GvyQjFR.png
                        https://i.imgur.com/5As19Pa.png
                        https://i.imgur.com/nKZDM5h.png
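
The page links and the direct links posted in this thread differ only in the host (`i.imgur.com`) and a file extension. A minimal sketch of that rewrite follows. The function name `direct_image_url` is hypothetical, the `.png` default is an assumption taken from the three links above (other uploads may use a different extension), and album or gallery links are not handled.

```python
from urllib.parse import urlparse


def direct_image_url(page_url: str, extension: str = "png") -> str:
    """Turn an Imgur page link into a direct image link.

    Assumes a single-image link of the form https://imgur.com/<id>.
    The extension is a guess; it is not checked against the real file.
    """
    parsed = urlparse(page_url)
    if parsed.netloc not in ("imgur.com", "www.imgur.com"):
        raise ValueError(f"not an Imgur page link: {page_url}")

    image_id = parsed.path.strip("/")
    # Reject empty paths and multi-segment paths such as albums.
    if not image_id or "/" in image_id:
        raise ValueError(f"not a single-image link: {page_url}")

    return f"https://i.imgur.com/{image_id}.{extension}"


if __name__ == "__main__":
    for page in (
        "https://imgur.com/GvyQjFR",
        "https://imgur.com/5As19Pa",
        "https://imgur.com/nKZDM5h",
    ):
        print(direct_image_url(page))
```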

                        • DashrenderD
                          Dashrender @scottalanmiller
                          last edited by

                          @scottalanmiller said in 60k IOPS Spike:

                          @Dashrender said in 60k IOPS Spike:

                          Isn't the test itself passive? Just monitoring the IOPS in use, not actually trying to cause the system to use high IOPS, right?

                          The theory is that the impact of the test is minimal.

                          Is it even really a test, or simply monitoring? The use of the term 'test' implies to me that Live Optics itself is testing something. Is it? I seriously don't know.

                          • scottalanmillerS
                            scottalanmiller @Dashrender
                            last edited by

                            @Dashrender said in 60k IOPS Spike:

                            @scottalanmiller said in 60k IOPS Spike:

                            @Dashrender said in 60k IOPS Spike:

                            Isn't the test itself passive? Just monitoring the IOPS in use, not actually trying to cause the system to use high IOPS, right?

                            The theory is that the impact of the test is minimal.

                            Is it even really a test, or simply monitoring? The use of the term 'test' implies to me that Live Optics itself is testing something. Is it? I seriously don't know.

                            Just monitoring.

                            • DonahueD
                              Donahue
                              last edited by

                              I kind of think that the spike has thrown off the average so much that even the 95th percentile is wrong. But what would cause it to last for so long?
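
How much a spike moves the average versus the 95th percentile depends on what share of the samples it covers. The sketch below is a toy model, not Live Optics' actual calculation: it assumes one sample per minute, a flat spike at 60,000 IOPS for 30 minutes, and a made-up baseline of 2,000 IOPS (`BASELINE_IOPS` is purely an assumption). The nearest-rank percentile used here is one common definition; the tool may compute its figure differently.

```python
import statistics

BASELINE_IOPS = 2_000   # assumed steady-state load, not from the thread
SPIKE_IOPS = 60_000     # peak reported in the thread
SPIKE_MINUTES = 30      # spike duration reported in the first post


def percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    ordered = sorted(samples)
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def build_run(total_minutes, spike_minutes=SPIKE_MINUTES):
    """One sample per minute: a flat spike followed by flat baseline load."""
    return ([SPIKE_IOPS] * spike_minutes
            + [BASELINE_IOPS] * (total_minutes - spike_minutes))


if __name__ == "__main__":
    for label, minutes in (("1 hour", 60), ("12 hours", 720), ("7 days", 10_080)):
        run = build_run(minutes)
        share = SPIKE_MINUTES / minutes * 100
        print(f"{label:>8}: spike covers {share:5.2f}% of samples, "
              f"mean={statistics.mean(run):8.0f}, p95={percentile(run, 95)}")
```

Under these assumptions, a 30-minute spike covers about 4.2% of a 12-hour run, so it roughly doubles the mean but stays just above the 95th percentile cutoff. A real spike with a ramp-up, or a percentile computed over a different window, could behave differently, and a longer run dilutes the effect on the mean further.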

                              • scottalanmillerS
                                scottalanmiller @Donahue
                                last edited by

                                @Donahue said in 60k IOPS Spike:

                                I kind of think that the spike has thrown off the average so much that even the 95th percentile is wrong. But what would cause it to last for so long?

                                "So long" is only 20 minutes. That's nothing. I have spikes longer than that just for a patch cycle.

                                • DonahueD
                                  Donahue
                                  last edited by

                                  As far as I can tell, there was nothing going on during that time frame other than the test.

                                  • DonahueD
                                    Donahue
                                    last edited by

                                    Well, no spike showed up in the 24-hour test. I have started another one for 7 days, which is the longest option in the current version.
