    AWS Catastrophic Data Loss

      DustinB3403 @PhlipElder

      @PhlipElder said in AWS Catastrophic Data Loss:

      Most of our clients have had 100% up-time across solution sets for years and in some cases we're coming up on decades.

      Really, decades of uptime? Not a single bad RAM module, RAID failure, or CPU, PSU, or MB issue? No site issues (fire, earthquake, tornado, etc.) in all that time?

      @PhlipElder said in AWS Catastrophic Data Loss:

      Cloud can't touch that. Period.

      You're full of it.

        PhlipElder @DustinB3403

        I'm quite proud of our record. It's a testament to the amount of time and money put into researching, proofing, and thrashing the solution sets we've sold over the years. We don't sell anything we haven't proofed first.

          DustinB3403 @PhlipElder

          So you're using technology that is at least a decade old for every one of your customers, because by your own words you can't possibly have had the time to test anything from this year and sell it to a customer!

            dbeato @PhlipElder

            @PhlipElder said in AWS Catastrophic Data Loss:

            @dbeato said in AWS Catastrophic Data Loss:

            @PhlipElder said in AWS Catastrophic Data Loss:

            @Dashrender said in AWS Catastrophic Data Loss:

            @BRRABill said in AWS Catastrophic Data Loss:

            because the chances that MS's DC is going to blow up is extremely small

            And yet, it is what this thread is about ... exactly that happening.

            Except that it's Amazon, not MS.

            MS was US Central this year or late last.

            MS was the world when their authentication mechanism went down I think it was a year or so ago.

            MS was Europe offline with VMs hosed and a recovery needed. Weeks.

            MS has had plenty of trials by fire.

            Not one of the hyper-scale folks is trouble-free.

            Most of our clients have had 100% up-time across solution sets for years and in some cases we're coming up on decades. Cloud can't touch that. Period.

            And no updates, right? To have 100% up-time you must never do updates.

            In a cluster setting, not too difficult. In this case, 100% up-time is defined as nary a user impacted by any service or app being offline when needed.

            So, point of clarification conceded.

            Yes, I know you could do a cluster, and that's how cloud providers give you that 99.9% up-time or SLA. Still, it is hard to believe no one has any issues: if cloud providers at large scale have issues, then smaller companies have them as well. That said, no cloud provider provides any backups for anyone unless you set them up, either through their offering or through your own company.
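For a rough sense of what an SLA figure like 99.9% actually permits, here is a minimal sketch. The function name is made up for illustration, and it assumes a 365-day year and that the SLA is measured over the whole year, which real provider SLAs may not do.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: ignore leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours per year a service can be down and still meet the stated availability."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Print the yearly downtime budget for a few common availability targets.
    for pct in (99.0, 99.9, 99.99, 100.0):
        print(f"{pct}% -> {allowed_downtime_hours(pct):.2f} hours/year")
```

Under these assumptions, 99.9% leaves a budget of a bit under nine hours a year, while a literal 100% leaves none.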

              PhlipElder @DustinB3403

              Not sure how that conclusion came about but far from it.

              We've had plenty of NDAs over the years to proof with upcoming tech so that we're on the right page and current.

                DustinB3403 @PhlipElder

                You've said you've tested everything that you sell. How could you possibly claim decades' worth of up-time on that basis? Power supplies fail, switches die, disks die, MBs die, sites lose power (and people still have jobs to do even when the lights are out...)

                So you're still full of it. Not to mention that performing any update will eventually require a restart. Windows updates, file server migrations, etc. all require some downtime.

                  Dashrender @DustinB3403

                  All of those things can fail, as long as they have an HA solution that accounts for those failures.

                  As he said earlier, the customer has NEVER been impacted. That's the point of measurement.

                    IRJ

                    Adding this graphic again...

                    The data is on the customer!

                    [Image: 14044fc6-ad7e-44e6-8d15-5198dac3e0b6-image.png]

                      1337 @IRJ

                      Where is that table from?

                      I'm just wondering, not disputing it 🙂

                        IRJ @dbeato

                        Yeah, and you can only fault yourself if you are in the one AZ that fails. Most serious deployments span different regions as well.

                          1337 @IRJ

                          Well, except that:

                          @Pete-S said in AWS Catastrophic Data Loss:

                          As we have further investigated this event with our customers, we have discovered a few isolated cases where customers' applications running across multiple Availability Zones saw unexpected impact

                            IRJ @1337

                            I learned this while doing my Cloud Security Cert. As you can see, all major cloud providers follow this model, as it was set by the CSA (Cloud Security Alliance) as the proper division of customer responsibility.

                            https://pen-testing.sans.org/blog/2012/07/05/pen-testing-in-the-cloud

                            https://aws.amazon.com/compliance/shared-responsibility-model/

                            https://gallery.technet.microsoft.com/Shared-Responsibilities-81d0ff91

                              IRJ @1337

                              They didn't say regions, though 😉

                                IRJ @1337

                                At some point, you have to be willing to accept some risk by not using a different region. Generally the risk is VERY, VERY low, which is why many customers use AZs.

                                You have to do risk analysis: see how often these events occur and how likely you would be to be one of the "few" that were impacted.

                                You can dig in the weeds all you want, but across multiple regions this wouldn't have happened. That is true HA.
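The risk analysis described above can be sketched as a back-of-the-envelope expected-value calculation. Everything here is an assumption for illustration: the function names are hypothetical, the numbers in the usage example are made up, and the model treats events as independent with a fixed outage length.

```python
def expected_annual_downtime_hours(events_per_year: float,
                                   p_affected: float,
                                   outage_hours: float) -> float:
    """Expected hours of downtime per year from a given class of event.

    events_per_year: how often the event occurs (estimated from history)
    p_affected:      chance you are among the customers actually hit
    outage_hours:    typical outage length when you are hit
    """
    return events_per_year * p_affected * outage_hours


def second_region_pays_off(extra_cost_per_year: float,
                           downtime_hours: float,
                           cost_per_downtime_hour: float) -> bool:
    """True if the expected loss avoided exceeds the cost of the extra region."""
    return downtime_hours * cost_per_downtime_hour > extra_cost_per_year


if __name__ == "__main__":
    # Made-up inputs: one event every two years, 5% of customers hit, 12 h outage.
    hours = expected_annual_downtime_hours(0.5, 0.05, 12)
    print(f"expected downtime: {hours:.2f} hours/year")
    print("second region pays off:", second_region_pays_off(20_000, hours, 10_000))
```

With real incident history and real cost figures plugged in, the same comparison shows whether staying in multiple AZs of one region is an acceptable risk.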

                                  1337 @IRJ

                                  Well, different regions wouldn't be enough for true HA. You'd need different cloud providers as well.

                                  Otherwise you are exposed to something called common mode failure: for instance, the regions run on the same architecture, maybe even the same hardware, and as such could be susceptible to a single problem that affects the entire cloud.
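To illustrate why common mode failure caps what redundancy can buy, here is a minimal sketch. It assumes a deliberately simple model that is not from the thread: replicas fail independently of each other, but all of them depend on one shared layer (same provider, architecture, or control plane). The function name and the numbers are hypothetical.

```python
def redundant_availability(replica_availability: float,
                           replicas: int,
                           shared_availability: float = 1.0) -> float:
    """Availability of N redundant replicas behind one shared dependency.

    The system is up only if the shared layer is up AND at least one
    replica is up. With shared_availability=1.0 there is no common mode.
    """
    all_replicas_down = (1 - replica_availability) ** replicas
    return shared_availability * (1 - all_replicas_down)


if __name__ == "__main__":
    # Two regions at 99% each, with no shared dependency.
    print(f"independent:      {redundant_availability(0.99, 2):.6f}")
    # Same two regions, but both rely on a shared layer that is up 99.9% of the time.
    print(f"with common mode: {redundant_availability(0.99, 2, 0.999):.6f}")
```

In this model the result can never exceed the availability of the shared layer, no matter how many regions are added, which is the argument for a second provider.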

                                    1337 @IRJ

                                    Awesome, thanks!

                                      PhlipElder @Dashrender

                                      Thank you sir. 🙂

                                      This one is relatively recent:
                                      http://blog.mpecsinc.ca/2018/06/our-calgary-oil-gas-show-booth-slide.html

                                      This is one of our PoC sets: http://blog.mpecsinc.ca/2018/01/storage-spaces-direct-s2d-sizing-east.html

                                      Systems we built on the current generation before now (had me wires crossed):
                                      http://blog.mpecsinc.ca/2017/11/intel-server-system-r2224wftzs.html

                                      A half Petabyte setup: https://www.youtube.com/watch?v=OKnRzEgHHKA

                                      At our peak working with these we had three of them here in the shop: https://www.youtube.com/watch?v=26U6pDsdz5M&t=321s

                                      Drove the neighbours crazy with the jet engine sounds coming out of here. Plenty of Ear Defenders to be had. 😉

                                      That help?

                                      EDIT: Any guesses on the cost for the four node S2D setup with Mellanox 40GbE RDMA dual switch fabric?

                                      • PhlipElderP
                                        PhlipElder @PhlipElder
                                        last edited by

                                        This post is a bit dated. But it states clearly, and concisely, exactly where we're at as far as investing in our folks here:

                                        http://blog.mpecsinc.ca/2016/11/whats-in-lab-profit.html

                                        My attitude is simple: If we ain't learning we're effing sh#t up.

                                        • scottalanmillerS
                                          scottalanmiller @1337
                                          last edited by

                                          @Pete-S said in AWS Catastrophic Data Loss:

                                          Well, different regions wouldn't be enough for true HA. You'd need different cloud providers as well.

                                          HA has to do with uptime, not the amount of redundancy. Different AWS regions are definitely way more than enough for HA by any standard. You can do HA with a single datacenter, just not an AWS datacenter. But lots of single datacenters provide HA at a facility level.

                                          But your app has to support the multiple datacenter model. That's what is really hard for most people.

                                          • BRRABillB
                                            BRRABill
                                            last edited by

                                            @scottalanmiller are you purposely avoiding the "major cloud provider didn't have its own backups" discussion?
