New Infrastructure to Replace Scale Cluster



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    The better option, IMO, is to use two hosts as hypervisors, and the third - pack with disks, and use as the storage device (NFS or iSCSI). And also install the engine on it, as a VM or on baremetal - doesn't matter.

    Seems a waste. You lose a lot of performance with the networking overhead, you use three hosts for the job of two, and you give up HA. That's a lot of negative. Even if you already own the third host, doing an inverted pyramid of doom is the worst possible use of the existing resources. Better to retire the third host than to make it an anchor that will drown the other two nodes.



  • @Dashrender said in New Infrastructure to Replace Scale Cluster:

    @dyasny said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 no, in this particular setup, you have two options. The original one would be to go hyperconverged, installing both the storage and hypervisors services on all 3 hosts, and to also deploy the engine (vsphere equivalent) as a VM in the setup (that's called self hosted engine).

    The better option, IMO, is to use two hosts as hypervisors, and the third - pack with disks, and use as the storage device (NFS or iSCSI). And also install the engine on it, as a VM or on baremetal - doesn't matter.

    You will have less hypervisors, true, but having a storage service on the hypervisors is a resource drain, so you don't actually lose as much in terms of resources. And you gain a proper storage server, less management headache, and a setup that can scale nicely if you decide to add hypervisors or buy a real SAN. Performance will also be better, and you might even end up with more available disk space, because you will not have to keep 3 replicas of every byte like gluster/ceph require you to do.

    Isn't that an IPOD though?

    Correct, a standard three node IPOD.

    https://mangolassi.it/topic/8743/risk-single-server-versus-the-smallest-inverted-pyramid-design



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    And of course, we haven't even touched on HA.

    oVirt provides HA out of the box, as long as a living host has enough resources available to start the protected VMs.

    By definition, HA can't be provided "out of the box." HA is something you do, not something you buy. A product may have features to make HA easier, but a product itself can't do HA.

    In an IPOD, oVirt would simply automate LA (low availability). HA must be significantly higher than standard availability. The proposed IPOD design results in significantly lower than standard. (Where standard is an enterprise server with local storage and no system of this kind whatsoever.)



  • @JaredBusch said in New Infrastructure to Replace Scale Cluster:

    @Dashrender said in New Infrastructure to Replace Scale Cluster:

    This is a term that Scott Allen Miller coined ages ago.

    No he didn't. Might be where you first heard it, but it is not his.

    Actually, I did 🙂

    May, 2013. It's from a short article originally on SW, but was then codified in this article on the Inverted Pyramid of Doom on SMBITJournal.

    I actually did use it first (and second.) It's standard industry terminology now, but before 2013 it was only known as the 3-2-1 Architecture.



  • Here is the Origin of the Inverted Pyramid of Doom. In the thread, people even ask where it came from and it is mentioned that I had made it because the topic was so common and a name didn't exist for it yet.





  • @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    By definition, HA can't be provided "out of the box." HA is something you do, not something you buy. A product may have features to make HA easier, but a product itself can't do HA.

    In an IPOD, oVirt would simply automate LA (low availability). HA must be significantly higher than standard availability. The proposed IPOD design results in significantly lower than standard. (Where standard is an enterprise server with local storage and no system of this kind whatsoever.)

    This is just a bunch of terms you invented on the spot. No truth to them. HA can be provided "out of the box" by a system that is capable of it. It is not "something you do", it's a product or system feature.



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    This is just a bunch of terms you invented on the spot. No truth to th

    HA stands for "High Availability". It's not me inventing new terms. HA has always meant "high availability". Using HA to mean "something unrelated to availability" is the new invention in this case. Actual HA in the "IT terminology" rather than the marketing terminology can absolutely never be "purchased" as availability has to be measured by resulting risk, not feature.



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    It is not "something you do", it's a product or system feature.

    This is absolutely untrue in any technical, engineering, or business situation. In marketing and sales where the terms are fake, yes, HA is applied to absolutely anything. But that is never acceptable in an IT situation (or engineering, etc.)

    This is the same tactic that many sales people use to try to use "redundancy" incorrectly.

    And it is @StorageNinja from VMware who said this.

    If you believe it is "something you buy", what would it be, since you can buy that name on literally anything with no consistency in meaning? We know what the term means in IT. But you would need to define what it means to you for us to understand what you are thinking that it means. Since it can't be tied to redundancy (oVirt has none), nor to reliability or availability, I doubt anyone has a guess what would then make something be a purchasable HA feature.



  • For obvious reasons, HA can be applied in only one meaningful way... to mean "high availability." It's obvious and honest and useful. Any other use of it, to mean something unrelated to resulting availability rates, is totally meaningless - it's just empty words that there is never a reason to say.

    For example, a SAN with lower than average availability. Calling that HA means nothing, literally nothing. It doesn't mean redundant, it doesn't tell us anything about the availability, it doesn't tell us anything. It's just empty words slapped on something.

    oVirt provides no high availability, obviously. If it did, it would be trivial to demonstrate that you could buy and "switch on" that HA feature in a situation with extremely low availability. Saying therefore that "high availability" can mean "low availability" clearly makes no sense.



  • https://mangolassi.it/topic/10337/defining-high-availability/

    The only sources I can find that don't agree that high availability is defined by a relative measure of availability also say that DR and HA are overlapping.



  • @scottalanmiller you're obviously not inventing the term HA, but you are inventing this ridiculous saying about HA being what you do and not what you buy. You can, of course, hack HA into almost any service, but a product that is already built with HA in mind is something you buy and use as designed - and you get HA. Out of the box, if you bought and configured all the prerequisites. oVirt, vCenter, and a ton of other products have it designed into them, so if you pay for it, and for the hardware that supports it, you can have it right there out of the box, if you follow the setup guide. Everything else is just you throwing meaningless pronouncements in the air.

    You buy VMware, with the HA features (don't remember if those cost extra, doesn't matter here). You buy hardware that supports whatever VMWare uses for HA (IPMI/redfish/redundant switches etc - whatever is the best practice) and you follow the config guide to set it up - you have yourself highly available VMs, with all the standard properties for HA - downtime SLA, splitbrain avoidance etc etc. These are features you pay for (that's what "buy" means in the English language), both on the software and hardware side of things.

    And yes, I've decided arguing with you here is a huge waste of time, because for every comment you come back with 10, and I have no bandwidth for replying to that much, so if you think you "won" an argument or whatever tickles your fancy, sure, go ahead. I'll just answer if I want to, at my own convenience. Hope you don't mind.



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller you're obviously not inventing the term HA, but you are inventing this ridiculous saying about HA being what you do and not what you buy. You can, of course, hack HA into almost any service, but a product that is already built with HA in mind is something you buy and use as designed - and you get HA. Out of the box, if you bought and configured all the prerequisites. oVirt, vCenter, and a ton of other products have it designed into them, so if you pay for it, and for the hardware that supports it, you can have it right there out of the box, if you follow the setup guide. Everything else is just you throwing meaningless pronouncements in the air.

    You buy VMware, with the HA features (don't remember if those cost extra, doesn't matter here). You buy hardware that supports whatever VMWare uses for HA (IPMI/redfish/redundant switches etc - whatever is the best practice) and you follow the config guide to set it up - you have yourself highly available VMs, with all the standard properties for HA - downtime SLA, splitbrain avoidance etc etc. These are features you pay for (that's what "buy" means in the English language), both on the software and hardware side of things.

    And yes, I've decided arguing with you here is a huge waste of time, because for every comment you come back with 10, and I have no bandwidth for replying to that much, so if you think you "won" an argument or whatever tickles your fancy, sure, go ahead. I'll just answer if I want to, at my own convenience. Hope you don't mind.

    In general, "buying" HA is thought of as just buying VMware with the HA feature and, tada, you have HA, which of course is wrong. If you don't have redundant power and redundant switches, and storage, and HVAC, etc, etc - then you don't have real HA - you have one tiny piece that's HA, but you don't have HA for the likely end goal.

    At least this is what I take from Scott's comments.

    VMWare won't sell you the solution that includes all the switches and internet and power and HVAC, etc, etc.. they will only sell you the things they sell. But HA requires so much more than they can provide.



  • @Dashrender said in New Infrastructure to Replace Scale Cluster:

    In general, "buying" HA is thought of as just buying VMware with the HA feature and, tada, you have HA, which of course is wrong. If you don't have redundant power and redundant switches, and storage, and HVAC, etc, etc - then you don't have real HA - you have one tiny piece that's HA, but you don't have HA for the likely end goal.

    Everything you mentioned is something you can buy, and is usually specified as a prerequisite for an HA solution.

    At least this is what I take from Scott's comments.

    VMWare won't sell you the solution that includes all the switches and internet and power and HVAC, etc, etc.. they will only sell you the things they sell. But HA requires so much more than they can provide.

    I frankly don't remember the last time I actually bought anything from a vendor. People usually go to an integrator, and pay for building a solution for them. That integrator should provide all the prerequisites, and you pay for a solution not a single standalone product.

    Everyone wants to sell solutions, not products, to the point where these solutions in turn become productized. And an HA solution can be bought.

    Now, if we stop the pissing contest (yes Scott, you are the god of IT, you are always right and everyone else is always wrong, even when you don't drop your 2c in every conversation) and think rationally for a second here, we are talking about a small setup with just a few hosts. We already established we have brand name hardware, which means IPMI can be used as SBA. Out of the box with oVirt, this provides us with the ability to make VMs highly available, just configure the hosts, and mark the VMs as HA, nothing else. Assuming networking was done properly, and we have switch redundancy, and hoping the building has a generator and the storage is also reliable is nice, but out of scope for this particular scenario. All the OP wanted was VM HA, where if a VM dies or the host carrying it stops being able to host it, it gets safely started elsewhere. It isn't hard to grasp the concept, really.
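
The restart behaviour described in this post (a protected VM on a failed host gets started on a surviving host, provided that host has enough free resources) can be illustrated with a toy model. This is only a sketch and not oVirt's actual scheduler: the `VM` class, the `plan_restarts` function, and the memory-only capacity check are all hypothetical, and fencing and storage failures are ignored entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VM:
    name: str
    host: str        # host the VM is currently running on
    mem_gb: int      # memory the VM needs in order to start
    protected: bool  # hypothetical flag standing in for "marked as HA"


def plan_restarts(free_mem_gb: Dict[str, int], vms: List[VM],
                  failed_host: str) -> Dict[str, Optional[str]]:
    """Map each protected VM on the failed host to a new host, or None."""
    # Only surviving hosts can accept restarted VMs.
    free = {h: m for h, m in free_mem_gb.items() if h != failed_host}
    plan: Dict[str, Optional[str]] = {}
    for vm in vms:
        if vm.host != failed_host or not vm.protected:
            continue  # unaffected, or not marked for automatic restart
        candidates = [h for h, m in free.items() if m >= vm.mem_gb]
        if candidates:
            # Assumed policy: choose the surviving host with the most free memory.
            target = max(candidates, key=lambda h: free[h])
            free[target] -= vm.mem_gb
            plan[vm.name] = target
        else:
            plan[vm.name] = None  # no living host has room, so the VM stays down
    return plan


free_memory = {"host1": 0, "host2": 16}
running = [
    VM("db", "host1", 8, True),
    VM("web", "host1", 12, True),
    VM("test", "host1", 4, False),
]
result = plan_restarts(free_memory, running, "host1")
print(result)
```

In this made-up example only `db` fits on the surviving host; `web` stays down for lack of memory and `test` is never considered because it is not protected. The model also says nothing about what happens when the shared storage host is the one that fails, which is the case the rest of the thread argues about.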



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    @Dashrender said in New Infrastructure to Replace Scale Cluster:

    In general, "buying" HA is thought of as just buying VMware with the HA feature and, tada, you have HA, which of course is wrong. If you don't have redundant power and redundant switches, and storage, and HVAC, etc, etc - then you don't have real HA - you have one tiny piece that's HA, but you don't have HA for the likely end goal.

    Everything you mentioned is something you can buy, and is usually specified as a prerequisite for an HA solution.

    At least this is what I take from Scott's comments.

    VMWare won't sell you the solution that includes all the switches and internet and power and HVAC, etc, etc.. they will only sell you the things they sell. But HA requires so much more than they can provide.

    I frankly don't remember the last time I actually bought anything from a vendor. People usually go to an integrator, and pay for building a solution for them. That integrator should provide all the prerequisites, and you pay for a solution not a single standalone product.

    Everyone wants to sell solutions, not products, to the point where these solutions in turn become productized. And an HA solution can be bought.

    Now, if we stop the pissing contest (yes Scott, you are the god of IT, you are always right and everyone else is always wrong, even when you don't drop your 2c in every conversation) and think rationally for a second here, we are talking about a small setup with just a few hosts. We already established we have brand name hardware, which means IPMI can be used as SBA. Out of the box with oVirt, this provides us with the ability to make VMs highly available, just configure the hosts, and mark the VMs as HA, nothing else. Assuming networking was done properly, and we have switch redundancy, and hoping the building has a generator and the storage is also reliable is nice, but out of scope for this particular scenario. All the OP wanted was VM HA, where if a VM dies or the host carrying it stops being able to host it, it gets safely started elsewhere. It isn't hard to grasp the concept, really.

    Started? That's not HA. At least at this very moment I wouldn't consider it HA, I'd consider it SA (Standard Availability).

    As for your reasoning that people buy solutions - oh, if only that were true more often than not. But one look at Spice--- oh you know that place, you can see that people buy a SAN and think they have HA. period.

    Also to more points you made - You simply can't have HA without having ALL of those other parts. It's great that you have HA at the server level, but what are the chances that's where your issue is going to be and not at the electrical power level? It's great that the servers have HA, but if your internet doesn't is the solution really HA?
    No - it's not.

    It's pseudo HA. Or more likely - simply SA, or in the worst setup - LA.



  • @Dashrender said in New Infrastructure to Replace Scale Cluster:

    Started? That's not HA. At least at this very moment I wouldn't consider it HA, I'd consider it SA (Standard Availability).

    It doesn't matter what you consider. Automatic monitoring of a service and making sure it always runs (so if it stops running, the system starts it) is the definition of HA. Nobody ever promised 0 downtime, just a lot of 9's after the dot. Don't confuse HA with FT

    As for your reasoning that people buy solutions - oh, if only that were true more often than not. But one look at Spice--- oh you know that place, you can see that people buy a SAN and think they have HA. period.

    I don't care what people think, there are standards and definitions available.

    Also to more points you made - You simply can't have HA without having ALL of those other parts. It's great that you have HA at the server level, but what are the chances that's where your issue is going to be and not at the electrical power level? It's great that the servers have HA, but if your internet doesn't is the solution really HA?
    No - it's not.

    HA is about avoiding a failure. The best HA solutions target all possible points of failure, and how many of those you want to cover is up to you and your budget. No solution is ever perfect, even the solutions that are meant to address solution imperfections 🙂



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    It doesn't matter what you consider. Automatic monitoring of a service and making sure it always runs (so if it stops running, the system starts it) is the definition of HA. Nobody ever promised 0 downtime, just a lot of 9's after the dot. Don't confuse HA with FT

    That might be a way to achieve HA, but it isn't the definition of it. The definition of it comes from the level of availability, nothing to do with the mechanisms or attempts at it. You are correct that people often confuse HA (a level of availability) with FT (a physical type of protection whose purpose is to help provide HA, or at least "better A".)
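
Since both sides refer to availability as a level ("a lot of 9's"), it may help to see what a given number of nines means in expected downtime. The following is a minimal sketch; the function name `downtime_hours_per_year` and the example availability figures are illustrative assumptions, not numbers taken from the thread.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative availability levels only.
for label, a in [("two nines", 0.99), ("three nines", 0.999), ("four nines", 0.9999)]:
    print(f"{label}: {downtime_hours_per_year(a):.2f} hours/year")
```

Seen this way, whether a system counts as HA depends on where its measured downtime falls relative to a standard baseline, not on which restart mechanism produced it.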



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    you're obviously not inventing the term HA, but you are inventing this ridiculous saying about HA being what you do and not what you buy.

    No, I've clearly quoted the source on that every time. It's @StorageNinja

    But it is also incredibly true. You can't "just buy" HA, doesn't work. No product anywhere can do HA if you don't treat it properly or make the things around it support HA as well. It doesn't require "inventing anything", it's just obvious common sense.



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    Everyone wants to sell solutions, not products, to the point where these solutions in turn become productized. And an HA solution can be bought.

    Sure, but that is 1) almost always untrue, very few vendors actually do anything to achieve HA 2) when they do, they are becoming the IT department and building, not buying, HA.

    Basically what you are doing is agreeing with John, but playing a services semantics game to look at it from a CEO's perspective that he can "buy an IT department that will be tasked with implementing HA". So if you consider your staffing, hiring, and business decisions to be "something you can buy like a product", then sure. But all you've done is define "anything you do" as something purchasable.



  • @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?



  • @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?

    Yes, yes, and yes.



  • @JaredBusch said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?

    Yes, yes, and yes.

    Well now I know where my paycheck is going to!



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    Out of the box with oVirt, this provides us with the ability to make VMs highly available, just configure the hosts, and mark the VMs as HA, nothing else. Assuming networking was done properly, and we have switch redundancy, and hoping the building has a generator and the storage is also reliable is nice, but out of scope for this particular scenario. All the OP wanted was VM HA, where if a VM dies or the host carrying it stops being able to host it, it gets safely started elsewhere. It isn't hard to grasp the concept, really.

    See, this is where it all falls apart. Your example proves our point. In your example with the IPOD, you aren't even trying to make HA. oVirt's HA option is actually terrible here because it tricks the humans into thinking there is protection where there is not, and often makes people (where makes = allows their brains to accept) introduce more risk, lowering their availability below standard, by seeing the term HA applied to one isolated layer of the stack and ignoring the increased risk of the overall stack.

    It's not just an example of how HA fails, it is THE example that we've been talking about for a decade. It's the "by the book" worst possible setup where you add lots of hardware, add "HA products", and the result is a system that is slower and more fragile than if you hadn't done it at all. It's the exact scenario we did the risk analysis on a few years ago, the one that has been beaten to death.

    If you feel the math is wrong, go after that. But you are just repeating the identical arguments that people on 🌶 always did and doing the same "not looking where the problem is" and instead looking at the term HA and ignoring that to turn it on you had to create a huge amount of risk that wasn't there originally. It's all the standard marketing trick.

    There is no pissing match; these are tried and true, obvious, and well-known problems with HA marketing. It's an example that is trivial to show and that proves the point (and it has been shown, SO many times). And the only thing going on is you are making a fresh argument for something that was long ago shown to be LA and acting like we don't all already know this. Your statements about it suggest that you either haven't read about it and haven't examined the risk in whole, or you are knowingly ignoring the body of work on it and thinking that if you pretend the evidence hasn't been presented that rehashing it will cause us to forget where the risk is.

    Arguing the point in this way doesn't make sense, because you aren't stating where the math is wrong; you are just ignoring that the math has been presented over and over again and that no one in a decade has ever ended up disagreeing with the evidence. That is not how to present a logical argument.



  • @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?

    And you can then call it something "you buy" rather than "something you do".



  • @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @JaredBusch said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?

    Yes, yes, and yes.

    Well now I know where my paycheck is going to!

    To be used as toilet paper?



  • @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @JaredBusch said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?

    Yes, yes, and yes.

    Well now I know where my paycheck is going to!

    To be used as toilet paper?

    To pay people to do things that I do myself!



  • @dyasny said in New Infrastructure to Replace Scale Cluster:

    I frankly don't remember the last time I actually bought anything from a vendor. People usually go to an integrator, and pay for building a solution for them.

    Integrator is industry speak for the vendor advocate. When anyone in IT says vendor, they mean integrator. It's just accepted that they are the channel arm for their vendors and to the IT side they are one and the same. Both are vendor advocates, both are sales people, one just repeats the marketing of the other.



  • @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @JaredBusch said in New Infrastructure to Replace Scale Cluster:

    @DustinB3403 said in New Infrastructure to Replace Scale Cluster:

    @scottalanmiller said in New Infrastructure to Replace Scale Cluster:

    But all you've done is define "anything you do" as something purchasable.

    To be the devil's advocate

    So you can pay someone to breathe for you? To flush the toilet for you? To wipe for you?

    Yes, yes, and yes.

    Well now I know where my paycheck is going to!

    To be used as toilet paper?

    LOL - nice!!
    I see what you did there.



  • The marketing trick being used here is "shifted risk." It's a "sleight of hand" way to make something look like HA and it is the most common approach used to sell "HA equipment" that isn't actually HA. It's popular because it provides an affordable setup, at great margins, and high cost, but not so high as to break the customer's bank. But it costs quite a bit more than a true HA setup, and so is popular as a way for vendors (and integrators) to raise the costs without raising them beyond acceptable limits.

    The trick is that they identify the risks of the initial system. They call it "the app server" to begin the trick. In a single server setup, this isn't correct, the single server is the "entire physical system", you can't isolate one part of it like that, but they do and hence the trick begins.

    Next they say "how do we eliminate this risk"? Answer: make it redundant. Another sleight of hand: redundant means "an extra thing", not "it is fault tolerant", but the average person thinks that they must mean the latter and hears that in their head regardless of what is actually said. Of course this is wrong, because the risk is of the entire system, not just one piece, but they only offered to make one piece redundant. In theory, everyone should know that the thing is a trick at this point, but people always want to give the benefit of the doubt because hearing that "HA can be bought" triggers an emotional reaction and makes us hope that it is really so easy and automatic that it requires nothing from us.

    Next the big risks, the storage and other system components, are shifted out of the single box and put somewhere else with a "no need to look over there, it's magic" attitude, and people happily agree that since storage is hard, they won't look any further. Every integrator pulls this trick; this is where the money is. So people assume that since everyone does it, it must be right.

    Then the integrator adds in more risk by needing switches. But they, again, simply raise the cost by saying it needs to be redundant and since it is redundant, ignore that it carries risks too.

    The end result is the integrator gets to sell not just the two servers the customer didn't need (because if they bought this, they clearly didn't need HA at all), but also a third server, and two extra switches, and a lot of set up man power. The vendor piece of the equation is thrilled because they easily double (or more) their hardware sales. And the integrator piece is thrilled because they didn't just double their margins but they also use the unneeded complexity to push for a lot of integrator hours and ongoing support because they made something that was best left simple, into something really complex.

    And all of this by somehow convincing people that "risk isn't additive"; it is amazing that people can be tricked into believing that. The storage server, at the end of the day, carries literally all of the risk of the original server and isn't removed from the risk pool in any way. If we had kept our eye on the storage component of the original setup instead of being tricked by being directed to the application piece, it all becomes really obvious, really quickly. That's how the card trick is played - look at this motion while I hide the card you were trying to watch.

    Then, once we take our eyes off of the "unaddressed initial risk", all of the other risk, the application servers, the switches, the cabling... is all "extra risk layered on top of the unaddressed initial risk." And math is math. More risk is riskier than less risk. It's literally that simple.
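
The "risk is additive" argument above can be sketched with basic series and parallel availability arithmetic. This is a simplified illustration, not the risk analysis referred to in the thread: the availability figures `SERVER` and `SWITCH` are made-up assumptions, failures are assumed independent, and failover between redundant parts is assumed to work perfectly (which flatters the IPOD).

```python
from math import prod


def series(*availabilities: float) -> float:
    """All components must be up: availabilities multiply."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """At least one redundant component must be up."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Assumed, illustrative per-component availabilities (not measured values).
SERVER = 0.999   # one server with its own storage
SWITCH = 0.9995  # one network switch

# Baseline: a single server with local storage.
single_server = SERVER

# IPOD: two redundant compute nodes, two redundant switches, and one
# storage server that everything else depends on.
ipod = series(parallel(SERVER, SERVER), parallel(SWITCH, SWITCH), SERVER)

print(f"single server: {single_server:.6f}")
print(f"IPOD:          {ipod:.6f}")
```

Under these assumptions the IPOD can never exceed the availability of its single storage server, and every additional layer in the chain multiplies in a factor below one, so the result lands slightly below the single-server baseline even with flawless failover.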



  • This IPOD setup, and the vendors and integrators that sell it, we called "the standard scam of SMB IT" a year ago.

    Youtube Video