How Many HCI Nodes for the SMB
-
So to sort of come back around to the initial question, it's also going to depend on your position on N+1. Is it a nice-to-have or a must-have? We're technically running our main VDI workload on HCI but, if I'm not mistaken, we're committed beyond the N+1 threshold. For our case it's not the end of the world, it just means that if we lose a node there are some people who won't be able to work until we can either get the node back up or move them onto some older gear. If you're running mission-critical servers on HCI then I'd say of course that 2 is a minimum, but that could depend on how the solution is engineered. I haven't looked into the various options, but I could see a 3-node minimum requirement to satisfy quorum needs or avoid a split-brain scenario where the solution tries to spin up the VMs on the spare node when they haven't really gone down, just a communication glitch.
So like most things, the real answer is "It depends"
-
@notverypunny said in How Many HCI Nodes for the SMB:
So to sort of come back around to the initial question, it's also going to depend on your position on N+1. Is it a nice-to-have or a must-have? We're technically running our main VDI workload on HCI but, if I'm not mistaken, we're committed beyond the N+1 threshold.
The implication of two nodes is that it is still N+1. You just buy bigger nodes if necessary to keep it to two nodes.
-
@notverypunny said in How Many HCI Nodes for the SMB:
I haven't looked into the various options, but I could see a 3-node minimum requirement to satisfy quorum needs or avoid a split-brain scenario where the solution tries to spin up the VMs on the spare node when they haven't really gone down, just a communication glitch.
You don't need three nodes for that. You can use a witness and there is technology to make it pretty much unnecessary even so.
-
@scottalanmiller said in How Many HCI Nodes for the SMB:
@Pete-S You state the common lay person myth that clock frequency is how CPU performance is measured. You then state that modern processors are barely faster than old ones that had dramatically less advanced technology. You gave literally zero basis for this statement, you just pulled it out of thin air without even a hint of reasoning for it.
I provided actual MIPS calculations pulled from chip measurements (IPC/s are derived from MIPS on the chips.) You can argue that there are better measurements. But your argument seems to be solely something you made up based on nothing, claiming that the math and measurements are wrong because of "gamers", which is weird because gamers have always been the non-technical people who tend to use the clock-cycle numbers because they are "easy", even though they have nothing to do with CPU performance.
Do you have a reason you believe this is wrong? Or are you sticking to "because I said so" and trying to ignore the math and measurements, common sense, and long-term industry knowledge (and simple observation)?

No, clock frequency isn't a measure of performance. I've never said that anywhere.
No, the CPUs you mentioned earlier are desktop CPUs and not something you'll see in a server. That's why it's irrelevant.
IPC is also as irrelevant as clock frequency in itself, and I'll explain why below.

Also, when I talk about server CPUs from 10 years ago and forward, it's the Xeon 5500/5600 series, E5-2600 V1/V2/V3/V4, Scalable gen1/gen2, and AMD Epyc gen1/gen2. The CPUs you see in 1U or 2U servers such as the Dell R710 and newer.
Looking at the maximum number of cores per CPU over the last 10 years you'll see:
- 5600 - 6 cores
- E5-2600 V1 - 8c
- E5-2600 V2 - 10c (12c in special SKU)
- E5-2600 V3 - 18c
- E5-2600 V4 - 22c
- Scalable Gen 1 - 28c
- Epyc gen 1 - 32c
- Scalable Gen 2 - 56c (but not readily available)
- Epyc Rome gen 2 - 64c
If we assume for a moment that the CPUs had exactly the same cores at the same clock frequency, the increase in core count would be more than 10 times. So a server today could have more than 10 times the processing power.
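A quick back-of-the-envelope sketch of that scaling claim, using the core counts from the list above and assuming (as stated) identical cores at identical clocks:

```python
# Top-end server core counts per generation, taken from the list above.
core_counts = {
    "Xeon 5600": 6,
    "E5-2600 v1": 8,
    "E5-2600 v2": 12,   # special SKU
    "E5-2600 v3": 18,
    "E5-2600 v4": 22,
    "Scalable Gen 1": 28,
    "Epyc Gen 1": 32,
    "Scalable Gen 2": 56,
    "Epyc Rome Gen 2": 64,
}

baseline = core_counts["Xeon 5600"]
for name, cores in core_counts.items():
    # Under the "same core, same clock" assumption, throughput
    # scales linearly with core count.
    print(f"{name}: {cores} cores, {cores / baseline:.1f}x the baseline")
```

With 6 cores as the baseline, the 64-core part comes out above 10x on core count alone.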
The good thing is that cores can do more work today at the same clock frequency. The bad thing is that due to the thermal design envelope you can't run a high-core-count CPU at a high frequency or it will burn up. So we are NOT running at the same frequency. And that's where the problem lies.
Looking at the sales info from Intel & AMD, you would have thought that each new CPU would be a tremendous improvement. But like any sales info, it only tells part of the truth, and each measurement is made under very specific conditions to show the largest possible improvement. As anyone should expect.
But if you run a generic benchmark that is not designed to give inflated numbers, the situation is different.
For instance, comparing the X5690 I mentioned before as the pinnacle of CPU performance about 10 years ago:
https://browser.geekbench.com/processors/intel-xeon-x5690
to AMD's 7742 64 core monster CPU:
https://browser.geekbench.com/processors/amd-epyc-7742

Looking at the single-core benchmark (which is a mix of running different computations), we'll see that the new cores in this case only have 12% more performance than the 10-year-old cores.
We can argue about what this benchmark is measuring all day long. If the newer CPU had executed the instructions faster, it would have had a better result. It's as simple as that.
This is not an outlier or a freak benchmark; there are hundreds of these. Some will show that the newer cores are maybe 75% faster, while others might even show that the newer cores are not faster at all.
The largest improvement comes when the new CPU has some new instructions that can help in some cases, for instance AVX-512.
Now, you can get CPUs with cores that are significantly faster because they run at higher frequencies, but as I said, those CPUs are not available with as many cores. Because they would burn up.
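Putting the two figures from this post together (the ~12% single-core gain from the Geekbench comparison and the 6-core to 64-core jump; both are this post's numbers, not fresh measurements), a rough aggregate-throughput estimate looks like:

```python
# Rough model: aggregate throughput ~ core count * per-core speed.
# Figures are the ones quoted above, not independent benchmarks.
old_cores, new_cores = 6, 64   # X5690 vs Epyc 7742
per_core_gain = 1.12           # ~12% single-core improvement

aggregate_gain = (new_cores / old_cores) * per_core_gain
print(f"Aggregate throughput: ~{aggregate_gain:.1f}x the 10-year-old chip")
```

This comes out to roughly 12x, and nearly all of it is from core count rather than per-core speed, which is the point being made here.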
-
@scottalanmiller said in How Many HCI Nodes for the SMB:
The implication of two nodes is that it is still N+1. You just buy bigger nodes if necessary to keep it to two nodes.
If you're licensing Oracle RAC at 40K per core (list, I know you'll pay less, but still) or SAP HANA (where you pay per TB of RAM), then scaling out to a larger cluster has some advantages in the N+1 math, where reserving 50% of capacity on 2 nodes vs. 25% on 4 smaller nodes for HA protection comes into play.
-
@StorageNinja said in How Many HCI Nodes for the SMB:
@scottalanmiller said in How Many HCI Nodes for the SMB:
The implication of two nodes is that it is still N+1. You just buy bigger nodes if necessary to keep it to two nodes.
If you're licensing Oracle RAC at 40K per core (list, I know you'll pay less, but still) or SAP HANA (where you pay per TB of RAM), then scaling out to a larger cluster has some advantages in the N+1 math, where reserving 50% of capacity on 2 nodes vs. 25% on 4 smaller nodes for HA protection comes into play.
How many SMBs actually use Oracle RAC or SAP HANA? Can't be many.
-
@travisdh1 said in How Many HCI Nodes for the SMB:
How many SMBs actually use Oracle RAC or SAP HANA? Can't be many.
I know people with 20 employees and 400 Oracle databases, FWIW. There are a lot of smaller application providers who do niche SaaS stuff.
SAP is pulling Oracle support and making everyone move to HANA going forward for their apps.
-
@Pete-S said in How Many HCI Nodes for the SMB:
But if you run a generic benchmark that is not designed to give inflated numbers, the situation is different.
I don't run generic benchmarks in production for a living, thankfully.
The reality is that most CPU-intensive stuff takes advantage of at least some of the new offload extensions and libraries. Also, memory throughput is often the limiting factor for databases and other IO-intensive applications.