Geekbench observations



  • I've run some Geekbench tests to see what kind of scores you get when the host CPU is under heavy load, with VMs demanding far more resources than are actually available.

    I used up to 16 VMs with 4 vCPUs each and ran Geekbench on all of them at the same time. The host CPU was a 6-core Xeon E5-2630 v2, the same CPU the Azure A-series uses.

    The way I interpret the results, Geekbench isn't really an accurate performance indicator on its own. The most reliable way to detect a cloud VM on a host where the provider has overcommitted resources is the multi-core score, as it is a better indicator of actual capacity degradation.

    geekbench_scores_summary.png
    geekbench_scores_summary_table.png



  • The relationship between the single-core and multi-core score should be such that the multi-core score comes out at about 80% of its theoretical maximum.

    So if the single-core score is 3000 and you have 4 vCPUs, then the multi-core score should be 80% of 3000 × 4 cores = 9600. If the host is under heavy load, the multi-core score will drop lower and lower.
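    That rule of thumb makes it easy to sanity-check a result. A minimal sketch (the 0.80 scaling factor is the assumption from above, not a Geekbench constant, and the function names are made up for illustration):

```python
def expected_multicore(single_core: float, vcpus: int, scaling: float = 0.80) -> float:
    """Expected multi-core score: ~80% of single-core score x vCPU count."""
    return scaling * single_core * vcpus

def host_load_ratio(measured_multicore: float, single_core: float, vcpus: int) -> float:
    """Measured vs. expected multi-core score; well below 1.0 suggests a loaded host."""
    return measured_multicore / expected_multicore(single_core, vcpus)

# The example from above: 3000 single-core, 4 vCPUs
print(expected_multicore(3000, 4))               # 9600.0
print(round(host_load_ratio(6000, 3000, 4), 2))  # 0.62
```

    A ratio well under 1.0 across repeated runs would point to the kind of capacity degradation described above.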



  • @Pete-S said in Geekbench observations:

    The relationship between the single-core and multi-core score should be such that the multi-core score comes out at about 80% of its theoretical maximum.

    So if the single-core score is 3000 and you have 4 vCPUs, then the multi-core score should be 80% of 3000 × 4 cores = 9600. If the host is under heavy load, the multi-core score will drop lower and lower.

    I think you are on the right track. This is largely due to how the underlying hypervisor handles multi-core VMs. The way I understand it, in a multi-core VM the hypervisor has to wait for that number of cores to be ready before it signals to the VM that it can keep running.

    I.e., in your example of a 4-core VM, the underlying hypervisor has to wait until it has 4 cores free for work before it tells the VM that its cores are available.



  • @dafyre said in Geekbench observations:

    @Pete-S said in Geekbench observations:

    The relationship between the single-core and multi-core score should be such that the multi-core score comes out at about 80% of its theoretical maximum.

    So if the single-core score is 3000 and you have 4 vCPUs, then the multi-core score should be 80% of 3000 × 4 cores = 9600. If the host is under heavy load, the multi-core score will drop lower and lower.

    I think you are on the right track. This is largely due to how the underlying hypervisor handles multi-core VMs. The way I understand it, in a multi-core VM the hypervisor has to wait for that number of cores to be ready before it signals to the VM that it can keep running.

    I.e., in your example of a 4-core VM, the underlying hypervisor has to wait until it has 4 cores free for work before it tells the VM that its cores are available.

    I've read that before, but I think that's a feature of very old hypervisors called strict co-scheduling. It's not used anymore.

    Nowadays basically every hypervisor has its own scheduler that places vCPUs on real pCPUs according to a time-share principle, so every vCPU gets a piece of the pie. The scheduler has to account for hyperthreading, multiple CPU sockets (NUMA), power saving, VM priority and other things, but the underlying principle is that all VMs and their vCPUs get their fair share of CPU time.

    Some hypervisors offer different scheduler algorithms, so you can pick another way of scheduling that may be better optimized for your workload.
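    As a toy illustration of the fair-share idea, here is a proportional split of pCPU time by vCPU count (a sketch only, not any real hypervisor's scheduler, which also weighs NUMA, hyperthreading, power states and priorities):

```python
def fair_share(pcpu_time: float, vcpus_per_vm: dict[str, int]) -> dict[str, float]:
    """Split total pCPU time equally per vCPU, so each VM's share is
    proportional to its vCPU count (no NUMA/priority weighting)."""
    total_vcpus = sum(vcpus_per_vm.values())
    per_vcpu = pcpu_time / total_vcpus
    return {vm: n * per_vcpu for vm, n in vcpus_per_vm.items()}

# 600 ms of pCPU time shared between a 4-vCPU VM and a 2-vCPU VM
print(fair_share(600.0, {"vm1": 4, "vm2": 2}))  # {'vm1': 400.0, 'vm2': 200.0}
```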



  • What makes the Geekbench results strange is the single-core results. There, one vCPU runs at full load while the others idle.

    If each physical core can deliver 100%, then 6 cores deliver 600%. Hyperthreading adds, say, up to another 30%, so we have a total of 780% of one core's CPU power to split among 16 VMs. That's about 50% for each VM.

    Yet in the single-core benchmark, with all 16 VMs running at full load on one core each, they still show 80% of the score of a single VM running alone. That defies logic.

    Basically, Geekbench reports higher scores than the actual processing power justifies. Maybe the timing of the tasks gets skewed, maybe it's something else. Either way, Geekbench single-core scores show inaccurate speed on virtual machines.
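    The arithmetic above as a quick check (the 30% hyperthreading boost is the assumption stated in the post, not a measured figure):

```python
cores = 6            # physical cores on the Xeon E5-2630 v2
ht_boost_pct = 30    # assumed extra throughput from hyperthreading, in %
vms = 16

total_capacity = cores * (100 + ht_boost_pct)  # 6 * 130 = 780 (% of one core)
per_vm = total_capacity / vms                  # capacity per VM under fair sharing
print(total_capacity, per_vm)                  # 780 48.75
```

    So fair sharing predicts roughly half a core per VM, which is why a single-core score at 80% of the idle-host result looks too high.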



  • @Pete-S said in Geekbench observations:

    @dafyre said in Geekbench observations:

    @Pete-S said in Geekbench observations:

    The relationship between the single-core and multi-core score should be such that the multi-core score comes out at about 80% of its theoretical maximum.

    So if the single-core score is 3000 and you have 4 vCPUs, then the multi-core score should be 80% of 3000 × 4 cores = 9600. If the host is under heavy load, the multi-core score will drop lower and lower.

    I think you are on the right track. This is largely due to how the underlying hypervisor handles multi-core VMs. The way I understand it, in a multi-core VM the hypervisor has to wait for that number of cores to be ready before it signals to the VM that it can keep running.

    I.e., in your example of a 4-core VM, the underlying hypervisor has to wait until it has 4 cores free for work before it tells the VM that its cores are available.

    I've read that before, but I think that's a feature of very old hypervisors called strict co-scheduling. It's not used anymore.

    Nowadays basically every hypervisor has its own scheduler that places vCPUs on real pCPUs according to a time-share principle, so every vCPU gets a piece of the pie. The scheduler has to account for hyperthreading, multiple CPU sockets (NUMA), power saving, VM priority and other things, but the underlying principle is that all VMs and their vCPUs get their fair share of CPU time.

    Some hypervisors offer different scheduler algorithms, so you can pick another way of scheduling that may be better optimized for your workload.

    Depends. SMP doesn't really allow for that; all cores have to be in lockstep. Only if AMP is supported can the hypervisor do that, and it requires the hypervisor and the system above it to work together to do non-SMP processing.

