@travisdh1 said in Performance of Intel Xeon Scalable 6146 versus E5-2667 v4 in the real world...:
@marcinozga said in Performance of Intel Xeon Scalable 6146 versus E5-2667 v4 in the real world...:
@flomer said in Performance of Intel Xeon Scalable 6146 versus E5-2667 v4 in the real world...:
If I have set something up the wrong way I would be delighted if someone can point this out for me. How can I check if something is wrong? All three systems run the Rocks cluster distribution (CentOS with extras), #1 version 6.1, #2 version 6.2 and #3 version 7.0.
I don't have any experience with HPC, but based on the above it seems the Linux kernel version might be the issue. CentOS 6 ships the 2.6.32 kernel, and CentOS 7 ships 3.10. Either test all three clusters on the same kernel line, or research whether there was a performance regression between those kernel versions.
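If you want to confirm which kernel line each cluster is actually running before comparing benchmarks, a minimal Python sketch (the release strings below are examples of stock CentOS 6/7 kernels, not necessarily the OP's exact builds):

```python
import platform

def kernel_line(release):
    """Parse a kernel release string like '3.10.0-1160.el7.x86_64'
    into a comparable (major, minor, patch) tuple."""
    base = release.split("-")[0]
    return tuple(int(p) for p in base.split(".")[:3])

# Stock kernel lines for the distributions mentioned above
centos6 = kernel_line("2.6.32-754.35.1.el6.x86_64")
centos7 = kernel_line("3.10.0-1160.el7.x86_64")

print(centos6, centos7, centos6 < centos7)  # (2, 6, 32) (3, 10, 0) True
print(kernel_line(platform.release()))      # kernel line of this host
```

Running this (or just `uname -r`) on a node of each cluster tells you whether a kernel-line difference lines up with the performance difference.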
Managing cores above a certain number becomes difficult. Linus himself used to complain that managing more than around 16 cores required an entire core just for the scheduler. Things have improved since, but high core counts will always take more work to manage well.
Running in an HPC environment, you'll also have to pay attention to things like program size (does it fit into L1/L2 cache?) and dataset size (does it fit into L3 cache or available RAM?).
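One way to sanity-check the "does it fit in cache" question is to read the cache sizes the kernel exposes under sysfs. A minimal sketch, assuming a Linux host where `/sys/devices/system/cpu/cpu0/cache/` is populated (the `parse_size` helper is the portable part; the sysfs layout is standard on modern kernels):

```python
import glob
import os

def parse_size(s):
    """Convert a sysfs cache size string like '25344K' to bytes."""
    s = s.strip()
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if s and s[-1] in units:
        return int(s[:-1]) * units[s[-1]]
    return int(s)

def cpu0_caches():
    """Map cache labels (e.g. 'L1 Data', 'L3 Unified') to sizes in bytes,
    read from CPU0's sysfs entries."""
    caches = {}
    for idx in glob.glob("/sys/devices/system/cpu/cpu0/cache/index*"):
        with open(os.path.join(idx, "level")) as f:
            level = f.read().strip()
        with open(os.path.join(idx, "type")) as f:
            ctype = f.read().strip()
        with open(os.path.join(idx, "size")) as f:
            size = parse_size(f.read())
        caches[f"L{level} {ctype}"] = size
    return caches
```

Compare the working-set size of one job rank against the L3 figure: a dataset that fit in L3 on one cluster but spills out of it on another can swing results far more than clock speed.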
I'd suspect that even with the faster RAM, getting data in and out of each core could be slowing things down. Many more cores and only slightly faster RAM would be one choke point to investigate.
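To check whether per-core memory bandwidth is the choke point, a crude single-core copy test gives a ballpark figure. This is a rough sketch, not a substitute for the standard STREAM benchmark; the 256 MiB default is an arbitrary size chosen to be much larger than any L3 cache:

```python
import time

def copy_bandwidth_gib_s(n_bytes=256 * 1024 * 1024):
    """Time one read+write pass over a buffer far larger than L3
    and return the implied copy bandwidth in GiB/s."""
    src = bytearray(n_bytes)          # zero-filled source buffer
    t0 = time.perf_counter()
    dst = bytes(src)                  # reads src, writes dst
    elapsed = time.perf_counter() - t0
    assert len(dst) == n_bytes
    return (2 * n_bytes) / elapsed / 2 ** 30

print(f"~{copy_bandwidth_gib_s():.1f} GiB/s single-core copy")
```

Run it once on an idle node, then one copy per core simultaneously: if the aggregate plateaus well below cores × single-core bandwidth, the cores are starving each other for memory, which would fit the symptom described above.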
This is really one of the oddball use cases where servers are running, but not in a virtualized environment. That's what most server hardware is designed around these days. You could have any number of performance choke points.
I don't think that's the problem here, as he's running jobs on 128 cores; having more cores than that shouldn't matter, since the extras will just sit idle.