“Memory Bandwidth and System Balance in HPC Systems”
If you are planning to attend the SuperComputing 2016 conference in Salt Lake City next month, be sure to reserve a spot on your calendar for my talk on Wednesday afternoon (4:15pm-5:00pm). I will be talking about the technology and market trends…
Performance
Memory Bandwidth Requirements of the HPL benchmark
The High Performance LINPACK (HPL) benchmark is well known for delivering a high fraction of peak floating-point performance. Its (historically) excellent performance scaling with increasing processor count and increasing frequency suggests that memory bandwidth has not been a performance limiter. But this does…
Counting Stall Cycles on the Intel Sandy Bridge Processor
Intuition might suggest that defining what a “stall cycle” is on a processor should be relatively straightforward. For some processors this is actually the case, particularly for in-order processors with a very small number of execution units and a very small number of non-pipelined instructions. For modern out-of-order processors, coming…