While working on the implementation of the MPI version of the STREAM benchmark, I realized that there were some subtleties in timing that could easily lead to inaccurate and/or misleading results. This post is a transcription of my notes as I looked at the issues…. Primary requirement: I want a… read more
Performance
New Year’s Updates
As part of my attempt to become organized in 2019, I found several draft blog entries that had never been completed and made public. This week I updated three of those posts — two really old ones (primarily of interest to computer architecture historians), and one from 2018: July 2012:… read more
Using hardware performance counters to determine how often both logical processors are active on an Intel CPU
Most Intel microprocessors support “HyperThreading” (Intel’s trademark for their implementation of “simultaneous multithreading”) — which allows the hardware to support (typically) two “Logical Processors” for each physical core. Processes running on the two Logical Processors share most of the processor resources (particularly caches and execution units). Some workloads (particularly heterogeneous… read more