Cache Coherence Protocols

August 28, 2023, Filed Under: Cache Coherence Implementations, Cache Coherence Protocols, Computer Architecture

“Memory directories” in Intel processors

One of the (many) minimally documented features of recent Intel processor implementations is the “memory directory”. This is used in multi-socket systems to reduce cache coherence traffic between sockets. I have referred to this in various presentations as: “A Memory Directory is one or more bits per cache line…

February 18, 2019, Filed Under: Cache Coherence Implementations, Cache Coherence Protocols, Computer Architecture

Intel’s future “CLDEMOTE” instruction

I recently saw a reference to a future Intel “Atom” core called “Tremont” and ran across an interesting new instruction, “CLDEMOTE”, that will be supported in “Future Tremont and later” microarchitectures (ref: “Intel® Architecture Instruction Set Extensions and Future Features Programming Reference”, document 319433-035, October 2018). The “CLDEMOTE” instruction is…

January 7, 2019, Filed Under: Cache Coherence Implementations, Cache Coherence Protocols, Computer Architecture, Linux, Performance Counters

SC18 paper: HPL and DGEMM performance variability on Intel Xeon Platinum 8160 processors

Here are the annotated slides from my SC18 presentation on Snoop Filter Conflicts that cause performance variability in HPL and DGEMM on the Xeon Platinum 8160 processor. This slide presentation includes data (not included in the paper) showing that Snoop Filter Conflicts occur in all Intel Scalable Processors (a.k.a., “Skylake…
