John McCalpin's blog

Dr. Bandwidth explains all….

Archive for the 'Linux' Category

SC18 paper: HPL and DGEMM performance variability on Intel Xeon Platinum 8160 processors

Posted by John D. McCalpin, Ph.D. on 7th January 2019

Here are the annotated slides from my SC18 presentation on Snoop Filter Conflicts that cause performance variability in HPL and DGEMM on the Xeon Platinum 8160 processor.

This slide presentation includes data (not included in the paper) showing that Snoop Filter Conflicts occur in all Intel Xeon Scalable Processors (a.k.a., “Skylake Xeon”) with 18 or more cores, and also occur on the Xeon Phi x200 processors (“Knights Landing”).

The published paper is available (with ACM subscription) at https://dl.acm.org/citation.cfm?id=3291680


This is less boring than it sounds!


A more exciting version of the title.


This story is very abridged — please read the paper!



Execution times only — no performance counters yet.

500 nodes were tested, but only 392 nodes had the 7 runs needed for a robust computation of the median performance.

Dozens of different nodes showed slowdowns of greater than 5%.


I measured memory bandwidth first simply because I had the tools to do this easily.
Read memory controller performance counters before and after each execution and compute DRAM traffic.
Write traffic was almost constant — only the read traffic showed significant variability.
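For readers who want to reproduce this kind of measurement without the full infrastructure linked below, here is a minimal sketch (not the actual tool). It assumes the usual Intel server IMC event encoding (CAS_COUNT.RD = event 0x04, umask 0x03, with one 64-byte line transferred per CAS), a Linux kernel that exposes the uncore_imc_0 PMU in sysfs, and a hypothetical workload binary "./dgemm_test"; it typically must run as root, and a complete measurement would sum all IMC channels on both sockets.

    /* Sketch: DRAM read traffic around a workload, one IMC channel only. */
    #define _GNU_SOURCE
    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned type;
        FILE *f = fopen("/sys/bus/event_source/devices/uncore_imc_0/type", "r");
        if (!f || fscanf(f, "%u", &type) != 1) { perror("uncore_imc_0"); return 1; }
        fclose(f);

        struct perf_event_attr attr = { 0 };
        attr.type = type;                  /* dynamic PMU type from sysfs */
        attr.size = sizeof(attr);
        attr.config = 0x0304;              /* (umask 0x03 << 8) | event 0x04 = CAS_COUNT.RD */

        /* Uncore events are socket-wide: pid = -1, attach to one CPU on the socket. */
        int fd = (int)syscall(SYS_perf_event_open, &attr, -1, 0, -1, 0);
        if (fd < 0) { perror("perf_event_open"); return 1; }

        uint64_t before, after;
        read(fd, &before, sizeof(before));
        system("./dgemm_test");            /* hypothetical workload */
        read(fd, &after, sizeof(after));

        printf("DRAM reads on this channel: %.1f MiB\n",
               (after - before) * 64.0 / (1024.0 * 1024.0));
        close(fd);
        return 0;
    }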


It is important to decouple the sockets for at least two reasons: (1) each socket manages its frequency independently to remain within its Running Average Power Limit, and (2) cache coherence is managed differently within and between sockets.
The performance counter infrastructure is at https://github.com/jdmccalpin/periodic-performance-counters
Over 25,000 DGEMM runs in total, generating over 240 GiB of performance counter output.


I had already seen that slow runs were associated with higher DRAM traffic, but I needed to find out which level(s) of the cache were experiencing extra load misses.
The strongest correlation between execution time and cache miss counts was with L2 misses (measured here as L2 cache fills).

The variation of L2 fills for the full-speed runs is surprisingly large, but the slow runs all have L2 fill counts that are at least 1.5x the minimum value.
Some runs tolerate increased L2 fill counts up to 2x the minimum value, but all cases with >2x L2 fills are slow.

This chart looks at the sum of L2 fills for all the cores on the chip — next I will look at whether these misses are uniform across the cores.


I picked 15-20 cases in which a “good” trial (at or above median performance) was followed immediately by a “slow” trial (at least 20% below median performance).
This shows the L2 Fills by core for a “good” trial — the red dashed line corresponds to the minimum L2 fill count from the previous chart divided by 24 to get the minimum per-core value.
Different sets of cores and different numbers of cores had high counts in each run — even on the same node.


This adds the “slow” execution that immediately followed the “good” execution.
For the slow runs, most of the cores had highly elevated L2 counts.  Again, different sets of cores and different numbers of cores had high counts in each run.

This data provides a critical clue:  Since the L2 caches are private and 2MiB pages fully determine the L2 cache index bits (the 1 MiB, 16-way L2 has index bits that all fall within the 2MiB page offset), the extra L2 cache misses must be caused by something *external* to the cores.


The Snoop Filter is essentially the same as the directory tags of the inclusive L3 cache of previous Intel Xeon processors, but without room to store the data for the cache lines tracked.
The key concept is “inclusivity” — lines tracked by a Snoop Filter entry must be invalidated before that Snoop Filter entry can be freed to track a different cache line address.


I initially found some poorly documented core counters that looked like they were related to Snoop Filter evictions, then later found counters in the “uncore” that count Snoop Filter evictions directly.
This allowed direct confirmation of my hypothesis, as summarized in the next slides.


About 1% of the runs are more than 10% slower than the fastest run.


Snoop Filter Evictions clearly account for the majority of the excess L2 fills.

But there is one more feature of the “slow” runs….


For all of the “slow” runs, the DRAM traffic is increased.  This means that a fraction of the data evicted from the L2 caches by the Snoop Filter evictions was also evicted from the L3 cache, and so must be retrieved from DRAM.

At high Snoop Filter conflict rates (>4e10 Snoop Filter evictions), all of the cases have elevated DRAM traffic, with something like 10%-15% of the Snoop Filter evictions missing in the L3 cache.

There are some cases in the range of 100-110 seconds that have elevated snoop filter evictions but not elevated DRAM reads, and these show minimal slowdowns.

This suggests that DGEMM can tolerate the extra latency of L2 miss/L3 hit for its data, but not the extra latency of L2 miss/L3 miss/DRAM hit.


Based on my experience in processor design groups at SGI, IBM, and AMD, I wondered if using contiguous physical addresses might avoid these snoop filter conflicts….


Baseline with 2MiB pages.


With 1GiB pages, the tail almost completely disappears in both width and depth.


Zooming in on the slowest 10% of the runs shows no evidence of systematic slowdowns when using 1GiB pages.
The performance counter data confirms that the snoop filter eviction rate is very small.

So we have a fix for single-socket DGEMM, what about HPL?


Intel provided a test version of their optimized HPL benchmark in December 2017 that supported 1GiB pages.

First, I verified that the performance variability for single-node (2-socket) HPL runs was eliminated by using 1GiB pages.

The variation across nodes is strong (due to different thermal characteristics), but the variation across runs on each node is extremely small.

The 8.6% range of average performance for this set of 31 nodes increases to >12% when considering the full 1736-node SKX partition of the Stampede2 system.

So we have a fix for single-node HPL, what about the full system?


Intel provided a full version of their optimized HPL benchmark in March 2018 and we ran it on the full system in April 2018.

The estimated breakdown of performance improvement into individual contributions is a ballpark estimate — it would be a huge project to measure the details at this scale.

The “practical peak performance” of this system is 8.77 PFLOPS on the KNLs plus 3.73 PFLOPS on the SKX nodes, for 12.5 PFLOPS “practical peak”.  The 10.68 PFLOPS obtained is about 85% of this peak performance.


During the review of the paper, I was able to simplify the test further to allow quick testing on other systems (and larger ensembles).

This is mostly new material (not in the paper).


https://github.com/jdmccalpin/SKX-SF-Conflicts

This lets me more directly address my hypothesis about conflicts with contiguous physical addresses, since each 1GiB page is much larger than the 24 MiB of aggregate L2 cache.


It turns out I was wrong — Snoop Filter Conflicts can occur with contiguous physical addresses on this processor.

The pattern repeats every 256 MiB.

If the re-used space is in the 1st 32 MiB of any 1GiB page, there will be no Snoop Filter Conflicts.
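As an illustration of how one might exploit this observation (this is not the actual test harness from the repository above), the sketch below requests a single 1GiB page under Linux and confines the re-used buffer to the first 32MiB of that page. It assumes 1GiB pages have been reserved at boot (e.g., "hugepagesz=1G hugepages=4" on the kernel command line); the fallback definition of MAP_HUGE_1GB is for older headers.

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    #ifndef MAP_HUGE_1GB
    #define MAP_HUGE_1GB (30 << 26)        /* 30 = log2(1 GiB), 26 = MAP_HUGE_SHIFT */
    #endif

    int main(void)
    {
        size_t sz = 1UL << 30;             /* one 1 GiB page */
        char *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                       -1, 0);
        if (p == MAP_FAILED) { perror("mmap(1GiB page)"); return 1; }

        /* Keep the heavily re-used working set in the first 32 MiB of the page,
         * which the measurements above found to be free of Snoop Filter
         * Conflicts on the 24-core SKX parts. */
        size_t reuse_bytes = 32UL << 20;
        for (size_t i = 0; i < reuse_bytes; i += 64)
            p[i] = 0;                      /* touch each line of the safe region */

        munmap(p, sz);
        return 0;
    }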

What about other processors?


I tested Skylake Xeon processors with 14, 16, 18, 20, 22, 24, 26, 28 cores, and a 68-core KNL (Xeon Phi 7250).

These four processors are the only ones that show Snoop Filter Conflicts with contiguous physical addresses.

But with random 2MiB pages, all processors with more than 16 cores show Snoop Filter conflicts for some combinations of addresses….


These are average L2 miss rates — individual cores can have significantly higher miss rates (and the maximum miss rate may be the controlling factor in performance for multi-threaded codes).

The details are interesting, but no time in the current presentation….



Overall, the uncertainty associated with this performance variability is probably more important than the performance loss.

Using performance counter measurements to look for codes that are subject to this issue is a serious “needle in haystack” problem — it is probably easier to choose codes that might have the properties above and test them explicitly.


Cache-contained shallow water model, cache-contained FFTs.


The new DGEMM implementation uses dynamic scheduling of the block updates to decouple the memory access patterns.  There is no guarantee that this will alleviate the Snoop Filter Conflict problem, but in this case it does.


I now have a model that predicts all Snoop Filter Conflicts involving 2MiB pages on the 24-core SKX processors.
Unfortunately, the zonesort approach won’t work under Linux because pages are allocated on a “first-come, first-served” basis, so the fine control required is not possible.

An OS with support for page coloring (such as BSD) could be modified to provide this mitigation.


Again, the inability of Linux to use the virtual address as a criterion for selecting the physical page to use will prevent any sorting-based approach from working.


Intel has prepared a response.  If you are interested, you should ask your Intel representative for a copy.





 

Posted in Cache Coherence Implementations, Cache Coherence Protocols, Computer Architecture, Linux, Performance Counters

Coherence with Cached Memory-Mapped IO

Posted by John D. McCalpin, Ph.D. on 30th May 2013

In response to my previous blog entry, a question was asked about how to manage coherence for cached memory-mapped IO regions.   Here are some more details…

Maintaining Coherence with Cached Memory-Mapped IO

For the “read-only” range, cached copies of MMIO lines will never be invalidated by external traffic, so repeated reads of the data will always return the cached copy.   Since there are no external mechanisms to invalidate the cache line, we need a mechanism that the processor can use to invalidate the line, so the next load to that line will go to the IO device and get fresh data.

There are a number of ways that a processor should be able to invalidate a cached MMIO line.  Not all of these will work on all implementations!

  1. Cached copies of MMIO addresses can, of course, be dropped when they become LRU and are chosen as the victim to be replaced by a new line brought into the cache.
    A code could read enough conflicting cacheable addresses to ensure that the cached MMIO line would be evicted.
    The number of conflicting reads needed is the cache associativity (typically 8 for a 32 KiB data cache), but you need to be careful that the reads have not been rearranged to put the cached MMIO read in the middle of the “flushing” reads.   There are also some systems for which the pseudo-LRU algorithm has “features” that can break this approach.  (HyperThreading and shared caches can both add complexity in this dimension.)
  2. The CLFLUSH instruction operating on the virtual address of the cached MMIO line should evict it from the L1 and L2 caches.
    Whether it will evict the line from the L3 depends on the implementation, and I don’t have enough information to speculate on whether this will work on Xeon processors.   For AMD Family 10h processors, due to the limitations of the CLFLUSH implementation, cached MMIO lines are only allowed in the L1 cache.
  3. For memory mapped by the MTRRs as WP (“Write Protect”), a store to the address of the cached MMIO line should invalidate that line from the L1 & L2 data caches.  This will generate an *uncached* store, which typically stalls the processor for quite a while, so it is not a preferred solution.
  4. The WBINVD instruction (kernel mode only) will invalidate the *entire* processor data cache structure and, according to the Intel Architecture Software Developer’s Manual, Volume 2 (document 325338-044), will also cause all external caches to be flushed.  Additional details are discussed in the SW Developer’s Manual, Volume 3.    Additional caution needs to be taken if running with HyperThreading enabled, as mentioned in the discussion of the CPUID instruction in the SW Developer’s Manual, Vol 2.
  5. The INVD instruction (kernel mode only) will invalidate all the processor caches, but it does this non-coherently (i.e., dirty cache lines are not written back to memory, so any modified data gets lost).   This is very likely to crash your system, and is only mentioned here for completeness.
  6. AMD processors support some extensions to the MTRR mechanism that allow read and write operations to the same physical address to be sent to different places (i.e., one to system memory and the other to MMIO).  This is *almost* useful for supporting cached MMIO, but (at least on the Family 10h processors), the specific mode that I wanted to set up (see addendum below) is disallowed for ugly microarchitectural reasons that I can’t discuss.

There are likely to be more complexities that I am not remembering right now, but the preferred answer is to bind the process doing the cached MMIO to a single core (and single thread context if using HyperThreading) and use CLFLUSH on the address you want to invalidate.   There are no guarantees, but this seems like the approach most likely to work.
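To make this recipe concrete, here is a minimal sketch, assuming an existing cached MMIO mapping "mmio" of "len" bytes (set up elsewhere) and 64-byte cache lines. It is illustrative only; as noted above, there are no guarantees on any particular implementation.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stddef.h>
    #include <x86intrin.h>

    /* Pin the calling thread to one core so the CLFLUSHes and the subsequent
     * loads are issued against the same L1/L2 caches. */
    static void pin_to_core(int core)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        sched_setaffinity(0, sizeof(set), &set);   /* 0 = calling thread */
    }

    /* Evict any cached copies of the MMIO block so the next loads re-read
     * the device. */
    static void invalidate_cached_mmio(volatile char *mmio, size_t len)
    {
        _mm_mfence();                              /* order prior accesses */
        for (size_t off = 0; off < len; off += 64)
            _mm_clflush((const void *)(mmio + off));
        _mm_mfence();                              /* flushes complete before reloading */
    }

In use, pin_to_core() would be called once when the process starts, and invalidate_cached_mmio() would be called each time fresh data is needed from the device.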

 

Addendum: The AMD almost-solution using MTRR extensions.

The AMD64 architecture provides extensions to the MTRR mechanism called IORRs that allow the system programmer to independently specify whether reads to a certain region go to system memory or MMIO and whether writes to that region go to system memory or MMIO.   This is discussed in the “AMD64 Architecture Programmer’s Manual, Volume 2: System Programming” (publication number 24593).
I am using version 3.22 from September 2012, where this is described in section 7.9.

The idea was to use this to modify the behavior of the “read-only” MMIO mapping so that reads would go to MMIO while writes would go to system memory.  At first glance this seems strange — I would be creating a “write-only” region of system memory that could never be read (because reads to that address range would go to MMIO).

So why would this help?

It would help because sending the writes to system memory would cause the cache coherence mechanisms to be activated.   A streaming store (for example) to this region would be sent to the memory controller for that physical address range.  The memory controller treats streaming stores in the same way as DMA stores from IO devices to system memory, and it sends out invalidate messages to all caches in the system.  This would invalidate the cached MMIO line in all caches, which would eliminate both the need to pin the thread to a specific core and the problem of the CLFLUSH not reaching the L3 cache.

At least in the AMD Family 10h processors, this IORR function works, but due to some implementation issues in this particular use case it forces the region to the MTRR UC (uncached) type, which defeats my purpose in the exercise.   I think that the implementation issues could be either fixed or worked around, but since this is a fix to a mode that is not entirely supported, it is easy to understand that this never showed up as a high priority to “fix”.

Posted in Accelerated Computing, Computer Hardware, Linux

Notes on Cached Access to Memory-Mapped IO Regions

Posted by John D. McCalpin, Ph.D. on 29th May 2013

When attempting to build heterogeneous computers with “accelerators” or “coprocessors” on PCIe interfaces, one quickly runs into asymmetries between the data transfer capabilities of processors and IO devices.  These asymmetries are often surprising — the tremendously complex processor is actually less capable of generating precisely controlled high-performance IO transactions than the simpler IO device.   This leads to ugly, high-latency implementations in which the processor has to program the IO unit to perform the required DMA transfers and then interrupt the processor when the transfers are complete.

For tightly-coupled acceleration, it would be nice to have the option of having the processor directly read and write to memory locations on the IO device.  The fundamental capability exists in all modern processors through the feature called “Memory-Mapped IO” (MMIO), but for historical reasons this provides the desired functionality without the desired performance.   As discussed below, it is generally possible to set up an MMIO mapping that allows high-performance writes to IO space, but setting up mappings that allow high-performance reads from IO space is much more problematic.

Processors only support high-performance reads when executing loads to cached address ranges.   Such reads transfer data in cache-line-sized blocks (64 Bytes on x86 architectures) and can support multiple concurrent read transactions for high throughput.  When executing loads to uncached address ranges (such as MMIO ranges), each read fetches only the specific bits requested (1, 2, 4, or 8 Bytes), and all reads to uncached address ranges are completely serialized with respect to each other and with respect to any other memory references.   So even if the latency to the IO device were the same as the latency to memory, using cache-line accesses could easily be (for example) 64 times as fast as using uncached accesses — 8 concurrent transfers of 64 Bytes using cache-line accesses versus one serialized transfer of 8 Bytes.

But is it possible to get modern processors to use their cache-line access mechanisms to read data from MMIO addresses?   The answer is a resounding, “yes, but….“.    The notes below provide an introduction to some of the issues….

It is possible to map IO devices to cacheable memory on at least some processors, but the accesses have to be very carefully controlled to keep within the capabilities of the hardware — some of the transactions to cacheable memory can map to IO transactions and some cannot.
I don’t know the details for Intel processors, but I did go through all the combinations in great detail as the technology lead of the “Torrenza” project at AMD.

Speaking generically, some examples of things that should and should not work (though the details will depend on the implementation):

  • Load miss — generates a cache line read — converted to a 64 Byte IO read — works OK.
    BUT, there is no way for the IO device to invalidate that line in the processor(s) cache(s), so coherence must be maintained manually using the CLFLUSH instruction. NOTE also that the CLFLUSH instruction may or may not work as expected when applied to addresses that are mapped to MMIO, since the coherence engines are typically associated with the memory controllers, not the IO controllers. At the very least you will need to pin threads doing cached MMIO to a single core to maximize the chances that the CLFLUSH instructions will actually clear the (potentially stale) copies of the cache lines mapped to the MMIO range.
  • Streaming Store (aka Write-Combining store, aka Non-temporal store) — generates one or more uncached stores — works OK.
    This is the only mode that is “officially” supported for MMIO ranges by x86 and x86-64 processors. It was added in the olden days to allow a processor core to execute high-speed stores into a graphics frame buffer (i.e., before there was a separate graphics processor). These stores do not use the caches, but do allow you to write to the MMIO range using full cache line writes and (typically) allows multiple concurrent stores in flight.
    The Linux “ioremap_wc” kernel function maps a region so that all stores are translated to streaming stores, but since the hardware supports streaming stores to any memory type, it is typically also possible to explicitly generate streaming stores (MOVNTA instructions) to MMIO regions that are mapped as cached.
  • Store Miss (aka “Read For Ownership”/RFO) — generates a request for exclusive access to a cache line — probably won’t work.
    The reason that it probably won’t work is that RFO requires that the line be invalidated in all the other caches, with the requesting core not allowed to use the data until it receives acknowledgements from all the other cores that the line has been invalidated — but an IO controller is not a coherence controller, so it (typically) cannot generate the required probe/snoop transactions.
    It is possible to imagine implementations that would convert this transaction to an ordinary 64 Byte IO read, but then some component of the system would have to “remember” that this translation took place and would have to lie to the core and tell it that all the other cores had responded with invalidate acknowledgements, so that the core could place the line in “M” state and have permission to write to it.
  • Victim Writeback — writes back a dirty line from cache to memory — probably won’t work.
    Assuming that you could get past the problems with the “store miss” and get the line in “M” state in the cache, eventually the cache will need to evict the dirty line. Although this superficially resembles a 64 Byte store, from the coherence perspective it is quite a different transaction. A Victim Writeback actually has no coherence implications — all of the coherence was handled by the RFO up front, and the Victim Writeback is just the delayed completion of that operation. Again, it is possible to imagine an implementation that simply mapped the Victim Writeback to a 64 Byte IO store, but when you get into the details there are features that just don’t fit. I don’t know of any processor implementation for which a mapping of Victim Writeback operations to MMIO space is supported.

There is one set of mappings that can be made to work on at least some x86-64 processors, and it is based on mapping the MMIO space *twice*, with one mapping used only for reads and the other mapping used only for writes (a sketch of how the two mappings are used together follows this list):

  • Map the MMIO range with a set of attributes that allow write-combining stores (but only uncached reads). This mode is supported by x86-64 processors and is provided by the Linux “ioremap_wc()” kernel function, which generates an MTRR (“Memory Type Range Register”) of “WC” (write-combining).  In this case all stores are converted to write-combining stores, but the use of explicit write-combining store instructions (MOVNTA and its relatives) makes the usage more clear.
  • Map the MMIO range a second time with a set of attributes that allow cache-line reads (but only uncached, non-write-combined stores).
    For x86 & x86-64 processors, the MTRR type(s) that allow this are “Write-Through” (WT) and “Write-Protect” (WP).
    These might be mapped to the same behavior internally, but the nominal difference is that in WT mode stores *update* the corresponding line if it happens to be in the cache, while in WP mode stores *invalidate* the corresponding line if it happens to be in the cache. In our current application it does not matter, since we will not be executing any stores to this region. On the other hand, we will need to execute CLFLUSH operations to this region, since that is the only way to ensure that (potentially) stale cache lines are removed from the cache and that the subsequent read operation to a line actually goes to the MMIO-mapped device and reads fresh data.
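Here is a user-space sketch of how the two aliased mappings would be used together, assuming "wr_base" (the WC write mapping) and "rd_base" (the WT/WP cached read mapping) have already been set up with the attributes described above; the names, the offset handling, and the 64-byte line size are assumptions for illustration.

    #include <stddef.h>
    #include <stdint.h>
    #include <x86intrin.h>

    void update_and_reread(volatile char *wr_base, volatile char *rd_base,
                           size_t off, long long value)
    {
        /* 1. Write through the WC alias with a streaming (non-temporal) store. */
        _mm_stream_si64((long long *)(wr_base + off), value);
        _mm_sfence();                      /* flush the write-combining buffer */

        /* 2. Invalidate the (potentially stale) cached copy of that line. */
        _mm_clflush((const void *)(rd_base + (off & ~(size_t)63)));
        _mm_mfence();

        /* 3. This load misses in the cache and fetches a fresh 64-byte line. */
        long long fresh = *(volatile long long *)(rd_base + off);
        (void)fresh;
    }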

On the particular device that I am fiddling with now, the *device* exports two address ranges using the PCIe BAR functionality. These both map to the same memory locations on the device, but each BAR is mapped to a different *physical* address by the Linux kernel. The different *physical* addresses allow the MTRRs to be set differently (WC for the write range and WT/WP for the read range). These are also mapped to different *virtual* addresses so that the PATs can be set up with values that are consistent with the MTRRs.

Because the IO device has no way to generate transactions to invalidate copies of MMIO-mapped addresses in processor caches, it is the responsibility of the software to ensure that cache lines in the “read” region are invalidated (using the CLFLUSH instruction on x86) if the data is updated either by the IO device or by writes to the corresponding (aliased) address in the “write” region.   This software based coherence functionality can be implemented at many different levels of complexity, for example:

  • For some applications the data access patterns are based on clear “phases”, so in a “phase” you can leave the data in the cache and simply invalidate the entire block of cached MMIO addresses at the end of the “phase”.
  • If you expect only a small fraction of the MMIO addresses to actually be updated during a phase, this approach is overly conservative and will lead to excessive read traffic.  In such a case, a simple “directory-based coherence” mechanism can be used.  The IO device can keep a bit map of the cache-line-sized addresses that are modified during a “phase”.  The processor can read this bit map (presumably packed into a single cache line by the IO device) and only invalidate the specific cache lines that the directory indicates have been updated.   Lines that have not been updated are still valid, so copies that stay in the processor cache will be safe to use.
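A minimal sketch of that directory-based scheme, assuming the device packs the modified-line bitmap into a single 64-byte line at offset "dir_off" of the read mapping, tracking the 512 data lines (32 KiB) starting at "data_off"; all names and offsets are illustrative.

    #include <stddef.h>
    #include <stdint.h>
    #include <x86intrin.h>

    void invalidate_modified_lines(volatile char *rd_base,
                                   size_t dir_off, size_t data_off)
    {
        /* Re-read the directory line itself: flush it, then load fresh bits. */
        _mm_clflush((const void *)(rd_base + dir_off));
        _mm_mfence();
        const volatile uint64_t *dir =
            (const volatile uint64_t *)(rd_base + dir_off);

        for (int w = 0; w < 8; w++) {              /* 8 x 64 bits = 512 lines */
            uint64_t bits = dir[w];
            while (bits) {
                int b = __builtin_ctzll(bits);     /* lowest set bit = modified line */
                bits &= bits - 1;
                _mm_clflush((const void *)(rd_base + data_off +
                                           ((size_t)w * 64 + b) * 64));
            }
        }
        _mm_mfence();      /* all invalidations complete before the data is re-read */
    }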

Giving the processor the capability of reading from an IO device at low latency and high throughput allows a designer to think about interacting with the device in new ways, and should open up new possibilities for fine-grained off-loading in heterogeneous systems….

 

Posted in Accelerated Computing, Computer Hardware, Linux