One of the (many) minimally documented features of recent Intel processor implementations is the “memory directory”. This is used in multi-socket systems to reduce cache coherence traffic between sockets. I have referred to this in various presentations as: “A Memory Directory is one or more bits per cache line… read more
Multicore processors
Mapping addresses to L3/CHA slices in Intel processors
Starting with the Xeon E5 processors “Sandy Bridge EP” in 2012, all of Intel’s mainstream multicore server processors have included a distributed L3 cache with distributed coherence processing. The L3 cache is divided into “slices”, which are distributed around the chip — typically one “slice” for each processor core. Each… read more