• Skip to main content
  • Skip to primary sidebar
UT Shield
The University of Texas at Austin

Computer Hardware

February 17, 2025, Filed Under: Computer Architecture, Computer Hardware, Performance

Single-core memory bandwidth: Latency, Bandwidth, and Concurrency

The Multicore Era Over the past ~15 years, server processors from Intel and AMD have evolved from the early quad-core processors to the current monsters with over 50 cores per socket.  The memory subsystems have grown at similar rates, from 3-4 DRAM channels at 1.333 GT/s transfer rates to 8-12… read more 

August 28, 2023, Filed Under: Cache Coherence Implementations, Cache Coherence Protocols, Computer Architecture

“Memory directories” in Intel processors

One of the (many) minimally documented features of recent Intel processor implementations is the “memory directory”.   This is used in multi-socket systems to reduce cache coherence traffic between sockets. I have referred to this in various presentations as: “A Memory Directory is one or more bits per cache line… read more 

April 25, 2023, Filed Under: Computer Architecture, Computer Hardware, Performance

The evolution of single-core bandwidth in multicore processors

The primary metric for memory bandwidth in multicore processors is that maximum sustained performance when using many cores.  For most high-end processors these values have remained in the range of 75% to 85% of the peak DRAM bandwidth of the system over the past 15-20 years — an amazing accomplishment… read more 

  • Page 1
  • Page 2
  • Page 3
  • Interim pages omitted …
  • Page 13
  • Go to Next Page »

Primary Sidebar

Recent Posts

  • Single-core memory bandwidth: Latency, Bandwidth, and Concurrency
  • Dr. Bandwidth is moving on…
  • The evolution of single-core bandwidth in multicore systems — update
  • “Memory directories” in Intel processors
  • The evolution of single-core bandwidth in multicore processors

Tags

accelerated computing arithmetic cache communication configuration coprocessor Distributed cache DRAM Hash functions high performance computing Knights Landing memory bandwidth memory latency microprocessors MMIO MTRR Multicore processors Opteron STREAM benchmark synchronization TLB Virtual Memory Xeon Phi

UT Home | Emergency Information | Site Policies | Web Accessibility | Web Privacy | Adobe Reader

© The University of Texas at Austin 2025