The University of Texas at Austin

Computer Hardware

November 22, 2016, Filed Under: Computer Architecture, Computer Hardware

SC16 Invited Talk: Memory Bandwidth and System Balance in HPC Systems

I have been involved in HPC for over 30 years: 12 years as a student and faculty user in ocean modeling, 12 years as a performance analyst and system architect at SGI, IBM, and AMD, and over 7 years as a research scientist at TACC. This history is based on my…

November 22, 2016, Filed Under: Cache Coherence Implementations, Cache Coherence Protocols, Computer Architecture

Some notes on producer/consumer communication in cached processors

In a recent Intel Software Developer Forum discussion (https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/700477), I put together a few notes on the steps required for single-producer, single-consumer communication using separate cache lines for “data” and “flag” values. Although this was not a carefully considered formal analysis, I think it is worth re-posting here as a…
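The excerpt above does not show the steps themselves, so the following is only a minimal sketch of the general pattern it names, not the analysis from the forum thread. It assumes a 64-byte cache line and uses C11 atomics; the names `struct channel`, `produce`, and `consume` are hypothetical and chosen purely for illustration.

```c
#include <assert.h>     /* static_assert */
#include <stdalign.h>   /* alignas */
#include <stdatomic.h>  /* atomic_int, atomic_load_explicit, ... */
#include <stddef.h>     /* offsetof */

#define CACHE_LINE 64   /* assumption: 64-byte cache lines */

/* Hypothetical one-slot channel: "data" and "flag" live on separate lines. */
struct channel {
    alignas(CACHE_LINE) int data;         /* payload written by the producer */
    alignas(CACHE_LINE) atomic_int flag;  /* 0 = slot empty, 1 = slot full   */
};

static_assert(offsetof(struct channel, flag) >= CACHE_LINE,
              "data and flag must not share a cache line");

/* Producer: wait for an empty slot, write the data, then publish the flag. */
void produce(struct channel *ch, int value)
{
    while (atomic_load_explicit(&ch->flag, memory_order_acquire) != 0) {
        /* spin until the consumer has drained the previous value */
    }
    ch->data = value;
    /* release: the data store becomes visible before the flag store */
    atomic_store_explicit(&ch->flag, 1, memory_order_release);
}

/* Consumer: wait for a full slot, read the data, then clear the flag. */
int consume(struct channel *ch)
{
    while (atomic_load_explicit(&ch->flag, memory_order_acquire) != 1) {
        /* spin until the producer has published a value */
    }
    int value = ch->data;
    atomic_store_explicit(&ch->flag, 0, memory_order_release);
    return value;
}
```

In use, one thread would call `produce` and another would call `consume` on the same `struct channel`, with `flag` initialized to 0 before either thread starts. Keeping the two fields on separate lines means the consumer's polling of `flag` does not contend with the producer's write to `data`.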

November 5, 2016, Filed Under: Algorithms, Computer Architecture, Computer Hardware, Performance

Intel discloses “vector+SIMD” instructions for future processors

The art and science of microprocessor architecture is a never-ending struggle to balance complexity, verifiability, usability, expressiveness, compactness, ease of encoding/decoding, energy consumption, backwards compatibility, forwards compatibility, and other factors. In recent years, the trend has been to increase core-level performance by the use of SIMD vector instructions, and…



© The University of Texas at Austin 2025