After much too long a delay, version 5.10 of the STREAM benchmark has been released (at least in the C language version).
Although version 5.10 of the benchmark still measures exactly the same thing as previous versions, a number of long-awaited features have finally been integrated.
- Updated Validation Code
- Array indexing now allows arrays with more than 2 billion elements
- Data type used can now be overridden from the default “double” to “float” with a single compile flag
- Many small output formatting changes to account for computers getting bigger and faster
The validation code update is the biggest change in version 5.10 of stream.c.
With previous versions, the validation code was subject to accumulated round-off error that could cause it to report a validation failure with large array sizes, even if nothing was actually wrong. The revised code eliminates this problem and has been tested with array sizes of up to 10 billion elements with no problems.
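To make the idea concrete, here is a minimal sketch of one way to validate results without letting round-off grow with the array size. It is not the code from stream.c: the function names, the parameters, and the tolerance are all assumptions made for illustration. The sketch replays the Copy, Scale, Add, and Triad kernels on three scalars to get the value every element should hold, then judges each array by its average absolute error, so the quantity being summed stays near zero instead of growing with the number of elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: replay the four kernels on single scalars so that
 * *aj, *bj, *cj end up holding the value every element of the a, b, and c
 * arrays should contain after ntimes iterations. */
void expected_values(int ntimes, double scalar,
                     double *aj, double *bj, double *cj)
{
    for (int k = 0; k < ntimes; k++) {
        *cj = *aj;                   /* Copy  */
        *bj = scalar * *cj;          /* Scale */
        *cj = *aj + *bj;             /* Add   */
        *aj = *bj + scalar * *cj;    /* Triad */
    }
}

/* Hypothetical helper: returns 0 if the array passes, 1 if it fails.
 * The check sums per-element absolute errors (which are close to zero when
 * the run is correct) rather than summing the elements themselves. */
int check_array(const double *a, size_t n, double expected, double epsilon)
{
    assert(n > 0 && expected != 0.0);

    double err_sum = 0.0;
    for (size_t j = 0; j < n; j++) {
        double d = a[j] - expected;
        err_sum += (d < 0.0) ? -d : d;
    }

    /* Average absolute error, relative to the expected value. */
    double rel = (err_sum / (double)n) / expected;
    if (rel < 0.0)
        rel = -rel;
    return rel > epsilon;
}
```

A caller would run expected_values once with the same initial values and scalar used to fill the arrays, then call check_array on each array with a tolerance suited to the data type (a much looser one for “float” than for “double”).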
Previous versions of STREAM were limited to 32-bit array indices. Version 5.10 defines the array indices using a type that maps to a 64-bit integer on 64-bit machines, thus allowing arrays with more than 2 billion elements. Most compilers also require an additional command-line flag like “-mcmodel=medium” to allow full 64-bit addressing. The changes to STREAM in version 5.10 are required in addition to that flag.
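The sketch below shows roughly what these two changes look like in C. It is illustrative only and not taken from stream.c: the macro name ELEM_TYPE, the function names, and the choice of ptrdiff_t as the index type are all assumptions. The element type defaults to double and can be overridden at compile time, and the loop index is a pointer-width signed integer, which is 64 bits wide on typical 64-bit platforms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: element type, overridable from the command line,
 * for example with -DELEM_TYPE=float. */
#ifndef ELEM_TYPE
#define ELEM_TYPE double
#endif

/* Returns 1 if an array of n elements can be indexed with a signed
 * 32-bit integer, 0 otherwise. */
int fits_in_int32(uint64_t n)
{
    return n <= (uint64_t)INT32_MAX;
}

/* Bytes needed to hold one array of n elements. */
uint64_t array_bytes(uint64_t n)
{
    return n * (uint64_t)sizeof(ELEM_TYPE);
}

/* Scale kernel written with a pointer-width signed index, so the loop
 * keeps working when n exceeds what a 32-bit int can represent. */
void scale(ELEM_TYPE *b, const ELEM_TYPE *c, ptrdiff_t n, ELEM_TYPE scalar)
{
    for (ptrdiff_t j = 0; j < n; j++)
        b[j] = scalar * c[j];
}
```

With GCC on x86-64, a build along the lines of `gcc -O2 -mcmodel=medium -DELEM_TYPE=float example.c` would combine the type override with the larger memory model. The file name and macro are placeholders, and other compilers may spell the memory-model flag differently.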
Dr. Bandwidth also found eight older submissions (from 2009 through early 2012) that somehow got lost in my mailbox and were never posted to the site. These are listed on the STREAM benchmark What’s New page.
Along with these older submissions, four new submissions have just been added to the site, ranging from a Raspberry Pi delivering about 200 MB/s to a Xeon Phi SE10P coprocessor delivering over 160,000 MB/s. That’s an 800-to-1 ratio of sustained memory bandwidth measured for single-chip systems!
Users of systems at the Texas Advanced Computing Center will be interested in seeing new results posted for three different components of the Stampede system:
- Stampede Compute Nodes: Dell_DCS8000 servers with two Xeon E5-2680 (8-core, 2.7 GHz) processors
- Stampede Coprocessors: Intel_XeonPhi_SE10P Coprocessor (61-core, 1.1 GHz)
- Stampede Large Memory Nodes: Dell_PowerEdge_820 servers with four Xeon E5-4650 (8-core, 2.7 GHz) processors
MICHAEL R HINES says
your website is down…. can’t download the benchmark
John D. McCalpin, Ph.D. says
Looks like it was a temporary glitch — the site is available today.