Recently, while carrying out memory testing in our integration lab, Lead Systems Integrator Rick Warner was able to clearly identify when to choose load-reduced DIMMs (LRDIMMs) and when to choose registered DIMMs (RDIMMs) for servers running large amounts of DDR4 RAM (i.e., 256 gigabytes or more). The critical factors are latency, speed, and capacity, along with your computing objectives for each of them.
Misconceptions About Load-Reduced DIMM Performance
Load-reduced DIMMs were built so that high-speed memory controllers in CPUs could drive larger quantities of memory. Thus, it’s often assumed that LRDIMMs will offer the best performance for memory-dense servers. This impression is strengthened by the fact that Intel’s guide for DDR4 memory population shows LRDIMMs running at a higher frequency than RDIMMs (e.g., 2133MHz vs 1866MHz). However, as we’ll show below, other factors matter more.
RDIMM vs LRDIMM Performance Testing
Using the STREAM memory benchmark, Rick tested 1 DIMM per channel and 2 DIMMs per channel configurations using DDR4 LRDIMMs and RDIMMs on a Supermicro X10DAi motherboard with two Intel Xeon E5-2687W v3 CPUs. Both our WhisperStation and WhisperStation for R are available in this configuration. We also have several Xeon Rackmount Servers which support this configuration.
For each case, the DIMM speed was forced to 2133MHz in the BIOS. Tests were run with both RDIMMs and LRDIMMs in 256GB and 512GB configurations.
LRDIMM Benchmark Results
256GB LRDIMM configuration (1 DIMM per channel)
Function | Best Rate (MB/s) | Avg. Time (s) | Min. Time (s) | Max. Time (s) |
---|---|---|---|---|
Copy | 81,383.5 | 0.004005 | 0.003932 | 0.004151 |
Scale | 95,746.7 | 0.003409 | 0.003342 | 0.003561 |
Add | 109,661.0 | 0.004505 | 0.004377 | 0.004862 |
Triad | 109,315.6 | 0.004490 | 0.004391 | 0.004771 |
512GB LRDIMM configuration (2 DIMMs per channel)
Function | Best Rate (MB/s) | Avg. Time (s) | Min. Time (s) | Max. Time (s) |
---|---|---|---|---|
Copy | 72,499.2 | 0.004461 | 0.004414 | 0.004546 |
Scale | 83,572.7 | 0.003901 | 0.003829 | 0.004036 |
Add | 95,979.5 | 0.005103 | 0.005001 | 0.005220 |
Triad | 96,541.0 | 0.005105 | 0.004972 | 0.005265 |
* For LRDIMMs, the 512GB configuration automatically operates at 2133MHz.
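For readers unfamiliar with STREAM output, the sketch below shows how the Best Rate column relates to the Min. Time column. STREAM reports the best rate as bytes moved divided by the minimum time, counting two 8-byte words per element for Copy and Scale and three for Add and Triad. The array length of 20,000,000 elements is an assumption inferred from the published numbers; the article does not state it. The function and variable names are illustrative only.

```python
# Words moved per array element for each STREAM kernel (8-byte doubles):
# Copy and Scale read one array and write one; Add and Triad read two and write one.
WORDS_PER_ELEMENT = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}
BYTES_PER_WORD = 8

# Assumption: array length inferred from the published rates, not stated in the article.
ARRAY_ELEMENTS = 20_000_000


def best_rate_mb_s(function: str, min_time_s: float, n: int = ARRAY_ELEMENTS) -> float:
    """Return the STREAM-style best rate in MB/s (1 MB = 10**6 bytes)."""
    bytes_moved = WORDS_PER_ELEMENT[function] * BYTES_PER_WORD * n
    return 1e-6 * bytes_moved / min_time_s


# Minimum times (seconds) from the first LRDIMM table (1 DIMM per channel).
min_times = {"Copy": 0.003932, "Scale": 0.003342, "Add": 0.004377, "Triad": 0.004391}
rates = {name: best_rate_mb_s(name, t) for name, t in min_times.items()}

for name, rate in rates.items():
    print(f"{name:<6} {rate:>10,.1f} MB/s")
```

Because the published times are rounded to six decimal places, the recomputed rates match the table only to within a small rounding error.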
LRDIMM Performance Summary
From these tests, we concluded that the latency imposed by the LRDIMMs results in an approximately 12% reduction in STREAM memory bandwidth when doubling the amount of RAM from 256GB (1 DIMM per channel) to 512GB (2 DIMMs per channel).
RDIMM Benchmark Results
Rick then tested RDIMMs using the same system for comparison (with 256GB and 512GB DDR4 memory configurations). Below are the STREAM results.
256GB RDIMM configuration (1 DIMM per channel)
Function | Best Rate (MB/s) | Avg. Time (s) | Min. Time (s) | Max. Time (s) |
---|---|---|---|---|
Copy | 82,707.5 | 0.003939 | 0.003869 | 0.004093 |
Scale | 101,973.7 | 0.003243 | 0.003138 | 0.003471 |
Add | 111,966.3 | 0.004502 | 0.004287 | 0.004978 |
Triad | 110,881.0 | 0.004468 | 0.004329 | 0.004843 |
512GB RDIMM configuration (2 DIMMs per channel)
Function | Best Rate (MB/s) | Avg. Time (s) | Min. Time (s) | Max. Time (s) |
---|---|---|---|---|
Copy | 75,049.1 | 0.004314 | 0.004264 | 0.004405 |
Scale | 93,812.6 | 0.003460 | 0.003411 | 0.003550 |
Add | 103,091.1 | 0.004729 | 0.004656 | 0.004969 |
Triad | 103,493.9 | 0.004704 | 0.004638 | 0.004909 |
* For RDIMMs, the 512GB configuration requires the memory speed to be manually increased to 2133MHz.
RDIMM Performance Summary
Just as we saw with LRDIMMs, RDIMM performance drops when moving from 1 DIMM per channel to 2 DIMMs per channel. However, the penalty is smaller: approximately 7% on Triad, and roughly 7% to 9% across the four STREAM functions (compared to the roughly 12% penalty suffered by LRDIMMs).
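The penalties quoted in the two summaries can be re-derived from the Best Rate columns. The following is a minimal sketch of that arithmetic, using only the published figures; the helper names `penalty_pct` and `penalties` are hypothetical and chosen for illustration.

```python
# Best Rate (MB/s) from the benchmark tables: (1 DIMM per channel, 2 DIMMs per channel).
LRDIMM = {
    "Copy": (81383.5, 72499.2),
    "Scale": (95746.7, 83572.7),
    "Add": (109661.0, 95979.5),
    "Triad": (109315.6, 96541.0),
}
RDIMM = {
    "Copy": (82707.5, 75049.1),
    "Scale": (101973.7, 93812.6),
    "Add": (111966.3, 103091.1),
    "Triad": (110881.0, 103493.9),
}


def penalty_pct(one_dpc: float, two_dpc: float) -> float:
    """Percent of bandwidth lost when going from 1 to 2 DIMMs per channel."""
    return 100.0 * (one_dpc - two_dpc) / one_dpc


def penalties(results: dict) -> dict:
    """Map each STREAM function to its 1-DPC-to-2-DPC bandwidth penalty."""
    return {name: penalty_pct(*pair) for name, pair in results.items()}


lrdimm_penalty = penalties(LRDIMM)
rdimm_penalty = penalties(RDIMM)

for name in LRDIMM:
    print(f"{name:<6} LRDIMM -{lrdimm_penalty[name]:.1f}%   RDIMM -{rdimm_penalty[name]:.1f}%")
```

On these figures the Triad penalty works out to a little under 12% for LRDIMMs and a little under 7% for RDIMMs, and the RDIMM penalty is the smaller of the two for every function.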
Side-by-Side Comparison of RDIMM and LRDIMM Performance
For clarity, here is a side-by-side table of DDR4 memory performance comparing LRDIMMs to RDIMMs. All values are STREAM Best Rate in MB/s. Note that RDIMM memory bandwidth is higher than LRDIMM bandwidth in every case.
Function | LRDIMM, 1 DIMM Per Channel | RDIMM, 1 DIMM Per Channel | LRDIMM, 2 DIMMs Per Channel | RDIMM, 2 DIMMs Per Channel |
---|---|---|---|---|
Copy | 81,383.5 | 82,707.5 | 72,499.2 | 75,049.1 |
Scale | 95,746.7 | 101,973.7 | 83,572.7 | 93,812.6 |
Add | 109,661.0 | 111,966.3 | 95,979.5 | 103,091.1 |
Triad | 109,315.6 | 110,881.0 | 96,541.0 | 103,493.9 |
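To put a number on the gap between the two memory types, the sketch below computes the RDIMM bandwidth advantage over LRDIMM for each function at 1 and 2 DIMMs per channel. It uses only the values in the table above; the name `advantage_pct` is hypothetical.

```python
# Rows of the side-by-side table, Best Rate in MB/s:
# (LRDIMM 1 DPC, RDIMM 1 DPC, LRDIMM 2 DPC, RDIMM 2 DPC), where DPC = DIMMs per channel.
BEST_RATE = {
    "Copy": (81383.5, 82707.5, 72499.2, 75049.1),
    "Scale": (95746.7, 101973.7, 83572.7, 93812.6),
    "Add": (109661.0, 111966.3, 95979.5, 103091.1),
    "Triad": (109315.6, 110881.0, 96541.0, 103493.9),
}


def advantage_pct(rdimm: float, lrdimm: float) -> float:
    """Percent by which the RDIMM rate exceeds the LRDIMM rate."""
    return 100.0 * (rdimm / lrdimm - 1.0)


# For each function: (advantage at 1 DPC, advantage at 2 DPC).
advantage = {
    name: (advantage_pct(r1, l1), advantage_pct(r2, l2))
    for name, (l1, r1, l2, r2) in BEST_RATE.items()
}

for name, (at_1dpc, at_2dpc) in advantage.items():
    print(f"{name:<6} 1 DPC: +{at_1dpc:.1f}%   2 DPC: +{at_2dpc:.1f}%")
```

On the published figures the RDIMM lead is positive everywhere and widens at 2 DIMMs per channel, with Scale showing the largest gap.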
When Registered DIMMs (RDIMMs) are Best
Many of our HPC customers are looking for high speed and low latency. In that realm, RDIMMs are the hands-down winner. With a slightly lower cost and the ability to ramp up memory frequency on certain motherboards, they are the right choice for fast memory performance.
When Load-Reduced DIMMs (LRDIMMs) are Best
When very large quantities of RAM are the goal, LRDIMMs are the way to go. As the table below from Intel’s Grantley Platform Memory Configuration Guide shows, a fully populated system can reach twice the capacity with LRDIMMs. However, 64GB DDR4 LRDIMMs are still quite costly. There are also specific configurations using 3 DIMMs per channel that require LRDIMMs. Contact one of our experts to discuss the best options when you are considering servers with more than 512GB of memory.
SKU | Max DIMMs in Platform | Number of CPU Sockets | RDIMM Config | LRDIMM Config |
---|---|---|---|---|
E5-1600 v3 | 12 DIMMs | 1 | 384GB (12x32GB) @ 1600MHz | 768GB (12x64GB) @ 1600MHz |
E5-2600 v3 | 24 DIMMs | 2 | 768GB (24x32GB) @ 1600MHz | 1.5TB (24x64GB) @ 1600MHz |
E5-4600 v3 | 48 DIMMs | 4 | 1.5TB (48x32GB) @ 1600MHz | 3TB (48x64GB) @ 1600MHz |
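The capacities in the table follow directly from DIMM count times DIMM size. The sketch below reproduces that arithmetic and also derives the DIMMs-per-channel figure for each platform. The value of four memory channels per CPU socket is an assumption (it is not stated in this article), and the helper names are hypothetical.

```python
# Assumption: four memory channels per CPU socket; not stated in the article.
CHANNELS_PER_SOCKET = 4

# From the table: SKU -> (max DIMMs in platform, number of CPU sockets).
PLATFORMS = {
    "E5-1600 v3": (12, 1),
    "E5-2600 v3": (24, 2),
    "E5-4600 v3": (48, 4),
}

RDIMM_SIZE_GB = 32   # largest RDIMM listed in the table
LRDIMM_SIZE_GB = 64  # largest LRDIMM listed in the table


def capacity_gb(dimms: int, dimm_size_gb: int) -> int:
    """Total capacity of a fully populated platform, in GB."""
    return dimms * dimm_size_gb


def format_capacity(gb: int) -> str:
    """Render capacity the way the table does: GB below 1024GB, TB otherwise."""
    return f"{gb / 1024:g}TB" if gb >= 1024 else f"{gb}GB"


def dimms_per_channel(dimms: int, sockets: int) -> int:
    """DIMMs on each memory channel when every slot is populated."""
    return dimms // (sockets * CHANNELS_PER_SOCKET)


for sku, (dimms, sockets) in PLATFORMS.items():
    rdimm = format_capacity(capacity_gb(dimms, RDIMM_SIZE_GB))
    lrdimm = format_capacity(capacity_gb(dimms, LRDIMM_SIZE_GB))
    dpc = dimms_per_channel(dimms, sockets)
    print(f"{sku}: RDIMM {rdimm}, LRDIMM {lrdimm}, {dpc} DIMMs per channel")
```

Under the four-channel assumption, every row of the table corresponds to 3 DIMMs per channel, which is a denser population than the 1 and 2 DIMMs per channel configurations benchmarked above.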
Choosing between LRDIMMs and RDIMMs depends entirely on which performance characteristics meet the needs of your applications. Careful consideration of latency, speed, and capacity as applied to your problem will show you the way to go. Our engineering team can help you work through this important design choice. Contact us for assistance choosing the HPC platform that works best for you.