Photo of dual Xeon E5-2600v3 motherboard with DDR4 ECC RDIMM memory

DDR4 RDIMM and LRDIMM Performance Comparison

Recently, while carrying out memory testing in our integration lab, Lead Systems Integrator Rick Warner was able to clearly identify when it is appropriate to choose load-reduced DIMMs (LRDIMMs) and when it is appropriate to choose registered DIMMs (RDIMMs) for servers running large amounts of DDR4 RAM (i.e., 256 gigabytes and greater). The critical factors to consider are latency, speed, and capacity, along with your computing objectives with respect to each.

Misconceptions about Load-Reduced DIMM Performance

Load-reduced DIMMs were built so that high-speed memory controllers in CPUs could drive larger quantities of memory. Thus, it’s often assumed that LRDIMMs will offer the best performance for memory-dense servers. This impression is strengthened by the fact that Intel’s guide for DDR4 memory population shows LRDIMMs running at a higher frequency than RDIMMs (e.g., 2133MHz vs 1866MHz). However, as we’ll show below, there are greater factors at play.

RDIMM vs LRDIMM Performance Testing

Using the STREAM memory benchmark, Rick took a look at 1 DIMM per channel and 2 DIMMs per channel configurations using DDR4 LRDIMMs and RDIMMs on a Supermicro X10DAi motherboard with two Intel Xeon E5-2687W v3 CPUs. Both our WhisperStation and WhisperStation for R are available in this configuration. We also have several Xeon Rackmount Servers which support this configuration.

For each case, the DIMM speed was forced to 2133MHz in the BIOS. Tests were run with both RDIMMs and LRDIMMs in 256GB and 512GB configurations.

LRDIMM Benchmark Results

One LRDIMM Per Channel — 256GB RAM @ 2133MHz
Function   Best Rate (MB/s)   Avg. Time (s)   Min. Time (s)   Max. Time (s)
Copy       81,383.5           0.004005        0.003932        0.004151
Scale      95,746.7           0.003409        0.003342        0.003561
Add        109,661.0          0.004505        0.004377        0.004862
Triad      109,315.6          0.004490        0.004391        0.004771

 

Two LRDIMMs Per Channel — 512GB RAM @ 2133MHz*
Function   Best Rate (MB/s)   Avg. Time (s)   Min. Time (s)   Max. Time (s)
Copy       72,499.2           0.004461        0.004414        0.004546
Scale      83,572.7           0.003901        0.003829        0.004036
Add        95,979.5           0.005103        0.005001        0.005220
Triad      96,541.0           0.005105        0.004972        0.005265

* For LRDIMMs, the 512GB configuration automatically operates at 2133MHz.
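As a sanity check on the LRDIMM results above, the Best Rate column can be reproduced from the Min. Time column. STREAM reports its best rate as the bytes moved by a kernel divided by the minimum time, with 1 MB counted as 10^6 bytes. The sketch below is illustrative only: the helper names are ours, and the array length of 20 million elements is an assumption inferred from the published numbers, not a setting recorded in this post.

```python
# Arrays touched per element by each STREAM kernel (8-byte doubles):
# Copy and Scale read one array and write one; Add and Triad read two and write one.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}
BYTES_PER_DOUBLE = 8

# Assumption: array length inferred from the published figures, not stated in the post.
ASSUMED_ARRAY_ELEMENTS = 20_000_000


def best_rate_mb_s(function, min_time_s, n=ASSUMED_ARRAY_ELEMENTS):
    """Return bandwidth in MB/s (1 MB = 10**6 bytes) for one STREAM kernel."""
    bytes_moved = ARRAYS_TOUCHED[function] * BYTES_PER_DOUBLE * n
    return bytes_moved / min_time_s / 1e6


# (reported best rate in MB/s, reported min. time in s), one LRDIMM per channel
LRDIMM_1DPC = {
    "Copy": (81383.5, 0.003932),
    "Scale": (95746.7, 0.003342),
    "Add": (109661.0, 0.004377),
    "Triad": (109315.6, 0.004391),
}


def relative_error(function, table=LRDIMM_1DPC):
    """Relative gap between the recomputed and the reported best rate."""
    reported_rate, min_time_s = table[function]
    recomputed = best_rate_mb_s(function, min_time_s)
    return abs(recomputed - reported_rate) / reported_rate


if __name__ == "__main__":
    for function, (reported_rate, min_time_s) in LRDIMM_1DPC.items():
        recomputed = best_rate_mb_s(function, min_time_s)
        print(f"{function:<6} reported {reported_rate:>10.1f}  recomputed {recomputed:>10.1f}")
```

Under that assumed array length, the recomputed rates agree with the reported ones to within the rounding of the printed times, which suggests the Best Rate and Min. Time columns describe the same run.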

LRDIMM Performance Summary

From these tests, we concluded that the latency imposed by the LRDIMMs results in an approximately 12% reduction in STREAM memory bandwidth when doubling the amount of RAM from 256GB to 512GB (that is, when moving from one to two DIMMs per channel).

RDIMM Benchmark Results

Rick then tested RDIMMs using the same system for comparison (with 256GB and 512GB DDR4 memory configurations). Below are the STREAM results.

One RDIMM Per Channel — 256GB RAM @ 2133MHz
Function   Best Rate (MB/s)   Avg. Time (s)   Min. Time (s)   Max. Time (s)
Copy       82,707.5           0.003939        0.003869        0.004093
Scale      101,973.7          0.003243        0.003138        0.003471
Add        111,966.3          0.004502        0.004287        0.004978
Triad      110,881.0          0.004468        0.004329        0.004843

 

Two RDIMMs Per Channel — 512GB RAM @ 2133MHz*
Function   Best Rate (MB/s)   Avg. Time (s)   Min. Time (s)   Max. Time (s)
Copy       75,049.1           0.004314        0.004264        0.004405
Scale      93,812.6           0.003460        0.003411        0.003550
Add        103,091.1          0.004729        0.004656        0.004969
Triad      103,493.9          0.004704        0.004638        0.004909

* For RDIMMs, the 512GB configuration requires the memory speed to be manually increased to 2133MHz.

RDIMM Performance Summary

Just as we saw with LRDIMMs, performance drops between 1 DIMM per channel and 2 DIMMs per channel when using RDIMMs. However, the penalty is smaller: roughly 7% to 9% depending on the STREAM function, compared to the roughly 12% penalty suffered by LRDIMMs.
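The penalties quoted in the two summaries can be recomputed directly from the Best Rate columns. The sketch below does so for each STREAM function and as a simple mean. The helper names are illustrative, and weighting the four functions equally is our assumption about how to condense them into a single figure.

```python
# Best rates in MB/s as (1 DIMM per channel, 2 DIMMs per channel), from the tables above.
LRDIMM = {
    "Copy": (81383.5, 72499.2),
    "Scale": (95746.7, 83572.7),
    "Add": (109661.0, 95979.5),
    "Triad": (109315.6, 96541.0),
}
RDIMM = {
    "Copy": (82707.5, 75049.1),
    "Scale": (101973.7, 93812.6),
    "Add": (111966.3, 103091.1),
    "Triad": (110881.0, 103493.9),
}


def percent_drop(one_dpc, two_dpc):
    """Bandwidth lost, in percent, when going from 1 to 2 DIMMs per channel."""
    return (one_dpc - two_dpc) / one_dpc * 100.0


def drops_by_function(results):
    """Map each STREAM function to its percent drop."""
    return {name: percent_drop(*rates) for name, rates in results.items()}


def mean_drop(results):
    """Equal-weight mean of the per-function drops (an assumed summary statistic)."""
    drops = drops_by_function(results)
    return sum(drops.values()) / len(drops)


if __name__ == "__main__":
    for label, results in (("LRDIMM", LRDIMM), ("RDIMM", RDIMM)):
        for name, drop in drops_by_function(results).items():
            print(f"{label:<6} {name:<6} {drop:5.2f}%")
        print(f"{label:<6} mean   {mean_drop(results):5.2f}%")
```

With these figures, the mean drop works out to roughly 12% for LRDIMMs and roughly 8% for RDIMMs, with Triad showing the smallest RDIMM drop at just under 7%.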

Side-by-Side Comparison of RDIMM and LRDIMM Performance

For clarity, here is a side-by-side table of DDR4 memory performance comparing LRDIMMs to RDIMMs. Note that RDIMM memory bandwidth is higher than LRDIMM bandwidth in every case.

LRDIMMs and RDIMMs Compared
           1 DIMM Per Channel           2 DIMMs Per Channel
           Best Rate (MB/s)             Best Rate (MB/s)
Function   LRDIMM        RDIMM          LRDIMM        RDIMM
Copy       81,383.5      82,707.5       72,499.2      75,049.1
Scale      95,746.7      101,973.7      83,572.7      93,812.6
Add        109,661.0     111,966.3      95,979.5      103,091.1
Triad      109,315.6     110,881.0      96,541.0      103,493.9
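To put a number on the RDIMM lead in the comparison table, the following sketch computes how much higher the RDIMM best rate is than the LRDIMM best rate for each function and configuration. The function and variable names are illustrative and not part of any benchmark tooling.

```python
# Best rates in MB/s as (LRDIMM, RDIMM), keyed by DIMMs per channel, from the table above.
COMPARISON = {
    1: {
        "Copy": (81383.5, 82707.5),
        "Scale": (95746.7, 101973.7),
        "Add": (109661.0, 111966.3),
        "Triad": (109315.6, 110881.0),
    },
    2: {
        "Copy": (72499.2, 75049.1),
        "Scale": (83572.7, 93812.6),
        "Add": (95979.5, 103091.1),
        "Triad": (96541.0, 103493.9),
    },
}


def rdimm_advantage_percent(lrdimm_rate, rdimm_rate):
    """How much higher the RDIMM rate is than the LRDIMM rate, in percent."""
    return (rdimm_rate - lrdimm_rate) / lrdimm_rate * 100.0


def advantages(dimms_per_channel):
    """Map each STREAM function to the RDIMM advantage for one configuration."""
    rows = COMPARISON[dimms_per_channel]
    return {name: rdimm_advantage_percent(*rates) for name, rates in rows.items()}


if __name__ == "__main__":
    for dpc in sorted(COMPARISON):
        for name, gain in advantages(dpc).items():
            print(f"{dpc} DIMM(s) per channel  {name:<6} RDIMM ahead by {gain:5.2f}%")
```

The gap is small at 1 DIMM per channel (under 2% for Copy and Triad) and widens at 2 DIMMs per channel, where Scale shows the largest difference at about 12%.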

 

When Registered DIMMs (RDIMMs) are Best

Many of our HPC customers are looking for high speed and low latency. In that realm, RDIMMs are the hands-down winner. At a slightly lower cost, and with the ability to raise the memory frequency on certain motherboards, they are the right choice for fast memory performance.

When Load-Reduced DIMMs (LRDIMMs) are Best

When very large quantities of RAM are the goal, LRDIMMs are the way to go. As this chart from Intel’s Grantley Platform Memory Configuration Guide shows, a fully populated system can reach twice the capacity with LRDIMMs that it can with RDIMMs. However, 64GB DDR4 LRDIMMs are still quite costly. There are also specific configurations using 3 DIMMs per channel that require LRDIMMs. Contact one of our experts to discuss the best options when you are considering servers with more than 512GB of memory.

Memory Configuration
SKU          Max DIMMs in Platform   Number of CPU Sockets   RDIMM Config                 LRDIMM Config
E5-1600 v3   12 DIMMs                1                       384GB (12x32GB) @ 1600MHz    768GB (12x64GB) @ 1600MHz
E5-2600 v3   24 DIMMs                2                       768GB (24x32GB) @ 1600MHz    1.5TB (24x64GB) @ 1600MHz
E5-4600 v3   48 DIMMs                4                       1.5TB (48x32GB) @ 1600MHz    3TB (48x64GB) @ 1600MHz
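The capacities in the memory configuration table are simply the DIMM count multiplied by the largest DIMM size of each type (32GB RDIMMs and 64GB LRDIMMs). A minimal sketch of that arithmetic follows; the function names are illustrative, and 1TB is taken as 1024GB, which is the convention that reproduces the figures in the table.

```python
# Largest DIMM size per type used in the table above, in GB.
MAX_DIMM_GB = {"RDIMM": 32, "LRDIMM": 64}

# Maximum DIMM slots per platform, from the table above.
MAX_DIMMS = {"E5-1600 v3": 12, "E5-2600 v3": 24, "E5-4600 v3": 48}


def capacity_gb(dimm_count, dimm_size_gb):
    """Total memory in GB for a fully populated platform."""
    return dimm_count * dimm_size_gb


def format_capacity(gb):
    """Render GB as 'NNNGB', or as 'N.NTB' from 1024GB upward (1TB = 1024GB)."""
    if gb >= 1024:
        return f"{gb / 1024:g}TB"
    return f"{gb}GB"


if __name__ == "__main__":
    for sku, slots in MAX_DIMMS.items():
        for dimm_type, size_gb in MAX_DIMM_GB.items():
            total = capacity_gb(slots, size_gb)
            print(f"{sku}  {dimm_type:<6} {slots}x{size_gb}GB = {format_capacity(total)}")
```

Because the slot count is the same for both DIMM types, the LRDIMM capacity is always exactly double the RDIMM capacity on a given platform.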


Choosing between LRDIMMs and RDIMMs depends entirely on what performance characteristics meet the needs of your applications. Careful consideration of latency, speed and capacity as applied to your problem will show you the way to go. Our engineering team can help you work your way through this important design choice. Contact us or give us a call for assistance choosing the HPC platform that works best for you.
