Category Archives: Benchmarking

NVIDIA Tesla P100 PCI-E 16GB GPU Accelerator (Pascal GP100) Up Close

Posted on December 28, 2016 by Eliot Eshelman

NVIDIA’s new Tesla P100 PCI-E GPU is a big step up for HPC users, and for GPU users in general. Although other workloads have been leveraging the newer “Maxwell” architecture, HPC applications have been using “Kepler” GPUs for a couple … Continue reading →

More Tips on OpenACC Acceleration

Posted on July 25, 2016 by John Murphy

One blog post may not be enough to present every tip for accelerating code with OpenACC, so this post provides more, complementing our previous blog post on accelerating code with OpenACC. Further tips discussed here are: … Continue reading →

NVIDIA Tesla M40 24GB GPU Accelerator (Maxwell GM200) Up Close

Posted on April 1, 2016 by Eliot Eshelman

NVIDIA has announced a new version of their popular Tesla M40 GPU – one with 24GB of high-speed GDDR5 memory. The name hasn’t really changed – the new GPU is named NVIDIA Tesla M40 24GB. If you are curious about … Continue reading →

DDR4 Memory on Xeon E5-2600v3 with 3 DIMMs per channel

Posted on March 24, 2016 by Marc Rocque

This week I had the opportunity to run the STREAM memory benchmark on a Microway 2U NumberSmasher server which supports up to 3 DIMMs per channel. In practice, this system is typically configured with 768GB or 1.5TB of DDR4 memory. … Continue reading →

NVIDIA Tesla M40 12GB GPU Accelerator (Maxwell GM200) Up Close

Posted on February 10, 2016 by Eliot Eshelman

With the release of Tesla M40, NVIDIA continues to diversify its professional compute GPU lineup. Designed specifically for Deep Learning applications, the M40 provides 7 TFLOPS of single-precision floating point performance and 12GB of high-speed GDDR5 memory. It works extremely … Continue reading →

Caffe Deep Learning Tutorial using NVIDIA DIGITS on Tesla K80 & K40 GPUs

Posted on September 17, 2015 by John Murphy

In this Caffe deep learning tutorial, we will show how to use DIGITS in order to train a classifier on a small image set. Along the way, we’ll see how to adjust certain run-time parameters, such as the learning rate, … Continue reading →

DDR4 RDIMM and LRDIMM Performance Comparison

Posted on July 10, 2015 by Marc Rocque

Recently, while carrying out memory testing in our integration lab, Lead Systems Integrator Rick Warner was able to clearly identify when it is appropriate to choose load-reduced DIMMs (LRDIMM) and when it is appropriate to choose registered DIMMs (RDIMM) for … Continue reading →

Common PCI-Express Myths for GPU Computing Users

Posted on May 4, 2015 by Brett Newman

At Microway we design a lot of GPU computing systems. One of the strengths of GPU-compute is the flexibility of the PCI-Express bus. Assuming the server has appropriate power and thermals, it enables us to attach GPUs with no special interface modifications. We can … Continue reading →

How to Benchmark GROMACS GPU Acceleration on HPC Clusters

Posted on October 21, 2014 by Jan Smith

We know that many of our readers are interested in seeing how molecular dynamics applications perform with GPUs, so we are continuing to highlight various packages. This time we will be looking at GROMACS, a well-established and free-to-use (under GNU GPL) … Continue reading →

Benchmark MATLAB GPU Acceleration on NVIDIA Tesla K40 GPUs

Posted on October 17, 2014 by Eliot Eshelman

MATLAB is a well-known and widely used application – and for good reason. It functions as a powerful, yet easy-to-use, platform for technical computing. With support for a variety of parallel execution methods, MATLAB also performs well. Support for running MATLAB … Continue reading →
