Category Archives: Benchmarking
NVIDIA Tesla P100 PCI-E 16GB GPU Accelerator (Pascal GP100) Up Close
NVIDIA’s new Tesla P100 PCI-E GPU is a big step up for HPC users, and for GPU users in general. Although other workloads have been leveraging the newer “Maxwell” architecture, HPC applications have been using “Kepler” GPUs for a couple … Continue reading
More Tips on OpenACC Acceleration
One blog post may not be enough to present all tips for performance acceleration using OpenACC. So here, more tips on OpenACC acceleration are provided, complementing our previous blog post on accelerating code with OpenACC. Further tips discussed here are: … Continue reading
NVIDIA Tesla M40 24GB GPU Accelerator (Maxwell GM200) Up Close
NVIDIA has announced a new version of their popular Tesla M40 GPU – one with 24GB of high-speed GDDR5 memory. The name hasn’t really changed – the new GPU is named NVIDIA Tesla M40 24GB. If you are curious about … Continue reading
DDR4 Memory on Xeon E5-2600v3 with 3 DIMMs per channel
This week I had the opportunity to run the STREAM memory benchmark on a Microway 2U NumberSmasher server which supports up to 3 DIMMs per channel. In practice, this system is typically configured with 768GB or 1.5TB of DDR4 memory. … Continue reading
NVIDIA Tesla M40 12GB GPU Accelerator (Maxwell GM200) Up Close
With the release of Tesla M40, NVIDIA continues to diversify its professional compute GPU lineup. Designed specifically for Deep Learning applications, the M40 provides 7 TFLOPS of single-precision floating point performance and 12GB of high-speed GDDR5 memory. It works extremely … Continue reading
Caffe Deep Learning Tutorial using NVIDIA DIGITS on Tesla K80 & K40 GPUs
In this Caffe deep learning tutorial, we will show how to use DIGITS in order to train a classifier on a small image set. Along the way, we’ll see how to adjust certain run-time parameters, such as the learning rate, … Continue reading
DDR4 RDIMM and LRDIMM Performance Comparison
Recently, while carrying out memory testing in our integration lab, Lead Systems Integrator Rick Warner was able to clearly identify when it is appropriate to choose load-reduced DIMMs (LRDIMM) and when it is appropriate to choose registered DIMMs (RDIMM) for … Continue reading
Common PCI-Express Myths for GPU Computing Users
At Microway we design a lot of GPU computing systems. One of the strengths of GPU-compute is the flexibility of the PCI-Express bus. Assuming the server has appropriate power and thermals, it enables us to attach GPUs with no special interface modifications. We can … Continue reading
How to Benchmark GROMACS GPU Acceleration on HPC Clusters
We know that many of our readers are interested in seeing how molecular dynamics applications perform with GPUs, so we are continuing to highlight various packages. This time we will be looking at GROMACS, a well-established and free-to-use (under GNU GPL) … Continue reading
Benchmark MATLAB GPU Acceleration on NVIDIA Tesla K40 GPUs
MATLAB is a well-known and widely-used application – and for good reason. It functions as a powerful, yet easy-to-use, platform for technical computing. With support for a variety of parallel execution methods, MATLAB also performs well. Support for running MATLAB … Continue reading