Tag Archives: gpu

Deep Learning Benchmarks of NVIDIA Tesla P100 PCIe, Tesla K80, and Tesla M40 GPUs

Sources of CPU benchmarks, used for estimating performance on similar workloads, have been available throughout the course of CPU development. For example, the Standard Performance Evaluation Corporation has compiled a large set of application benchmarks running on a variety of …

Comparing NVLink vs PCI-E with NVIDIA Tesla P100 GPUs on OpenPOWER Servers

The new NVIDIA Tesla P100 GPUs are available with both PCI-Express and NVLink connectivity. How do these two types of connectivity compare? This post provides a rundown of NVLink vs PCI-E and explores the benefits of NVIDIA’s new NVLink technology.

NVIDIA Tesla P100 NVLink 16GB GPU Accelerator (Pascal GP100 SXM2) Up Close

The NVIDIA Tesla P100 NVLink GPUs are a big advancement. For the first time, the GPU is stepping outside the traditional “add-in card” design. No longer tied to the fixed specifications of PCI-Express cards, NVIDIA’s engineers have designed a …

NVIDIA Tesla P100 PCI-E 16GB GPU Accelerator (Pascal GP100) Up Close

NVIDIA’s new Tesla P100 PCI-E GPU is a big step up for HPC users, and for GPU users in general. Although other workloads have been leveraging the newer “Maxwell” architecture, HPC applications have been using “Kepler” GPUs for a couple …

More Tips on OpenACC Acceleration

A single blog post is not enough to cover every tip for accelerating code with OpenACC. This post provides more tips, complementing our previous blog post on accelerating code with OpenACC. Further tips discussed here are: …

NVIDIA Tesla M40 24GB GPU Accelerator (Maxwell GM200) Up Close

NVIDIA has announced a new version of their popular Tesla M40 GPU – one with 24GB of high-speed GDDR5 memory. The name hasn’t really changed – the new GPU is named NVIDIA Tesla M40 24GB. If you are curious about …

Accelerating Code with OpenACC and the NVIDIA Visual Profiler

Composed of a set of compiler directives, OpenACC was created to accelerate code using the many streaming multiprocessors (SMs) present on a GPU. Similar to how OpenMP is used for accelerating code on multicore CPUs, OpenACC can accelerate code on …

NVIDIA Tesla M40 12GB GPU Accelerator (Maxwell GM200) Up Close

With the release of Tesla M40, NVIDIA continues to diversify its professional compute GPU lineup. Designed specifically for Deep Learning applications, the M40 provides 7 TFLOPS of single-precision floating point performance and 12GB of high-speed GDDR5 memory. It works extremely …

Caffe Deep Learning Tutorial using NVIDIA DIGITS on Tesla K80 & K40 GPUs

In this Caffe deep learning tutorial, we will show how to use DIGITS to train a classifier on a small image set. Along the way, we’ll see how to adjust certain run-time parameters, such as the learning rate, …

Introducing the NVIDIA Tesla K80 GPU Accelerator (Kepler GK210)

NVIDIA has once again raised the bar on GPU computing with the release of the new Tesla K80 GPU accelerator. With up to 8.74 TFLOPS of single-precision performance with GPU Boost, the Tesla K80 has massive capability and leading density. …