
Benchmark MATLAB GPU Acceleration on NVIDIA Tesla K40 GPUs

MATLAB solving a second order wave equation on Tesla GPUs

MATLAB is a well-known and widely used application, and for good reason. It functions as a powerful, yet easy-to-use, platform for technical computing. With support for a variety of parallel execution methods, MATLAB also performs well. Support for running MATLAB on GPUs has been built in for a couple of years, with better support in each release. If you haven’t tried it yet, take this opportunity to test MATLAB performance on GPUs. Microway’s GPU Test Drive makes the process quick and easy. As we’ll show in this post, you can expect to see 3X to 6X performance increases for many tasks (with 30X to 60X speedups on select workloads).

Access a Compute Node with GPU-accelerated MATLAB

Getting started with MATLAB on our GPU cluster is easy: complete this form to sign up for MATLAB GPU benchmarking. We will send you an e-mail with detailed instructions for logging in and starting up MATLAB. Once you’re in, all you need to do is click the MATLAB icon and the latest version of GPU-Accelerated MATLAB will pop up:
MathWorks MATLAB R2014b splash screen

We use NoMachine to export the graphical sessions from our cluster to your local PC/laptop. This makes login extremely user-friendly, ensures your interactive session performs well and provides a built-in method for file transfers in and out of the GPU cluster. MATLAB is fairly well-known for performing sluggishly over standard Unix/Linux graphical sessions (e.g., X11 forwarding, VNC), but you’ll have no such issues here.

You’ll be dropped into a standard MATLAB workspace. A variety of parallelized demonstrations of GPU usage are included with MATLAB. Pick one and give it a try! You can type paralleldemo_gpu and then hit <TAB> to see the full list of options.

Main MATLAB R2014b window
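
Before running a demo, it can be worth confirming that MATLAB sees the GPU. The lines below are a minimal sketch, assuming the Parallel Computing Toolbox is installed; the variable name `gpuAvailable` is ours and is not part of any demo.

```matlab
% Sketch: check that MATLAB can see a supported GPU before running the demos.
% Assumes the Parallel Computing Toolbox is installed.
if gpuDeviceCount > 0
    g = gpuDevice;   % returns the currently selected GPU device
    fprintf('Using a %s GPU with %.1f GB of memory.\n', ...
        g.Name, g.TotalMemory / 1e9);
    gpuAvailable = true;
else
    disp('No supported GPU was detected.');
    gpuAvailable = false;   % hypothetical flag, for illustration only
end
```

On a Test Drive node this should report the Tesla K40 that the demos use.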

Measure MATLAB GPU Speedups

Below we show the output from several of the built-in MATLAB parallel GPU demos. A few are text-only, but several include a graphical component or performance plot. The first example runs a quick test on memory transfer speeds and computational throughput. Results from both the GPU and the host (CPUs) are shown:

>> paralleldemo_gpu_benchmark
Using a Tesla K40m GPU.
Achieved peak send speed of 3.44069 GB/s
Achieved peak gather speed of 2.20036 GB/s
Achieved peak read+write speed on the GPU: 233.613 GB/s
Achieved peak read+write speed on the host: 12.9773 GB/s
Achieved peak calculation rates of 398.9 GFLOPS (host), 1345.8 GFLOPS (GPU)

Note that the host results will be impacted by the number of local workers available in the Parallel Computing Toolbox. Since version R2011b, the default has been limited to 12 threads/CPU cores. With the release of R2014a, MathWorks removed that limit. For these tests we changed the number of workers to 20 in the Parallel Preferences dialog box to match the 20 CPU cores in the host.
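
The same setting can likely be changed from the command line rather than through the dialog box. This is a sketch, not the procedure we used: it assumes the default local profile is named `'local'`, which matches R2014b but may differ in other releases.

```matlab
% Sketch: set the local cluster profile to 20 workers from the command line.
% Assumption: the default profile is named 'local' (true for R2014b).
c = parcluster('local');   % cluster object for the local profile
c.NumWorkers = 20;         % one worker per CPU core on a dual 10-core host
saveProfile(c);            % persist the change to the profile
```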

The next demo generates plots of the speedup between matrix multiplications on dual 10-core Xeon CPUs versus a single NVIDIA Tesla K40 GPU. Both single-precision and double-precision floating-point calculations were run.
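
To reproduce a single point of such a comparison by hand, something along these lines should work. It is a minimal sketch rather than the demo's own code: the matrix size `N` is an arbitrary choice, and it relies on `timeit` and `gputimeit`, which are available in R2013b and later.

```matlab
% Sketch: time one N-by-N matrix multiply on the host and on the GPU.
N = 2048;                        % illustrative size, not taken from the demo
A = rand(N, 'single');           % use 'double' to test double precision
B = rand(N, 'single');

tHost = timeit(@() A * B);       % host (CPU) time in seconds

Ag = gpuArray(A);                % copy the operands into GPU memory
Bg = gpuArray(B);
tGPU = gputimeit(@() Ag * Bg);   % waits for the GPU to finish before timing stops

% Each of the N^2 outputs needs N multiplies and N-1 additions.
flops = 2 * N^3 - N^2;
fprintf('Host: %.1f GFLOPS, GPU: %.1f GFLOPS, speedup: %.1fx\n', ...
    flops / tHost / 1e9, flops / tGPU / 1e9, tHost / tGPU);
```

The transfer time for `gpuArray(A)` is deliberately left out of the measurement, so the result reflects compute throughput only.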

GPU-Accelerated Stencil Operations

MATLAB also includes a couple of stencil operation demos running on a GPU. These include both a “generic” implementation and optimized implementations using GPU shared and texture memory. As shown below, with properly optimized algorithms MATLAB on a GPU can run 30 or more times faster than the generic gpuArray version.

>> paralleldemo_gpu_mexstencil
Average time on the GPU: 1.119ms per generation
Average time of 0.038ms per generation (29.4x faster).
Average time of 0.019ms per generation (58.9x faster).
First version using gpuArray:  1.119ms per generation.
MEX with shared memory: 0.038ms per generation (29.4x faster).
MEX with texture memory: 0.019ms per generation (58.9x faster).
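
For readers unfamiliar with stencil operations, the sketch below computes one generation of Conway's Game of Life on a gpuArray. It is our own illustration and not the demo's code: it expresses the 3×3 stencil with `conv2`, assumes cells beyond the grid edge are dead, and uses an arbitrary grid size.

```matlab
% Sketch: one Game of Life generation as a 3x3 stencil on a gpuArray.
gridSize = 500;                                % illustrative size
grid = gpuArray(rand(gridSize) > 0.75);        % random initial population

% The kernel sums the eight neighbours of each cell (centre excluded).
kernel = single([1 1 1; 1 0 1; 1 1 1]);
neighbours = conv2(single(grid), kernel, 'same');   % edges are zero-padded

% A cell is alive next generation if it has exactly 3 live neighbours,
% or if it is alive now and has exactly 2 live neighbours.
nextGrid = (neighbours == 3) | (grid & (neighbours == 2));
```

Every output cell depends only on a small fixed neighbourhood of input cells, which is why hand-written kernels can benefit from shared or texture memory.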

Running your own test of MATLAB GPU speedups

To see a list of other useful demos, take a look at the GPU-accelerated examples on MathWorks File Exchange. You’ll find a large number of useful demonstrations, including:

  • GPU acceleration for FFTs
  • Heat transfer equations
  • Navier-Stokes equations for incompressible fluids
  • Anisotropic Diffusion
  • Gradient Vector Flow (GVF) force field calculation
  • 3D linear and trilinear interpolation
  • more than 60 others

Also consider that hundreds of MATLAB’s standard functions support GPU acceleration. Utilizing these capabilities is quite straightforward: your data must be loaded into a gpuArray. With this done, pass the gpuArray to any of the supported functions and the operations will be carried out on the GPU!
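
As a minimal sketch of that workflow, the lines below move a vector to the GPU, run `fft` and `abs` on it, and copy the result back with `gather`. The data and variable names are made up for illustration.

```matlab
% Sketch: run standard MATLAB functions on the GPU via gpuArray.
x = rand(1, 2^20);        % ordinary array in host memory
xg = gpuArray(x);         % copy the data to GPU memory
yg = abs(fft(xg));        % fft and abs both execute on the GPU
y = gather(yg);           % copy the result back to host memory

% Compare against the same computation on the host.
yHost = abs(fft(x));
maxRelErr = max(abs(y - yHost)) / max(yHost);
fprintf('Maximum relative difference: %.2e\n', maxRelErr);
```

Results stay on the GPU until `gather` is called, so chaining several operations on a gpuArray avoids repeated transfers between host and GPU memory.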

MATLAB paramSweep demo

Will GPU acceleration speed up your research?

With our pre-configured GPU cluster, running MATLAB on high-performance GPUs is as easy as running it on your own workstation. Find out for yourself how much faster you’ll be able to work if you add GPUs to your toolbelt. Sign up for a GPU Test Drive today!


Featured Illustration:

“Solving 2nd Order Wave Equation on the GPU Using Spectral Methods” by Jiro Doke
MathWorks MATLAB Central
