World's Fastest GPUs

GPU Test Drive

Verify the benefits of GPU-acceleration for your workloads

GPU-Accelerated Applications Available for Testing


Available Libraries

  • NVIDIA CUDA versions 5.0, 5.5, 6.0, 6.5, 7.0, 7.5
  • NVIDIA cuDNN v2, v3
  • FFTW3 (single and double precision builds)
  • HDF5
  • OpenBLAS
  • OpenCV
  • Python 2.7.9 with H5py, NumPy, pandas, PyCUDA, pydot, scikit-image, scikit-learn, SciPy, SymPy, Theano and more
  • MATLAB toolboxes: Computer Vision System, Control System, Econometrics, Financial, Image Processing, Neural Network, Optimization, Parallel Computing Toolbox (PCT), Signal Processing, Statistics

MPI & Compiler Software

  • MVAPICH2 versions 1.9, 2.0, 2.1
  • OpenMPI versions 1.7.x, 1.8.x
  • GNU GCC Compiler Collection (multiple versions, as needed)
    Provides C, C++ and Fortran compilers.
  • Intel Parallel Studio XE Cluster Edition (multiple versions, as needed)
    Provides C, C++ and Fortran compilers; Integrated Performance Primitives (IPP), Math Kernel Library (MKL), Cilk Plus, Threading Building Blocks (TBB), MPI Library, MPI Benchmarks, Trace Analyzer & Collector, VTune Amplifier XE, Inspector XE, Advisor XE
  • PGI Accelerator Fortran/C/C++ Server (multiple versions, as needed)
    Provides Portland Group C, C++ and Fortran compilers. GPU-acceleration is supported via CUDA Fortran and OpenACC.

Systems Available for Testing

Photograph of Microway HPC clusters of various sizes and configurations
Microway offers a Linux and Windows benchmark cluster for customers to test GPU-enabled applications. The cluster includes:

  • Microway NumberSmasher GPU Nodes
  • Two NVIDIA Tesla K80, K40 or K20 GPUs per node
  • Professional Graphics – NVIDIA Quadro M4000
  • Two 12-core Intel Xeon E5-2600v3 series “Haswell” CPUs in each node
  • Intel Direct I/O with PCI-E 3.0 support
  • FDR InfiniBand HCAs and switching
  • Over 16 TFLOPS single-precision and 5 TFLOPS double-precision GPU performance per node
  • CentOS Linux or Windows 8.1*
  • Pre-configured GPU-enabled software packages
  • Alternate test configurations available upon request.

*Windows 8.1 users must provide their own applications.
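
The per-node GPU performance figures in the list above can be sanity-checked with a short calculation. The sketch below assumes the figures refer to a node with two Tesla K80 boards running at their maximum GPU Boost clock. The core count, clock speed and precision ratio are taken from NVIDIA's published K80 specifications, not from this page, so treat them as assumptions.

```python
# Rough peak-throughput estimate for one node with two Tesla K80 boards.
# Assumed K80 figures (NVIDIA's published specs, not stated on this page):
# 4992 CUDA cores per board, 875 MHz maximum GPU Boost clock,
# and a 1:3 double-to-single precision throughput ratio.
CUDA_CORES_PER_BOARD = 4992
BOOST_CLOCK_GHZ = 0.875
FLOPS_PER_CORE_PER_CYCLE = 2  # one fused multiply-add counts as 2 FLOPs
DP_TO_SP_RATIO = 1 / 3
BOARDS_PER_NODE = 2


def peak_tflops(cores, clock_ghz, boards):
    """Theoretical single-precision peak in TFLOPS."""
    return cores * FLOPS_PER_CORE_PER_CYCLE * clock_ghz * boards / 1000.0


sp = peak_tflops(CUDA_CORES_PER_BOARD, BOOST_CLOCK_GHZ, BOARDS_PER_NODE)
dp = sp * DP_TO_SP_RATIO
print(f"single precision: {sp:.2f} TFLOPS, double precision: {dp:.2f} TFLOPS")
```

These are theoretical peaks. Sustained application performance depends on the workload and is what the test drive is meant to measure.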


Your Information

Name (required)


E-mail (required)



How Did You Hear About Us


Benchmark Details


Operating System

Timeframe for Testing

Additional Requirements/Comments

Why GPUs?

Unlike traditional CPUs, which focus on general-purpose software applications, Tesla GPUs are designed specifically to provide the highest compute performance possible. The Tesla K40 GPU is the latest and fastest accelerator. Based on the Kepler architecture, it features:

  • 12 GB of memory – enabling data sets 2x larger than on the 6 GB Tesla K20X
  • GPU Boost – allowing power headroom to be converted into a user-controlled performance boost

NVIDIA Tesla K40 Application Performance Graph for CHROMA, AMBER, QMCPACK and SPECFEM3D

Try today on advanced, fully integrated hardware

Whether you use community-built code or have in-house GPU-accelerated applications, we are offering you remote benchmarking time on our latest hardware. This includes NVIDIA Tesla K40 GPUs with over 3X the performance of previous Tesla GPUs.

See how fast your code can run

To log in and test your code, register above. After registration, you will receive an email with instructions. For any questions, please email

Tesla GPU Accelerated Applications

NVIDIA Tesla GPU compute processors accelerate many common scientific codes – AMBER, NAMD and LAMMPS are just a few of the applications enjoying significant speed-ups. You can run your own code or one of the preloaded applications.
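
When comparing a CPU baseline against a GPU-accelerated run of your own code, it helps to time several repetitions and compare medians rather than single runs. The following is a minimal sketch in Python, not part of the test drive tooling: `median_runtime`, `speedup`, `cpu_workload` and `gpu_workload` are hypothetical names, and the two workloads are plain-Python placeholders standing in for your real CPU and GPU code paths.

```python
import statistics
import time


def median_runtime(run, repeats=5):
    """Call run() `repeats` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, accelerated_seconds):
    """Ratio of baseline time to accelerated time (values above 1 mean faster)."""
    return baseline_seconds / accelerated_seconds


# Placeholder workloads (hypothetical): replace these with calls into
# your CPU-only and GPU-accelerated code paths.
def cpu_workload():
    sum(i * i for i in range(200_000))


def gpu_workload():
    sum(i * i for i in range(50_000))


cpu_seconds = median_runtime(cpu_workload)
gpu_seconds = median_runtime(gpu_workload)
print(f"CPU: {cpu_seconds:.4f} s  GPU: {gpu_seconds:.4f} s  "
      f"speed-up: {speedup(cpu_seconds, gpu_seconds):.1f}x")
```

GPU kernels usually launch asynchronously, so a real GPU workload should wait for the device to finish before returning. Otherwise the timer only measures the launch overhead.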

With Tesla M2090 GPUs, AMBER users in university departments can obtain application performance that outstrips what is even possible with extensive supercomputing access.
-Ross Walker, Assistant Research Professor at San Diego Supercomputer Center

Read Our Blog on GPU Benchmarking
