GPU Test Drive

Verify the benefits of GPU acceleration for your workloads


GPU-Accelerated Applications Available for Testing


  • TensorFlow with Keras
  • PyTorch, MXNet, and Caffe2 deep learning frameworks
  • RAPIDS for data science and analytics on GPUs
  • NVIDIA DIGITS Deep Learning training system
  • nvBowtie (GPU-accelerated Bowtie2)
  • MATLAB runtime (users may execute pre-compiled MATLAB applications)
  • NAMD (with multicore, MPI and CUDA builds)
  • Quantum Espresso
  • TeraChem
  • VMD
  • HOOMD-blue (single and double precision)
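Once logged in, a quick way to see which of the listed Python deep learning frameworks an environment provides is to probe for their modules. A minimal sketch (the module names are the frameworks' standard import names; this checks importability only, not GPU support):

```python
import importlib.util

# Standard import names for the deep learning frameworks listed above
FRAMEWORKS = ["tensorflow", "keras", "torch", "mxnet"]

def available_frameworks(names=FRAMEWORKS):
    """Return the subset of modules that can be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is not None]

print(available_frameworks())
```

`find_spec` locates a module without importing it, so the check is fast and has no side effects.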


Available Libraries

  • FFTW3 (single, double, and quad precision builds)
  • HDF5
  • OpenBLAS
  • OpenCV
  • Python with H5py, NumPy, pandas, PyCUDA, pydot, scikit-image, scikit-learn, SciPy, SymPy, and more


MPI & Compiler Software

  • OpenMPI
  • GCC, the GNU Compiler Collection (multiple versions, as needed) Provides C, C++ and Fortran compilers.
  • Intel Parallel Studio XE Cluster Edition (multiple versions, as needed) Provides C, C++ and Fortran compilers; Integrated Performance Primitives (IPP), Math Kernel Library (MKL), Cilk Plus, Threading Building Blocks (TBB), MPI Library, MPI Benchmarks, Trace Analyzer & Collector, VTune Amplifier XE, Inspector XE, and Advisor XE
  • PGI Accelerator Fortran/C/C++ Server (multiple versions, as needed) Provides Portland Group C, C++ and Fortran compilers. GPU-acceleration is supported via CUDA Fortran and OpenACC.
  • AMD Optimizing C/C++ Compiler (AOCC) Provides C, C++, and Fortran compilers with optimizations for the latest AMD EPYC CPUs.


Systems Available for Testing

Microway offers a Linux-based benchmark cluster for customers to test GPU-enabled applications. The cluster includes:

  • NVIDIA DGX-2 deep learning system (with sixteen NVLink-connected Tesla V100 GPUs)
  • Microway NumberSmasher, Navion and OpenPOWER GPU Nodes
  • CentOS Linux (with support for Singularity images)
  • Pre-configured GPU-enabled software packages
  • Alternate test configurations available upon request.

Custom NumberSmasher (Xeon) and Navion (EPYC) Tesla GPU Nodes include:

  • Up to four NVIDIA Tesla V100 PCI-E or NVLink GPUs per node (Tesla T4 also available)
  • Two 14-core Intel Xeon Gold 6132 “Skylake” CPUs in NumberSmasher nodes
    Intel Xeon 6142 and Xeon 6144 CPUs available upon request
  • Two 32- or 64-core AMD EPYC CPUs in Navion nodes
  • Up to 384GB DDR4 memory in each node
  • EDR InfiniBand HCAs and switching
  • Up to 62 TFLOPS single- and 31 TFLOPS double-precision GPU performance per node
    with up to 500 TensorTFLOPS via NVIDIA TensorCores
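The per-node figures above line up with the Tesla V100's published per-GPU peaks (15.7 TFLOPS single precision, 7.8 TFLOPS double precision, and 125 Tensor TFLOPS for the NVLink/SXM2 part). A quick sanity check of the arithmetic:

```python
# Tesla V100 (SXM2) per-GPU peak throughput, from NVIDIA's published specs
SINGLE_TFLOPS = 15.7
DOUBLE_TFLOPS = 7.8
TENSOR_TFLOPS = 125.0

GPUS_PER_NODE = 4  # maximum configuration listed above

print(round(GPUS_PER_NODE * SINGLE_TFLOPS, 1))  # 62.8 -> "up to 62 TFLOPS"
print(round(GPUS_PER_NODE * DOUBLE_TFLOPS, 1))  # 31.2 -> "up to 31 TFLOPS"
print(GPUS_PER_NODE * TENSOR_TFLOPS)            # 500.0 Tensor TFLOPS
```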


    Your Information

      • Name (required)
      • E-mail (required)
      • How Did You Hear About Us

    Benchmark Details

      • Application(s) (required)
      • Operating System
      • Timeframe for Testing
      • Additional Requirements/Comments

    Why GPUs?

    Unlike traditional CPUs, which focus on general-purpose software applications, Tesla GPUs are designed specifically to provide the highest compute performance possible. For many applications, a GPU-accelerated system will be 5X to 25X faster than a CPU-only system.
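Why such a wide range? The overall speedup depends on how much of an application's runtime is spent in GPU-accelerated routines, a relationship captured by Amdahl's law. A minimal sketch (the fraction and speedup factor below are illustrative values, not measurements):

```python
def overall_speedup(gpu_fraction: float, gpu_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only a fraction of
    the runtime is accelerated by the given factor."""
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)

# Example: if 95% of the runtime runs 50X faster on the GPU,
# the whole job still only speeds up by about 14.5X.
print(round(overall_speedup(0.95, 50.0), 1))
```

The un-accelerated 5% dominates, which is why profiling a workload before benchmarking pays off.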

    The Tesla V100 GPUs are the latest and fastest accelerators. Based on the “Volta” architecture, they feature:

    Improved compute performance per GPU

    Up to 7.8 TFLOPS double- and 15.7 TFLOPS single-precision floating-point performance

    Faster GPU memory

    High-bandwidth HBM2 memory provides a 3X improvement over older GPUs
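As one illustration (the baseline GPU here is this sketch's assumption; the text says only "older GPUs"): comparing the V100's 900 GB/s HBM2 against the roughly 288 GB/s GDDR5 of an earlier Tesla K40 gives about 3X:

```python
V100_HBM2_GB_S = 900.0  # Tesla V100 peak HBM2 memory bandwidth
K40_GDDR5_GB_S = 288.0  # Tesla K40 peak GDDR5 bandwidth (older-generation Tesla)

print(round(V100_HBM2_GB_S / K40_GDDR5_GB_S, 1))  # ~3.1X
```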

    Faster connectivity

    NVLink provides 5X to 10X faster transfers than PCI-Express
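The rough arithmetic behind that range, using ballpark peak bidirectional bandwidths (figures assumed here, not stated in the text): each NVLink 2.0 link moves about 50 GB/s, a Tesla V100 (SXM2) exposes six links, and PCI-Express 3.0 x16 peaks near 32 GB/s:

```python
PCIE3_X16_GB_S = 32.0      # PCI-E 3.0 x16 peak, both directions combined
NVLINK2_LINK_GB_S = 50.0   # one NVLink 2.0 link, both directions combined
V100_NVLINK_LINKS = 6      # links available on a Tesla V100 (SXM2)

print(round(3 * NVLINK2_LINK_GB_S / PCIE3_X16_GB_S, 1))                  # ~4.7X (three links)
print(round(V100_NVLINK_LINKS * NVLINK2_LINK_GB_S / PCIE3_X16_GB_S, 1))  # ~9.4X (all six links)
```

How many links connect a given GPU pair depends on the system topology, hence the range.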

    Volta Unified Memory

    Allows GPU applications to directly access the memory of all GPUs, as well as all of the system memory

    Direct CPU-to-GPU NVLink connectivity

    OpenPOWER systems support NVLink transfers between the CPUs and GPUs

    Try today on advanced, fully integrated hardware

    Whether you use community-built code or have in-house GPU-accelerated applications, we are offering remote benchmarking time on our latest hardware. This includes NVIDIA Tesla V100 and P100 GPUs with over 5X the performance of previous Tesla GPUs.

    See how fast your code can run

    To log in and test your code, register above. After registration, you will receive an email with instructions. For any questions, please email

    Tesla GPU Accelerated Applications

    NVIDIA Tesla GPU compute processors accelerate many common scientific codes – AMBER, NAMD and LAMMPS are just a few of the applications enjoying significant speed-ups. You can run your own code or one of the preloaded applications.

    Read Our Blog on GPU Benchmarking
