Photo of the NVIDIA DGX-1 GPU-Accelerated Deep Learning Appliance

EOL – NVIDIA DGX-1™ Deep Learning System

Purpose-built Appliance for Deep Learning, with NVIDIA Tesla® V100 GPUs

Note: NVIDIA® announced an End of Sale program for DGX-1 in Summer 2020. Long-term support for DGX-1 with Tesla V100 may be ending soon. Unless you are augmenting an existing DGX-1 deployment, we recommend selecting the NVIDIA DGX A100™.

As Deep Learning enters the mainstream, the NVIDIA DGX-1 is uniquely positioned to provide the best performance when training neural networks and running production-scale classification workloads. To be successful, data scientists and artificial intelligence researchers require quick iterations of their neural network models, and the NVIDIA DGX-1 delivers the fastest training performance available.

Deep Learning Training on DGX-1 with Tesla V100

Each appliance arrives fully integrated with NVIDIA’s Deep Learning software stack, which includes:

  • Industry-leading software frameworks: Caffe, TensorFlow, Theano, & Torch
  • Easy-to-use NVIDIA GPU Cloud Container software stack with everything you need to train
  • Optimized, GPU-Accelerated neural network algorithms from the cuDNN library

Additional frameworks and software updates will be released via NVIDIA’s online application repository.
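As a quick post-installation sanity check, a short script can confirm that all eight GPUs and the cuDNN library are visible to a framework. The sketch below assumes the PyTorch build shipped in one of the NGC framework containers; adapt it to whichever framework you actually deploy.

    # Minimal sanity check: confirm GPUs and cuDNN are visible to PyTorch.
    # Assumes a PyTorch framework container from NVIDIA's application repository;
    # versions and container names on your system may differ.
    import torch

    print("CUDA available:", torch.cuda.is_available())
    print("GPUs visible:  ", torch.cuda.device_count())        # expect 8 on DGX-1
    print("cuDNN enabled: ", torch.backends.cudnn.enabled)
    print("cuDNN version: ", torch.backends.cudnn.version())

    # Run a tiny cuDNN-accelerated convolution on each GPU to exercise the stack.
    for i in range(torch.cuda.device_count()):
        with torch.cuda.device(i):
            x = torch.randn(1, 3, 224, 224, device="cuda")
            conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
            y = conv(x)
            print(f"GPU {i}: {torch.cuda.get_device_name(i)} -> {tuple(y.shape)}")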

Microway is an NVIDIA Elite Solution Provider and one of only a few resellers authorized to offer this product. The Deep Learning Appliance must be purchased with DGX-1 system support as part of a fully-integrated, turn-key system designed and supported by NVIDIA and Microway’s HPC experts.

Microway also offers optional DGX-1 services including: DGX datacenter site planning estimation, onsite installation, job execution script creation, and partner-provided Deep Learning data preparation consultancy.

Features

Partner logo - Microway is an NVIDIA Elite Solution Provider

  • Arrives with fully-integrated Deep Learning libraries and frameworks
  • Software stack includes:
    • DIGITS training system
    • NVIDIA Deep Learning SDK with CUDA & cuDNN
  • Cloud management software/services:
    • NVIDIA NGC Portal (cloud or onsite)
    • Online application repository with the major deep learning frameworks
    • NVDocker containerized app deployment (see the example following this list)
    • Managed app container creation and deployment
    • Multi-Node management with telemetry, monitoring and alerts
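For illustration, launching one of the repository's framework containers with NVDocker typically looks like the sketch below. The image tag and data path are assumptions; substitute whatever you actually pull from the application repository.

    # Illustrative only: pull an NGC framework container and run a command in it
    # with all GPUs exposed. The image tag and volume path are hypothetical.
    import subprocess

    IMAGE = "nvcr.io/nvidia/tensorflow:18.06-py3"   # assumed tag; use your own

    def run_container(image, command):
        """Pull the image (if needed) and run a command with the GPUs exposed."""
        subprocess.run(["docker", "pull", image], check=True)
        subprocess.run(
            ["nvidia-docker", "run", "--rm",
             "-v", "/raid/datasets:/datasets",   # RAID0 SSD cache as data volume
             image, "bash", "-c", command],
            check=True,
        )

    if __name__ == "__main__":
        run_container(IMAGE, "nvidia-smi")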

Specifications

  • 8 NVIDIA Tesla V100 “Volta” GPUs
  • 40,960 NVIDIA CUDA cores, total
  • Total of 256GB high-bandwidth GPU memory
  • 60 TFLOPS double-precision, 120 TFLOPS single-precision, 960 Tensor TFLOPS from the Tesla V100 Tensor Cores
  • NVIDIA-certified & supported software stack for Deep Learning workloads
  • Two 20-core Intel Xeon E5-2698v4 CPUs
  • 512GB DDR4 2133MHz System Memory
  • Dual X540 10GbE Ethernet ports (10GBase-T RJ45 ports)
  • Four Mellanox ConnectX®-4 100Gbps EDR InfiniBand ports
  • One Gigabit Ethernet management port
  • Four 1.92TB SSD in RAID0 (High-Speed Storage Cache)
  • 3U Rackmount Form Factor (for standard 19-inch rack)
  • Redundant, Hot-Swap power supplies (four IEC C13 208V power connections on rear)
  • Power Consumption: 3200W at full load
  • Ubuntu Server Linux operating system

Please note that the ~34″ depth of this chassis (866mm) typically requires an extended-depth rackmount cabinet. Speak with one of our experts to determine if your existing rack is compatible.
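The aggregate figures in the specifications above follow directly from the per-GPU Tesla V100 (SXM2) numbers; the quick back-of-the-envelope check below uses datasheet values that are assumptions on our part.

    # Back-of-the-envelope check of the aggregate DGX-1 figures from per-GPU
    # Tesla V100 (SXM2) datasheet values (approximate).
    GPUS = 8

    cuda_cores_per_gpu = 5120    # Volta GV100
    hbm2_per_gpu_gb    = 32      # 32GB model (a 16GB model is also listed above)
    fp64_tflops        = 7.5     # double precision, per GPU
    fp32_tflops        = 15.0    # single precision, per GPU
    tensor_tflops      = 120.0   # Tensor Core mixed-precision, per GPU

    print("CUDA cores:     ", GPUS * cuda_cores_per_gpu)   # 40,960
    print("GPU memory (GB):", GPUS * hbm2_per_gpu_gb)      # 256
    print("FP64 TFLOPS:    ", GPUS * fp64_tflops)          # 60
    print("FP32 TFLOPS:    ", GPUS * fp32_tflops)          # 120
    print("Tensor TFLOPS:  ", GPUS * tensor_tflops)        # 960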

Accessories/Options

  • Onsite installation and integration by Microway’s HPC experts
  • High-speed storage connectivity for data sets ranging from terabytes to petabytes in size

NVIDIA DGX-1 Part Numbers

  • 920-22787-2511-000: NVIDIA DGX-1 System for Commercial and Government institutions, with Tesla V100 32GB (also available with EDU discounts)
  • 920-22787-2510-000: NVIDIA DGX-1 System for Commercial and Government institutions, with Tesla V100 16GB (also available with EDU discounts)
  • 920-22787-2500-000: NVIDIA DGX-1 System for Commercial and Government institutions, with Tesla P100
  • 920-22787-25ED-000: NVIDIA DGX-1 System for Academic institutions, with Tesla P100

DGX-1 Services

The optional DGX-1 services available through Microway are described below.

DGX Site Planning

A Microway Solutions Architect will provide remote assistance to you and your facilities staff in planning for the DGX-1’s unique power and cooling requirements. This includes rack diagramming with airflow and power cabling notation, participation in remote conference calls, and answering queries from facilities staff about support requirements of the DGX-1 hardware.
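For rough planning purposes, the 3200W full-load figure translates into current draw and heat load roughly as follows; this is a simple estimate only, and the formal site survey governs.

    # Rough site-planning arithmetic for one DGX-1 at full load (estimate only).
    power_watts  = 3200    # full-load draw listed in the specifications
    line_voltage = 208     # volts; the rear power inlets are 208V

    amps         = power_watts / line_voltage   # total current across the PSUs
    btu_per_hour = power_watts * 3.412          # 1 W is roughly 3.412 BTU/hr

    print(f"Current draw at {line_voltage}V: {amps:.1f} A")     # ~15.4 A
    print(f"Cooling load: {btu_per_hour:,.0f} BTU/hr")          # ~10,900 BTU/hr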

Deployment Services

All DGX OS and container software will be installed, firmware upgraded to the latest versions, the desired DGX containers installed, and deep learning test jobs run. Customers may submit questions to our experts. In some cases, factory-trained Microway experts may travel to your datacenter.

Container or Job Execution Script Creation

Creating an effective workflow is key to your success with any hardware resource. DGX-1’s unique container architecture means proper container management and even job execution scripts are a necessity. Microway experts will assist you in creating: your default DGX-1 containers, scripts to orchestrate the DGX-1 containers for multiple users in your organization, and methods of dynamically allocating GPUs as required to containers.
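As an illustration of what such scripts might do, the sketch below assigns a subset of the eight GPUs to each user's container by restricting the devices each container can see. The user names and image tags are hypothetical placeholders, and it assumes the NVDocker runtime honors NVIDIA_VISIBLE_DEVICES, as current DGX software stacks do.

    # Sketch of per-user GPU allocation on a shared DGX-1: each container sees
    # only the GPUs assigned to it. User names and image tags are hypothetical.
    import subprocess

    ASSIGNMENTS = {
        "alice": ("nvcr.io/nvidia/tensorflow:18.06-py3", [0, 1, 2, 3]),
        "bob":   ("nvcr.io/nvidia/pytorch:18.06-py3",    [4, 5, 6, 7]),
    }

    def launch(user, image, gpus):
        """Start a detached container that sees only the listed GPU indices."""
        devices = ",".join(str(g) for g in gpus)
        subprocess.run(
            ["nvidia-docker", "run", "-d", "--name", "dl-" + user,
             "-e", "NVIDIA_VISIBLE_DEVICES=" + devices,
             image, "sleep", "infinity"],
            check=True,
        )

    for user, (image, gpus) in ASSIGNMENTS.items():
        launch(user, image, gpus)

Site-specific orchestration, such as queueing, per-user quotas, and container lifecycle management, is layered on top of this basic pattern.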

Deep Learning Data Preparation

An overwhelming majority of the time in a deep learning project is spent on the preparation of data. Microway’s data-science consultant partners will engage with you to create a custom scope of work, determine the best means to prepare your data for deep learning, create the pre-processing algorithms, assist in pre-processing the training data, and optionally determine effective means of measurement for the overall DL project. Additional services are also available.
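As a trivial example of the kind of pre-processing involved, image training sets are typically resized and normalized into a uniform array format before training. The sketch below is generic; real pipelines, and the normalization statistics used, are tailored to your data and scope of work.

    # Generic example of image pre-processing for training: resize, scale, and
    # normalize. The per-channel statistics below are illustrative assumptions.
    import numpy as np
    from PIL import Image

    def preprocess(path, size=(224, 224)):
        """Load one image, resize it, and normalize pixel values."""
        img = Image.open(path).convert("RGB").resize(size)
        arr = np.asarray(img, dtype=np.float32) / 255.0
        mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
        std  = np.array([0.229, 0.224, 0.225], dtype=np.float32)
        return (arr - mean) / std

    # A training batch is then just a stack of pre-processed images:
    # batch = np.stack([preprocess(p) for p in image_paths])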

Support

Microway offers optional installation and integration services for DGX solutions. NVIDIA DGX-1 Support provides comprehensive system support and access to NVIDIA’s cloud management portal, so you can get the most from your NVIDIA DGX-1 system. Streamline deep learning experimentation by using containerized application management to launch jobs, monitor status, and receive software updates.

NVIDIA DGX-1 Support includes:

  • NVIDIA Cloud Management
  • NVIDIA DGX-1 Software Upgrades
  • NVIDIA DGX-1 Driver Updates
  • NVIDIA DGX-1 Firmware Updates
  • Hardware Support: 1- or 3-year subscription
  • Hardware SLA (replacement parts shipped): 1 business day
  • Online Access: NVIDIA Enterprise Support Portal
  • Phone Hours: 24×7
  • Knowledgebase

What’s Included in NVIDIA DGX-1 Support

  • Access to the latest software updates and upgrades
  • Direct communication with NVIDIA technical experts
  • NVIDIA cloud management: container repository, container management, job scheduling, system performance monitoring, and new software updates
  • Searchable NVIDIA knowledge base with how-to articles, application notes and product documentation
  • Rapid response and timely issue resolution through support portal and 24×7 phone access
  • Lifecycle support for NVIDIA DGX-1 Deep Learning software
  • Hardware support, firmware upgrades, diagnostics and remote and onsite resolution of hardware issues
  • Next day shipment for replacement parts

Price

System Price: $111,601 (academic pricing, includes 1 year support) to $159,430 (commercial, includes 1 year support)

Final pricing depends upon configuration and any applicable discounts, including education or NVIDIA Inception. Request a custom quotation to receive your applicable discounts.

NVIDIA requires all DGX purchases to include a support services contract. Ensure all quotes you receive include this mandatory DGX support.

Call a Microway Sales Engineer for assistance: 508.746.7341, or
Click Here to Request a Quotation

HPC workloads on NVIDIA DGX-1

The DGX-1 system can also be leveraged by users in other fields – many HPC workloads are well-suited to this platform. The eight built-in NVIDIA Tesla GPUs offer significant increases in floating-point throughput and memory bandwidth, both of which are critical for performance-demanding HPC workloads.

Connectivity between system components has also been improved, which is of vital importance for HPC workloads. The Volta architecture’s support for completely unified memory allows all GPUs to directly access each other’s memory, as well as directly access the memory on the host CPUs. To support such capabilities, the DGX-1 provides robust links between all system components, as shown in the block diagram below. The eight GPUs are connected in a hybrid cube mesh.
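As a simple illustration of GPUs addressing one another's memory, the sketch below copies a tensor directly from one GPU to a second; on DGX-1 such transfers travel over the NVLink mesh rather than through the host. PyTorch is used here purely as a convenient front end and is an assumption on our part.

    # Demonstration of a direct GPU-to-GPU transfer on a multi-GPU system.
    # On DGX-1, copies between peer GPUs travel over the NVLink hybrid cube mesh.
    import torch

    assert torch.cuda.device_count() >= 2, "needs at least two GPUs"

    src = torch.randn(1024, 1024, device="cuda:0")   # tensor resident on GPU 0
    dst = src.to("cuda:1")                           # copy it directly to GPU 1

    print("GPU0 -> GPU1 peer access:", torch.cuda.can_device_access_peer(0, 1))
    print("Copy matches:", torch.equal(src.cpu(), dst.cpu()))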

DGX-1 with Tesla V100 Block Diagram

Block diagram of the NVIDIA DGX-1 System


About Eliot Eshelman

My interests span from astrophysics to bacteriophages; high-performance computers to small spherical magnets. I've been an avid Linux geek (with a focus on HPC) for more than a decade. I work as Microway's Vice President of Strategic Accounts and HPC Initiatives.