Purpose-built Appliance for Deep Learning, with Tesla V100 GPUs
As Deep Learning enters the mainstream, NVIDIA DGX-1 is uniquely positioned to provide the best performance when training neural networks and running production-scale classification workloads. To be successful, data scientists and artificial intelligence researchers require quick iterations of their neural network models. The NVIDIA DGX-1 delivers the fastest neural network training performance available.
Each appliance arrives fully integrated with NVIDIA’s Deep Learning software stack, which includes:
- Industry-leading software frameworks: Caffe, TensorFlow, Theano, & Torch
- Easy-to-use NVIDIA GPU Cloud Container software stack with everything you need to train
- Optimized, GPU-Accelerated neural network algorithms from the cuDNN library
Additional frameworks and software updates will be released via NVIDIA’s online application repository.
Microway is an NVIDIA Elite Solution Provider and one of only a few resellers authorized to offer this product. The Deep Learning Appliance must be purchased with DGX-1 system support as part of a fully-integrated, turn-key system designed and supported by NVIDIA and Microway’s HPC experts.
Microway also offers optional DGX-1 services including: DGX datacenter site planning estimation, onsite installation, job execution script creation, and partner-provided Deep Learning data preparation consultancy.
- Arrives with fully-integrated Deep Learning libraries and frameworks
- Software stack includes:
- DIGITS training system
- NVIDIA Deep Learning SDK with CUDA & cuDNN
- Cloud management software/services:
- NVIDIA NGC Portal (cloud or onsite)
- Online application repository with the major deep learning frameworks
- NVDocker containerized app deployment
- Managed app container creation and deployment
- Multi-Node management with telemetry, monitoring and alerts
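As an illustration of the containerized workflow in the software stack above, framework images are pulled from the NGC registry and launched with the NVDocker wrapper. The image tag and dataset path below are placeholders; the exact repository paths depend on your NGC account:

```shell
# Log in to the NVIDIA GPU Cloud registry (API key from your NGC account)
docker login nvcr.io

# Pull a framework container image (tag shown is illustrative)
docker pull nvcr.io/nvidia/tensorflow:17.12

# Launch it with GPU access via the nvidia-docker wrapper,
# mounting a local dataset directory into the container
nvidia-docker run --rm -it -v /raid/datasets:/datasets \
    nvcr.io/nvidia/tensorflow:17.12
```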
- 8 NVIDIA Tesla V100 “Volta” GPUs
- 40,960 NVIDIA CUDA cores, total
- Total of 256GB high-bandwidth GPU memory
- 60 TFLOPS double-precision, 120 TFLOPS single-precision, 960 Tensor TFLOPS with Tesla V100’s new Tensor Cores
- NVIDIA-certified & supported software stack for Deep Learning workloads
- Two 20-core Intel Xeon E5-2698v4 CPUs
- 512GB DDR4 2133MHz System Memory
- Dual X540 10GbE Ethernet ports (10GBase-T RJ45 ports)
- Four Mellanox ConnectX-4 100Gbps EDR InfiniBand ports
- One Gigabit Ethernet management port
- Four 1.92TB SSDs in RAID 0 (high-speed storage cache)
- 3U Rackmount Form Factor (for standard 19-inch rack)
- Redundant, Hot-Swap power supplies (four IEC C13 208V power connections on rear)
- Power Consumption: 3200W at full load
- Ubuntu Server Linux operating system
Please note that the ~35″ depth of this chassis (866mm) typically requires an extended-depth rackmount cabinet. Speak with one of our experts to determine if your existing rack is compatible.
- Onsite installation and integration by Microway’s HPC experts
- High-speed storage connectivity for data sets ranging from terabytes to petabytes in size
NVIDIA DGX-1 Part Numbers
920-22787-2511-000 – NVIDIA DGX-1 System for Commercial and Government institutions with Tesla V100 32GB (also available with EDU discounts)
920-22787-2510-000 – NVIDIA DGX-1 System for Commercial and Government institutions with Tesla V100 16GB (also available with EDU discounts)
920-22787-2500-000 – NVIDIA DGX-1 System for Commercial and Government institutions with Tesla P100
920-22787-25ED-000 – NVIDIA DGX-1 System for Academic institutions with Tesla P100
DGX Site Planning
A Microway Solutions Architect will provide remote assistance to you and your facilities staff in planning for the DGX-1’s unique power and cooling requirements. This includes rack diagramming with airflow and power cabling notation, participation in remote conference calls, and answering queries from facilities staff about support requirements of the DGX-1 hardware.
Onsite Installation and Integration
All DGX OS and container software will be installed, firmware upgraded to the latest versions, desired DGX containers installed, and deep learning test jobs run. Customers may pose questions to our experts. In some cases, factory-trained Microway experts may travel to your datacenter.
Container or Job Execution Script Creation
Creating an effective workflow is key to your success with any hardware resource. DGX-1’s unique container architecture means proper container management and even job execution scripts are a necessity. Microway experts will assist you in creating: your default DGX-1 containers, scripts to orchestrate the DGX-1 containers for multiple users in your organization, and methods of dynamically allocating GPUs as required to containers.
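A minimal sketch of such an orchestration script, assuming the nvidia-docker wrapper and a hypothetical scheme that partitions the DGX-1's eight GPUs into fixed per-user slots via the NV_GPU environment variable (the container image name is illustrative):

```shell
#!/bin/bash
# Assign a contiguous block of the DGX-1's eight GPUs to each user slot.
# With 2 GPUs per user: slot 0 -> 0,1  slot 1 -> 2,3  ...  slot 3 -> 6,7
assign_gpus() {
    local slot=$1 gpus_per_user=$2
    local first=$(( slot * gpus_per_user ))
    local last=$(( first + gpus_per_user - 1 ))
    seq -s, "$first" "$last"
}

# Launch a user's container restricted to its GPU set.
# NV_GPU is honored by the nvidia-docker wrapper; the image is a placeholder.
run_user_container() {
    local slot=$1
    NV_GPU=$(assign_gpus "$slot" 2) nvidia-docker run --rm -it \
        nvcr.io/nvidia/tensorflow:17.12
}
```

In practice, the slot-to-user mapping would come from a scheduler or login accounting rather than being hard-coded.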
Deep Learning Data Preparation
An overwhelming majority of the time in a deep learning project is spent on the preparation of data. Microway’s data-science consultant partners will engage with you to: create a custom scope of work, determine the best means to prepare your data for deep learning, create the pre-processing algorithms, assist in pre-processing the training data, and optionally determine effective means of measurement for the overall DL project. Additional services are also available.
| NVIDIA DGX-1 Support | |
| --- | --- |
| NVIDIA Cloud Management | ✔ |
| NVIDIA DGX-1 Software Upgrades | ✔ |
| NVIDIA DGX-1 Driver Updates | ✔ |
| NVIDIA DGX-1 Firmware Updates | ✔ |
| Hardware Support | 1 or 3 year subscription |
| Hardware SLA (replacement parts shipped) | 1 business day |
| Online Access | NVIDIA Enterprise Support Portal |
What’s Included in NVIDIA DGX-1 Support
- Access to the latest software updates and upgrades
- Direct communication with NVIDIA technical experts
- NVIDIA cloud management: container repository, container management, job scheduling, and system performance monitoring
- Searchable NVIDIA knowledge base with how-to articles, application notes and product documentation
- Rapid response and timely issue resolution through support portal and 24×7 phone access
- Lifecycle support for NVIDIA DGX-1 Deep Learning software
- Hardware support, firmware upgrades, diagnostics and remote and onsite resolution of hardware issues
- Next day shipment for replacement parts
System Price: $111,601 (academic pricing, includes 1 year support) to $159,430 (commercial, includes 1 year support)
Final pricing depends upon configuration and any applicable discounts, including education or NVIDIA Inception. Request a custom quotation to receive your applicable discounts.
NVIDIA requires all DGX purchases to include a support services contract. Ensure all quotes you receive include this mandatory DGX support.
Call a Microway Sales Engineer for Assistance: 508.746.7341 or
Click Here to Request a Quotation.
HPC workloads on NVIDIA DGX-1
The DGX-1 system can also be leveraged by users in other fields – many HPC workloads are well-suited to this platform. The eight built-in NVIDIA Tesla GPUs offer significant increases in floating-point throughput and memory bandwidth, both of which are critical for performance-demanding HPC workloads.
Connectivity between system components has also been improved, which is of vital importance for HPC workloads. The Volta architecture’s support for completely unified memory allows all GPUs to directly access each other’s memory, as well as directly access the memory on the host CPUs. To support such capabilities, the DGX-1 provides robust links between all system components, as shown in the block diagram below. The eight GPUs are connected in a hybrid cube mesh.
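The hybrid cube-mesh NVLink topology, and the resulting peer-to-peer paths between GPUs, can be inspected directly on the system using the standard NVIDIA driver utility (requires the appliance's GPUs and driver, so output below is only described, not shown):

```shell
# Print the GPU/NIC connectivity matrix. NVLink connections appear as
# NV1/NV2 entries; PCIe-only paths appear as PIX/PXB/PHB/SYS.
nvidia-smi topo -m
```

This is a quick way to verify which GPU pairs communicate over NVLink versus PCIe when planning multi-GPU HPC job placement.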