DGX POD


DGX + Parallel Storage Building Blocks for Scale-Out AI

DGX POD is an NVIDIA-validated building block of AI Compute & Storage for scale-out deployments.

Designed for the largest datasets, DGX POD solutions enable training at vastly improved performance compared to single systems. DGX POD also includes the AI data plane/storage, with capacity for training datasets, expandability for growth, and speed that keeps up with AI workloads.

Why DGX POD?

Fully Integrated Deployment

DGX POD solutions ship with the DGX AI appliance and parallel storage fully integrated

Extendable

Build complete racks with multiple DGX POD configurations

Validated Configuration

DGX POD solutions have been validated to deliver high storage throughput

Performance Improves Over Time

DGX POD receives ongoing performance improvements through NVIDIA software updates

Mid-Scale and Single Rack Solutions

1:1 DGX A100 POD with AI200X

1 NVIDIA DGX A100 with DDN AI200X

First Deployment of AI Compute & AI-Ready Parallel Storage

Specifications

  • 5 PFLOPS of AI Performance
  • DDN AI200X appliance with throughput up to 24GB/sec and 1.5 million IOPS (various data capacities available)
  • Total of 320GB or 640GB of GPU memory
  • Mellanox 200Gb HDR InfiniBand
  • NGC Containers with NVIDIA-optimized performance
  • Full parallel filesystem and DDN Management GUI
  • Seamless AI compute scaling: add DGX systems to increase AI performance (existing DDN AI200X has headroom to continue to scale data throughput)
  • Seamless storage throughput & capacity scaling: add AI200X appliances to double data bandwidth or grow data capacity (DDN EXAScaler Lustre filesystem is already built for expansion)
  • Superior Performance with GPUDirect Storage: a direct data path from storage to GPU memory over InfiniBand delivers faster performance for multiple users on a single DGX and on scale-out multi-DGX deployments
  • Validated Configuration: the scale-out AI performance of this design is validated by NVIDIA and DDN
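The scaling bullets above are simple linear arithmetic over per-unit figures. As a minimal sketch (Python, using only the numbers quoted in the spec lists on this page; the function and variable names are illustrative, not part of any NVIDIA or DDN tooling):

```python
# Back-of-the-envelope scaling math for the DGX POD configurations on
# this page (per-unit figures taken from the spec lists above).
DGX_A100_PFLOPS = 5                   # AI (tensor) PFLOPS per DGX A100
AI200X_GBPS, AI200X_MIOPS = 24, 1.5   # per DDN AI200X appliance
AI400X_GBPS, AI400X_MIOPS = 48, 3.0   # per DDN AI400X appliance

def pod_specs(num_dgx, num_appliances, gbps_per_appliance, miops_per_appliance):
    """Aggregate compute and storage figures for one POD configuration."""
    return {
        "pflops": num_dgx * DGX_A100_PFLOPS,
        "throughput_gbps": num_appliances * gbps_per_appliance,
        "miops": num_appliances * miops_per_appliance,
    }

# 4:2 full-rack configuration: 4x DGX A100 + 2x AI400X
full_rack = pod_specs(4, 2, AI400X_GBPS, AI400X_MIOPS)
print(full_rack)  # {'pflops': 20, 'throughput_gbps': 96, 'miops': 6.0}
```

The same function reproduces every configuration listed here, e.g. `pod_specs(20, 7, AI400X_GBPS, AI400X_MIOPS)` gives the 100 PFLOPS / 336GB/sec SuperPOD figures.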

Request Quotation

2:1 DGX A100 POD with AI400X

2 NVIDIA DGX A100s with DDN AI400X

Scale-Up AI Compute & AI-Ready Parallel Storage

Specifications

  • 10 PFLOPS of AI Performance
  • DDN AI400X appliance with throughput up to 48GB/sec and 3 million IOPS (various data capacities available)
  • Total of 640GB or 1280GB of GPU memory
  • Mellanox 200Gb HDR InfiniBand
  • NGC Containers with NVIDIA-optimized performance
  • Full parallel filesystem and DDN Management GUI
  • Seamless AI compute scaling: add DGX systems to increase AI performance (existing DDN AI400X has headroom to continue to scale data throughput)
  • Seamless storage throughput & capacity scaling: add AI400X appliances to double data bandwidth or grow data capacity (DDN EXAScaler Lustre filesystem is already built for expansion)
  • Superior Performance with GPUDirect Storage: a direct data path from storage to GPU memory over InfiniBand delivers faster performance for multiple users on a single DGX and on scale-out multi-DGX deployments
  • Validated Configuration: the scale-out AI performance of this design is validated by NVIDIA and DDN

Request Quotation

4:2 DGX A100 POD with AI400X

4 NVIDIA DGX A100s with 2 DDN AI400X

Full Rack of AI Compute & AI-Ready Parallel Storage

Specifications

  • 20 PFLOPS of AI Performance
  • DDN AI400X appliances with throughput up to 96GB/sec and 6 million IOPS (various data capacities available)
  • Total of 1.25TB or 2.5TB of GPU memory
  • Mellanox 200Gb HDR InfiniBand
  • NGC Containers with NVIDIA-optimized performance
  • Full parallel filesystem and DDN Management GUI
  • Seamless AI compute scaling: add DGX systems to increase AI performance (existing DDN AI400X systems have headroom to continue to scale data throughput)
  • Seamless storage throughput & capacity scaling: add AI400X appliances to increase bandwidth or grow data capacity (DDN EXAScaler Lustre filesystem is already built for expansion)
  • Superior Performance with GPUDirect Storage: a direct data path from storage to GPU memory over InfiniBand delivers faster performance for multiple users on a single DGX and on scale-out multi-DGX deployments
  • Validated Configuration: the scale-out AI performance of this design is validated by NVIDIA and DDN

Request Quotation

Something else?

Custom Parallel Storage Solutions

Looking for another filesystem (Spectrum Scale/GPFS, BeeGFS), scale, or capacity? Let us design a system that meets your needs.

Key Capabilities

  • Capacities of 1PB and beyond
  • Throughput from 100GB/sec to over 500GB/sec
  • Dynamic capacity expansion
  • Lustre, BeeGFS, or Spectrum Scale (formerly GPFS)

Contact Us

Multi-Rack Solutions and DGX SuperPOD

8:4 DGX A100 POD with AI400X

8 DGX A100 Systems with 4 DDN AI400X

Dual Rack of AI Compute & Ultra-High Throughput Parallel Storage

Specifications

  • 40 PFLOPS of AI Performance
  • DDN AI400X appliances with throughput up to 192GB/sec and 12 million IOPS (various data capacities available)
  • Total of 2.5TB or 5TB of GPU memory
  • Mellanox 200Gb HDR InfiniBand
  • NGC Containers with NVIDIA-optimized performance
  • Full parallel filesystem and DDN Management GUI
  • Superior Performance with GPUDirect Storage: a direct data path from storage to GPU memory over InfiniBand delivers faster performance for multiple users on a single DGX and on scale-out multi-DGX deployments
  • Validated Configuration: the scale-out AI performance of this design is validated by NVIDIA and DDN

Request Quotation

DGX SuperPOD 20-Node Deployment

20 NVIDIA DGX A100 Systems with 7 DDN AI400X

Record-Breaking, Large AI Cluster Building Block

Specifications

  • 100 PFLOPS of AI Performance
  • 7 DDN AI400X appliances with aggregate throughput up to 336GB/sec and 21 million IOPS (various data capacities available)
  • Total of 6.25TB or 12.5TB of GPU memory
  • Mellanox 200Gb HDR InfiniBand
  • NGC Containers with NVIDIA-optimized performance
  • Full parallel filesystem and DDN Management GUI
  • Record-Breaking Building Block: the scale-out design is the basis of NVIDIA’s record-breaking DGX SuperPOD deployment
  • Scales to Massive Deployments: deploy multiple 20-node building blocks for immense AI + storage deployments

Request Quotation

Something else?

Custom Parallel Storage Solutions

Looking for another filesystem (Spectrum Scale/GPFS, BeeGFS), scale, or capacity? Let us design a system that meets your needs.

Key Capabilities

  • Multi-PB Storage Capacities
  • Throughput >500GB/sec
  • Dynamic capacity expansion
  • Lustre, Spectrum Scale (formerly GPFS), or BeeGFS

Contact Us