Supercharge your next cluster with NVIDIA® H100, A100, L40, or A30 GPUs
Microway NVIDIA GPU Clusters
Microway’s fully integrated NVIDIA GPU clusters deliver supercomputing & AI performance at lower power and lower cost, and in far fewer systems, than CPU-only equivalents. These clusters are powered by NVIDIA H100, A100, L40, or A30 GPUs. NVIDIA datacenter GPUs scale to solve the world’s most important computing challenges more quickly and accurately.
Successfully deployed in demanding applications at research institutes, universities, and enterprises, NVIDIA GPUs drive the most powerful supercomputers worldwide.
Installed Software
A Microway NVIDIA GPU cluster comes installed, integrated, and tested with:
- Linux distribution of your choice, including Red Hat, Rocky, Ubuntu, Debian, openSUSE, or Gentoo
- NVIDIA HPC SDK and CUDA® SDK (a quick device-query check is sketched after this list)
- Optional NVIDIA AI Enterprise for AI Clusters
- NVIDIA Bright Cluster Manager, OpenHPC, or Microway Cluster Management Software (MCMS™) integrated with optional MPI Link-Checker™ Fabric Validation Suite
- Optional user-level application and library installations
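With the SDKs above in place, a quick sanity check on any node is to enumerate the GPUs through the CUDA runtime. The sketch below is illustrative only; the file name and build command are our own, not part of Microway's tooling.

```cuda
// check_gpus.cu: list every GPU visible to the CUDA runtime (illustrative sketch).
// Build (assuming nvcc is on the PATH): nvcc check_gpus.cu -o check_gpus
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("Detected %d CUDA device(s)\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("  GPU %d: %s, %.1f GB, compute capability %d.%d\n",
                    i, prop.name, prop.totalGlobalMem / 1e9, prop.major, prop.minor);
    }
    return 0;
}
```

Each delivered node should report the expected GPU count, model names, and memory capacities.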
What Makes a Microway Cluster Different?
Expert Guidance, Expert Design
Share the details of your application or code. Microway experts will help you evaluate hardware platforms for it, then design a custom configuration tuned to your specific needs and budget.
Intensive Burn-in Testing
Every Microway cluster receives up to one week of burn-in testing, including GPU stress tests designed to identify hardware faults and “infant mortality.” Your cluster is qualified at our facility, not yours, so you can get to work faster.
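For illustration, the kind of sustained arithmetic load that GPU burn-in tools apply looks roughly like the toy kernel below. This is a sketch of the general technique, not Microway's qualification suite; real burn-in adds result verification, thermal and power monitoring, and far longer run times.

```cuda
// gpu_burn_sketch.cu: toy fused multiply-add stress loop (illustrative only).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fma_stress(float *buf, size_t n, int iters) {
    size_t idx = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (idx >= n) return;
    float x = buf[idx];
    for (int i = 0; i < iters; ++i)
        x = fmaf(x, 1.000001f, 0.000001f);   // keep the floating-point units busy
    buf[idx] = x;
}

int main() {
    const size_t n = 1 << 26;                        // ~64M floats (~256 MB)
    float *d_buf = nullptr;
    cudaMalloc(&d_buf, n * sizeof(float));
    cudaMemset(d_buf, 0, n * sizeof(float));

    dim3 block(256), grid((unsigned)((n + 255) / 256));
    for (int pass = 0; pass < 100; ++pass)           // repeat to sustain load and heat
        fma_stress<<<grid, block>>>(d_buf, n, 10000);

    cudaError_t err = cudaDeviceSynchronize();       // surfaces any faults hit during the run
    std::printf("stress result: %s\n", cudaGetErrorString(err));
    cudaFree(d_buf);
    return 0;
}
```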
Complete Software Integration
Our team integrates all the drivers, packages, and SDKs that enable you to start working from day 1. Your cluster is also delivered with GPU-aware cluster management software and schedulers tested on real accelerated HPC/AI jobs.
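A GPU-aware scheduler typically hands each job its GPU allocation through the standard CUDA_VISIBLE_DEVICES environment variable, which the CUDA runtime honors by renumbering the allocated devices from zero. Below is a minimal sketch of what a job sees, assuming the scheduler (for example, Slurm with GPU resources configured) sets that variable:

```cuda
// job_gpus.cu: report the GPUs the scheduler allocated to this job (sketch).
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    // Set by GPU-aware schedulers; unset means every GPU in the node is visible.
    const char *visible = std::getenv("CUDA_VISIBLE_DEVICES");
    std::printf("CUDA_VISIBLE_DEVICES = %s\n", visible ? visible : "(unset)");

    int count = 0;
    if (cudaGetDeviceCount(&count) == cudaSuccess && count > 0)
        std::printf("This job can use %d GPU(s), numbered 0..%d\n", count, count - 1);
    return 0;
}
```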
Our Experience in GPU Systems
We’ve been delivering NVIDIA GPU Clusters for longer than NVIDIA datacenter GPUs have existed. Microway has hundreds of satisfied customers & thousands of GPUs in the field, and we’ll apply that expertise to your successful deployment.
Sample Microway NVIDIA GPU Cluster Specifications
High Density PCI-E
Two CPUs and Four NVIDIA A100 or A30 GPUs in NumberSmasher 1U Server
| GPUs per Node | (4) NVIDIA A100 or NVIDIA A30 |
|---|---|
| Sample Cluster Size | One fully-integrated 42U rackmount cabinet with 40 Nodes (160 GPUs) |
| Base Platform | NumberSmasher 1U 4 GPU Server |
| System Memory per Node | Up to 4 TB DDR4 |
| Total GPU Memory per Node | 320 GB HBM2 (NVIDIA A100); 96 GB HBM2 (NVIDIA A30) |
| Head Node | Dual Intel Xeon Scalable Processor Server (1U–4U) with up to 8 TB memory; Optional NVIDIA RTX™ Professional Graphics |
| Storage | Head Node: up to 648 TB; Compute Nodes: up to 8 TB; Optional Storage Servers or Parallel HPC Storage System |
| Ethernet Network | Dual 10 Gigabit Ethernet built-in; Optional 100Gb Ethernet |
| HPC Interconnect (optional) | ConnectX-6® 200Gb HDR or ConnectX-5 100Gb EDR InfiniBand Fabric |
| Cabinet | 42U APC NetShelter Cabinet (extra-depth model required due to chassis depth) |
| Green HPC Features | High-efficiency (80PLUS Platinum-Level) power supplies; Software/firmware to reduce power consumption on idle cores; Optional liquid-cooled rack doors (for thermally-neutral HPC) |
High Density with NVLink
Two CPUs and Four NVIDIA NVLink® GPUs with 2U Navion compute nodes
| GPUs per Node | (4) NVIDIA A100 with NVLink 3.0 |
|---|---|
| Sample Cluster Size | One fully-integrated 42U rackmount cabinet with 20 Nodes (80 GPUs) |
| Base Platform | Navion 2U NVIDIA A100 GPU Server with NVLink; dense configuration with the latest NVIDIA A100 GPUs and NVLink 3.0 interconnect (a peer-to-peer check is sketched after this table) |
| System Memory per Node | Up to 8 TB DDR4 |
| Total GPU Memory per Node | 160 GB HBM2 or 320 GB HBM2e |
| Head Node | Dual AMD EPYC Server (1U–4U) with up to 8 TB memory; Optional NVIDIA RTX™ Professional Graphics |
| Storage | Head Node: up to 648 TB; Compute Nodes: up to 8 TB; Optional Storage Servers or Parallel HPC Storage System |
| Ethernet Network | Dual 10 Gigabit Ethernet built-in; Optional 100Gb Ethernet |
| HPC Interconnect (optional) | ConnectX-6® 200Gb HDR or ConnectX-5 100Gb EDR InfiniBand Fabric |
| Cabinet | 42U APC NetShelter Cabinet (extra-depth model required due to chassis depth) |
| Green HPC Features | High-efficiency (80PLUS Platinum-Level) power supplies; Software/firmware to reduce power consumption on idle cores; Optional liquid-cooled rack doors (for thermally-neutral HPC) |
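The NVLink interconnect in the configuration above can be verified from user code: GPUs connected by NVLink report peer-to-peer access to one another, and peer traffic then moves over the NVLink fabric rather than PCI-E. The sketch below is illustrative only; the file name and output format are our own.

```cuda
// p2p_check.cu: report which GPU pairs support direct peer-to-peer access (sketch).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    for (int a = 0; a < n; ++a) {
        for (int b = 0; b < n; ++b) {
            if (a == b) continue;
            int ok = 0;
            cudaDeviceCanAccessPeer(&ok, a, b);  // 1 if GPU a can address GPU b directly
            std::printf("GPU %d -> GPU %d : peer access %s\n",
                        a, b, ok ? "supported" : "not supported");
        }
    }
    return 0;
}
```

Note that peer access can also be reported over PCI-E on some platforms; bandwidth measurements or tools such as nvidia-smi topo distinguish NVLink links from PCI-E paths.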
Balanced & Flexible
Two CPUs and up to four mixed GPUs with 4U NumberSmasher compute nodes
| GPUs per Node | (4) NVIDIA H100, A100, A30, or L40 GPUs |
|---|---|
| Sample Cluster Size | One fully-integrated 42U rackmount cabinet with 10 Nodes (40 GPUs) |
| Base Platform | NumberSmasher 4U Tower/GPU Server; supports NVIDIA Datacenter GPUs for compute, RTX Professional GPUs for visualization, or a mix as appropriate for your workload; add other PCI-E devices |
| System Memory per Node | Up to 4 TB DDR5 |
| Total GPU Memory per Node | 320 GB (NVIDIA H100); 320 GB (NVIDIA A100); 96 GB (NVIDIA A30) |
| Head Node | Dual Xeon Server (1U–4U) with up to 8 TB memory; Optional NVIDIA RTX Professional Graphics/NVIDIA Remote Visualization |
| Storage | Head Node: up to 648 TB; Compute Nodes: up to 300 TB; Optional Storage Servers or Parallel HPC Storage System |
| Network | Optional 10Gb or 100Gb Ethernet |
| HPC Interconnect (optional) | ConnectX-7 400Gb NDR or ConnectX-6 200Gb HDR InfiniBand Fabric |
| Cabinet | 42U APC NetShelter Cabinet |
| Green HPC Features | High-efficiency (80PLUS Platinum-Level) power supplies; Software/firmware to reduce power consumption on idle cores; Optional liquid-cooled rack doors (for thermally-neutral HPC) |
Max GPUs
Two CPUs and up to ten GPUs with Navion 4U or Octoputer compute nodes
| GPUs per Node | (8 or 10) NVIDIA H100, A100, A30, or L40; Optional NVIDIA RTX Professional Graphics/NVIDIA Remote Visualization in an additional slot |
|---|---|
| Sample Cluster Size | One fully-integrated 42U rackmount cabinet with 9 Nodes (72 GPUs for RDMA; 90 GPUs for Density) |
| Base Platform | Navion 4U GPU Server with NVIDIA A100 GPUs; Octoputer 4U 8/10 GPU Server |
| System Memory per Node | Up to 8 TB DDR4/DDR5 |
| Total GPU Memory per Node | 800 GB (NVIDIA H100); 800 GB (NVIDIA A100); 240 GB (NVIDIA A30) |
| Head Node | Dual Xeon or AMD EPYC Server (1U–4U) with up to 8 TB memory; Optional NVIDIA RTX Professional Graphics/NVIDIA Remote Visualization |
| Storage | Head Node: up to 648 TB; Compute Nodes: up to 720 TB; Optional Storage Servers or Parallel HPC Storage System |
| Network | Dual Gigabit Ethernet built-in; Optional 10Gb or 100Gb Ethernet |
| HPC Interconnect (optional) | ConnectX-7 400Gb NDR or ConnectX-6 200Gb HDR InfiniBand Fabric |
| Cabinet | 42U APC NetShelter Cabinet |
| Green HPC Features | High-efficiency (80PLUS Platinum/Titanium-Level) power supplies; Software/firmware to reduce power consumption on idle cores; Optional liquid-cooled rack doors (for thermally-neutral HPC) |
Max GPUs+NVLink
Two CPUs and eight NVLink-connected GPUs with Navion 4U or Octoputer compute nodes
| GPUs per Node | (8) NVIDIA H100 with NVLink 4.0; (8) NVIDIA A100 with NVLink 3.0 |
|---|---|
| Sample Cluster Size | One fully-integrated 42U rackmount cabinet with 9 Nodes (72 GPUs) |
| Base Platform | Octoputer 6U with NVLink – HGX H100; Navion 4U 8 GPU NVIDIA A100 Server with NVLink – HGX A100; Octoputer 4U with NVLink – HGX A100 |
| System Memory per Node | Up to 8 TB DDR4/DDR5 |
| Total GPU Memory per Node | 640 GB (NVIDIA H100); 640 GB (NVIDIA A100) |
| Head Node | Dual Xeon Server (1U–4U) with up to 8 TB memory; Optional NVIDIA RTX Professional Graphics/NVIDIA Remote Visualization |
| Storage | Head Node: up to 648 TB; Compute Nodes: up to 12 TB; Optional Storage Servers or Parallel HPC Storage System |
| Network | Optional 10Gb or 100Gb Ethernet |
| HPC Interconnect (optional) | ConnectX-7 400Gb NDR or ConnectX-6 200Gb HDR InfiniBand Fabric |
| Cabinet | 42U APC NetShelter Cabinet |
| Green HPC Features | High-efficiency (80PLUS Platinum/Titanium-Level) power supplies; Software/firmware to reduce power consumption on idle cores; Optional liquid-cooled rack doors (for thermally-neutral HPC) |
CPU:GPU Coherence
Two POWER9 (with NVLink) CPUs and four or six Tesla V100 GPUs with AC922 compute nodes
| GPUs per Node | (4) Tesla V100 SXM2.0 (air cooled); (6) Tesla V100 SXM2.0 (liquid cooled) |
|---|---|
| Sample Cluster Size | One fully-integrated 42U rackmount cabinet with 18 Power Systems AC922 Nodes (64 GPUs and 32 CPUs) |
| Base Platform | Power Systems AC922 with Tesla V100 with NVLink nodes; world's first CPU:GPU coherence: POWER9 CPU and Tesla V100 GPU share the same memory space (a unified-memory sketch follows this table); only platform with CPU:GPU NVLink, with no PCI-E data bottleneck between POWER9 CPU and Tesla GPU |
| System Memory per Node | Up to 2 TB DDR4 |
| Total GPU Memory per Node | 64 GB (4 GPU node, air cooled); 96 GB (6 GPU node, liquid cooled) |
| Head Node | Dual POWER9 Server (1U–2U) with up to 1 TB memory |
| Storage | Head Node: up to 216 TB; Compute Nodes: up to 4 TB; Optional Storage Servers or Parallel HPC Storage System |
| Network | Dual Gigabit Ethernet built-in; Optional 10Gb/100Gb Ethernet |
| HPC Interconnect (optional) | ConnectX-6 200Gb HDR or ConnectX-5 100Gb EDR InfiniBand Fabric |
| Cabinet | 42U APC NetShelter Cabinet |
| Green HPC Features | High-efficiency (80PLUS Platinum-Level) power supplies; Software/firmware to reduce power consumption on idle cores; Optional liquid cooling of nodes; Optional liquid-cooled rack doors (for thermally-neutral HPC) |
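The CPU:GPU coherence called out in the Base Platform row above means host and device code can work on a single allocation without explicit copies. The sketch below uses standard CUDA managed memory to show the programming model; it is illustrative only, and on the AC922 the resulting traffic moves over NVLink rather than PCI-E.

```cuda
// coherence_sketch.cu: CPU and GPU touch one managed allocation (sketch).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(double *x, int n, double a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const int n = 1 << 20;
    double *x = nullptr;
    cudaMallocManaged(&x, n * sizeof(double));   // one pointer, visible to CPU and GPU

    for (int i = 0; i < n; ++i) x[i] = 1.0;      // CPU writes
    scale<<<(n + 255) / 256, 256>>>(x, n, 2.0);  // GPU updates the same memory in place
    cudaDeviceSynchronize();
    std::printf("x[0] = %.1f (expect 2.0)\n", x[0]);  // CPU reads the GPU's result

    cudaFree(x);
    return 0;
}
```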
Supported for Life
Our technicians and sales staff consistently ensure that your entire experience with Microway is handled promptly, creatively, and professionally.
Telephone support from Microway’s experienced technicians is available for the lifetime of your cluster. After the initial warranty period, hardware warranties are offered on an annual basis. Out-of-warranty repairs are available on a time & materials basis.