In-Depth Comparison of NVIDIA Quadro “Turing” GPU Accelerators

This article provides in-depth details of the NVIDIA Quadro RTX “Turing” GPUs. NVIDIA “Turing” GPUs bring an evolved core architecture and add dedicated ray tracing units to the previous-generation “Volta” architecture. Turing GPUs began shipping in late 2018.

Important features available in the “Turing” GPU architecture include:

  • New RT (ray tracing) Cores, which provide hardware-accelerated real-time ray tracing for the first time
  • Evolved Deep Learning performance with over 130 Tensor TFLOPS (training) and 500 TOPS INT4 (inference) throughput
  • NVLink 2.0 between GPUs (when optional NVLink bridges are added), supporting up to 2 bricks and up to 100 GB/sec bidirectional bandwidth
  • New GDDR6 Memory with a substantial improvement in memory performance compared to previous-generation GPUs.
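
The NVLink bandwidth quoted above follows from simple per-link arithmetic. The sketch below is a minimal illustration in Python, assuming 25 GB/s per direction per NVLink 2.0 brick and PCI-E 3.0 signalling at 8 GT/s per lane with 128b/130b encoding. These per-link constants and the function names are assumptions made for illustration and are not stated in this article.

```python
"""Back-of-envelope interconnect bandwidth for Turing-class GPUs."""

# Assumed per-link constants (not taken from the article).
NVLINK2_GB_S_PER_DIRECTION_PER_BRICK = 25.0  # GB/s in each direction
PCIE3_GT_S_PER_LANE = 8.0                    # gigatransfers/s per lane
PCIE3_ENCODING_EFFICIENCY = 128 / 130        # 128b/130b line coding


def nvlink_bidirectional_gb_s(bricks: int) -> float:
    """Theoretical bidirectional NVLink bandwidth for a number of bricks."""
    return 2 * NVLINK2_GB_S_PER_DIRECTION_PER_BRICK * bricks


def pcie3_bidirectional_gb_s(lanes: int = 16) -> float:
    """Theoretical bidirectional PCI-E 3.0 bandwidth after line coding."""
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    per_direction = PCIE3_GT_S_PER_LANE * lanes * PCIE3_ENCODING_EFFICIENCY / 8
    return 2 * per_direction


if __name__ == "__main__":
    for bricks in (1, 2):
        total = nvlink_bidirectional_gb_s(bricks)
        print(f"NVLink 2.0, {bricks} brick(s): {total:.0f} GB/s bidirectional")
    print(f"PCI-E 3.0 x16: {pcie3_bidirectional_gb_s():.1f} GB/s bidirectional")
```

Under these assumptions, two bricks give the 100 GB/sec figure in the list above, and one brick gives half of that. The PCI-E 3.0 x16 result comes out slightly below 32 GB/s because of encoding overhead; the commonly quoted 32 GB/s is the raw signalling rate.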

Quadro “Turing” GPU Specifications

The table below summarizes the features of the available Quadro Turing GPU Accelerators. To learn more about these products, or to find out how best to leverage their capabilities, please speak with an HPC expert.

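| Feature | Quadro RTX 8000 | Quadro RTX 6000 | Quadro RTX 5000 | Quadro RTX 4000 |
|---|---|---|---|---|
| GPU Chip(s) | Turing, TU102 | Turing, TU102 | Turing, TU104 | Turing, TU106 |
| Tensor FLOPS | 130.5 Tensor TFLOPS* | 130.5 Tensor TFLOPS* | 89.2 Tensor TFLOPS* | 57.0 Tensor TFLOPS* |
| Integer Operations (INT4) | 522 TOPS* | 522 TOPS* | 356.8 TOPS* | Unknown |
| Integer Operations (INT8) | 261 TOPS* | 261 TOPS* | 178.4 TOPS* | Unknown |
| Half Precision (FP16) | 32.6 TFLOPS | 32.6 TFLOPS | 22.3 TFLOPS | 14.2 TFLOPS |
| Single Precision (FP32) | 16.3 TFLOPS* | 16.3 TFLOPS* | 11.2 TFLOPS* | 7.1 TFLOPS* |
| Double Precision (FP64) | 0.509 TFLOPS* | 0.509 TFLOPS* | 0.350 TFLOPS* | 0.222 TFLOPS* |
| Ray Tracing | 10 GigaRays/sec | 10 GigaRays/sec | 8 GigaRays/sec | 6 GigaRays/sec |
| # of CUDA Cores | 4608 | 4608 | 3072 | 2304 |
| # of Turing Tensor Cores | 576 | 576 | 384 | 288 |
| # of SM Units | 72 | 72 | 48 | 36 |
| # of RT Cores | 72 | 72 | 48 | 36 |
| GPU Base Clock | 1455 MHz | 1455 MHz | 1620 MHz | Unknown |
| GPU Boost Clock | 1770 MHz | 1770 MHz | 1815 MHz | Unknown |
| GDDR6 Memory | 48 GB | 24 GB | 16 GB | 8 GB |
| Memory Bandwidth | 672 GB/sec | 672 GB/sec | 448 GB/sec | 416 GB/sec |
| Interconnect | PCI-E 3.0 + optional NVLink 2.0 (2 bricks) | PCI-E 3.0 + optional NVLink 2.0 (2 bricks) | PCI-E 3.0 + optional NVLink 2.0 (1 brick) | PCI-E 3.0 |
| Theoretical transfer bandwidth (bidirectional) | 100 GB/s NVLink; 32 GB/s PCI-E x16 3.0 | 100 GB/s NVLink; 32 GB/s PCI-E x16 3.0 | 50 GB/s NVLink; 32 GB/s PCI-E x16 3.0 | 32 GB/s PCI-E x16 3.0 |
| Achievable transfer bandwidth | ~94 GB/s NVLink; ~12 GB/s PCI-E x16 3.0 | ~94 GB/s NVLink; ~12 GB/s PCI-E x16 3.0 | ~12 GB/s PCI-E x16 3.0 | ~12 GB/s PCI-E x16 3.0 |
| GPU Boost Support | Yes (Dynamic) | Yes (Dynamic) | Yes (Dynamic) | Yes (Dynamic) |
| Workstation Support | Yes | Yes | Yes | Yes |
| Server Support | Yes (passive GPU version, specific server models only) | Yes (passive GPU version, specific server models only) | Yes (passive GPU version, specific server models only) | Yes (passive GPU version, specific server models only) |
| Wattage (TDP) | 295W | 295W | 265W | 160W |
| Cooling Type | Active† | Active† | Active | Active |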

* FLOPS and TOPS calculations are presented at Max Boost
† Passively-cooled models are available with slightly reduced clock speeds
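
The starred figures in the table can be approximated from the core counts and boost clocks it lists. The following Python sketch is a rough illustration, not NVIDIA's published method. It assumes 2 FLOPs per CUDA core per clock (one fused multiply-add), 64 FP16 fused multiply-adds per Tensor Core per clock, an FP64 rate of 1/32 of FP32, and INT8/INT4 Tensor rates of 2x and 4x the FP16 Tensor rate. None of these ratios are stated in this article, and the `GpuSpec` class and function names are hypothetical.

```python
"""Estimate peak throughput at boost clock from published core counts."""
from dataclasses import dataclass

# Assumed architectural ratios (not stated in the article).
FLOPS_PER_FMA = 2                    # a fused multiply-add counts as 2 FLOPs
FMA_PER_TENSOR_CORE_PER_CLOCK = 64   # FP16 FMAs per Tensor Core per clock
FP64_TO_FP32_RATIO = 1 / 32          # FP64 rate relative to FP32
INT8_TENSOR_MULTIPLIER = 2           # INT8 ops relative to FP16 Tensor ops
INT4_TENSOR_MULTIPLIER = 4           # INT4 ops relative to FP16 Tensor ops


@dataclass(frozen=True)
class GpuSpec:
    """Hypothetical container for the table columns used below."""
    name: str
    cuda_cores: int
    tensor_cores: int
    boost_mhz: float


def fp32_tflops(gpu: GpuSpec) -> float:
    """Peak FP32 TFLOPS: cores x 2 FLOPs x clock (MHz), scaled to tera."""
    return gpu.cuda_cores * FLOPS_PER_FMA * gpu.boost_mhz / 1e6


def tensor_tflops(gpu: GpuSpec) -> float:
    """Peak FP16 Tensor TFLOPS under the assumed per-core FMA rate."""
    per_clock = gpu.tensor_cores * FMA_PER_TENSOR_CORE_PER_CLOCK * FLOPS_PER_FMA
    return per_clock * gpu.boost_mhz / 1e6


def peak_summary(gpu: GpuSpec) -> dict:
    """Collect the derived peak figures for one GPU."""
    fp32 = fp32_tflops(gpu)
    tensor = tensor_tflops(gpu)
    return {
        "FP32 TFLOPS": fp32,
        "FP16 TFLOPS": 2 * fp32,  # assumes FP16 runs at twice the FP32 rate
        "FP64 TFLOPS": fp32 * FP64_TO_FP32_RATIO,
        "Tensor TFLOPS": tensor,
        "INT8 TOPS": tensor * INT8_TENSOR_MULTIPLIER,
        "INT4 TOPS": tensor * INT4_TENSOR_MULTIPLIER,
    }


if __name__ == "__main__":
    # Core counts and boost clocks as listed in the table above.
    gpus = [
        GpuSpec("Quadro RTX 8000 / RTX 6000", 4608, 576, 1770.0),
        GpuSpec("Quadro RTX 5000", 3072, 384, 1815.0),
    ]
    for gpu in gpus:
        print(gpu.name)
        for label, value in peak_summary(gpu).items():
            print(f"  {label}: {value:.3f}")
```

With the Quadro RTX 8000/6000 inputs (4608 CUDA cores, 576 Tensor Cores, 1770 MHz boost), these assumptions give roughly 16.3 FP32 TFLOPS and 130.5 Tensor TFLOPS, in line with the table. Small differences in the last digit can appear where the table values were derived from already-rounded numbers.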
