With the release of the Tesla M40, NVIDIA continues to diversify its professional compute GPU lineup. Designed specifically for Deep Learning applications, the M40 delivers 7 TFLOPS of single-precision floating-point performance and 12GB of high-speed GDDR5 memory. It works extremely well with the popular Deep Learning software frameworks and may also find its way into other industries that need single-precision performance.
The Tesla M40 is also notable for being the first Tesla GPU based on NVIDIA's "Maxwell" GPU architecture. "Maxwell" provides excellent performance per watt: this GPU delivers its 7 TFLOPS within a 250W power envelope.
Maximum single-GPU performance: Tesla M40 12GB GPU
Available in Microway NumberSmasher GPU Servers and GPU Clusters
Specifications
- 3072 CUDA GPU cores (GM200)
- 7.0 TFLOPS single-precision; 0.21 TFLOPS double-precision
- 12GB GDDR5 memory
- Memory bandwidth up to 288 GB/s
- PCI-E x16 Gen3 interface to system
- Dynamic GPU Boost for optimal clock speeds
- Passive heatsink design for installation in qualified GPU servers
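The headline numbers in the list above follow directly from the chip's configuration. As a rough sanity check (assuming the published M40 boost clock of ~1114 MHz, a 384-bit GDDR5 bus at a 6.0 GHz effective data rate, and Maxwell's 1/32 double-precision ratio, none of which are stated in this article):

```python
# Back-of-the-envelope check of the Tesla M40 spec sheet.
# Assumed inputs (not from this article): 1114 MHz boost clock,
# 384-bit GDDR5 bus at 6.0 GHz effective, DP ratio of 1/32.

cuda_cores = 3072
boost_clock_hz = 1114e6          # assumed boost clock
flops_per_core_per_cycle = 2     # one fused multiply-add = 2 FLOPs

sp_tflops = cuda_cores * boost_clock_hz * flops_per_core_per_cycle / 1e12
dp_tflops = sp_tflops / 32       # Maxwell GM200 FP64 ratio

bus_width_bits = 384
effective_rate_hz = 6.0e9        # GDDR5 effective (quad-pumped) data rate
mem_bw_gbs = bus_width_bits / 8 * effective_rate_hz / 1e9

print(f"Single-precision: {sp_tflops:.2f} TFLOPS")  # ~6.84, marketed as 7
print(f"Double-precision: {dp_tflops:.2f} TFLOPS")  # ~0.21
print(f"Memory bandwidth:  {mem_bw_gbs:.0f} GB/s")  # 288 GB/s
```

The ~6.84 TFLOPS figure rounds to the marketed 7 TFLOPS, and the 288 GB/s matches the memory-bandwidth line above.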
As with all other modern Tesla GPUs, you can expect the M40 to max out the PCI-E 3.0 x16 bus, achieving ~12 GB/s of data transfers between the system and each GPU:
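The ~12 GB/s figure is consistent with PCI-E 3.0 arithmetic: 16 lanes at 8 GT/s with 128b/130b line encoding yield 15.75 GB/s of theoretical per-direction bandwidth, and real transfers typically reach roughly three quarters of that once packet and protocol overhead are paid. A quick sketch of that calculation:

```python
# Theoretical PCI-E 3.0 x16 bandwidth, one direction.
lanes = 16
transfer_rate_gt = 8.0        # GT/s per lane for PCI-E 3.0
encoding = 128 / 130          # 128b/130b line encoding

theoretical_gbs = lanes * transfer_rate_gt * encoding / 8  # bits -> bytes
print(f"Theoretical: {theoretical_gbs:.2f} GB/s")          # 15.75 GB/s

# Observed host<->GPU transfers also pay TLP header and protocol
# overhead, so ~12 GB/s is about 76% of the theoretical figure.
print(f"Observed 12 GB/s = {12 / theoretical_gbs:.0%} of theoretical")
```

In practice this is the number reported by tools such as the `bandwidthTest` sample that ships with the CUDA Toolkit, measured with pinned host memory.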