Graphical Tool for Validating Workstation and Clustered GPUs
Microway’s GPU-Checker utility validates a single GPU or a cluster of GPUs from a single interface. GPUs are automatically detected, queried, and tested on each system; the user only needs to specify the list of host systems to test.
Designed specifically for NVIDIA’s professional Quadro and Tesla GPU products, the tool monitors the health of each GPU while tests are run. Metrics include:
- Correctable and Uncorrectable ECC memory errors
- Retired and Pending memory pages
- Power consumption (compared to TDP)
- Temperature
- Memory and GPU clock speeds
- PCI-Express width and generation
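For readers who want to inspect the same kinds of metrics by hand, the sketch below collects them with NVIDIA's `nvidia-smi` command-line tool. It is not GPU-Checker's implementation, which is not described here. The helper names (`parse_gpu_csv`, `power_fraction`, `query_gpus`) are hypothetical, and the driver's enforced power limit (`power.limit`) is used as an assumed stand-in for TDP.

```python
import csv
import io
import subprocess

# Field names accepted by `nvidia-smi --query-gpu`, chosen to mirror the
# metrics listed above. `power.limit` is an assumed stand-in for TDP.
QUERY_FIELDS = [
    "index",
    "name",
    "ecc.errors.corrected.volatile.total",
    "ecc.errors.uncorrected.volatile.total",
    "retired_pages.pending",
    "power.draw",
    "power.limit",
    "temperature.gpu",
    "clocks.sm",
    "clocks.mem",
    "pcie.link.gen.current",
    "pcie.link.width.current",
]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        values = [value.strip() for value in record]
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def power_fraction(row):
    """Return power draw divided by the power limit, or None if unavailable."""
    try:
        return float(row["power.draw"]) / float(row["power.limit"])
    except (KeyError, ValueError, ZeroDivisionError):
        # Unsupported fields are reported as text such as "[N/A]".
        return None


def query_gpus():
    """Run nvidia-smi on the local machine and return the parsed metrics."""
    command = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` requires an NVIDIA driver on the local machine. Some fields are reported as unsupported on GPUs without ECC memory, which is why `power_fraction` and any similar helper should tolerate non-numeric values.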
Built on the same tools that Microway uses to verify GPU cluster health, GPU-Checker runs each GPU through a battery of compute-intensive and memory-intensive tests. High-intensity stress tests verify that GPU-dense systems do not overheat under heavy load. Memory-intensive modes validate the local and global memory systems, and the memory check modes also catch errors on GPUs that are running with ECC disabled.
GPU-Checker supports a variety of run modes:
- Single GPU on local computer
- Multiple GPUs on local computer
- Multiple GPUs on multiple remote computers
- Multiple GPUs on local and multiple remote computers
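To illustrate what mixing local and remote systems can look like in practice, the sketch below runs a short `nvidia-smi` query on each entry of a host list, locally for `localhost` and over `ssh` for everything else. This is an assumption-laden illustration rather than how GPU-Checker connects to hosts, which is not stated here. The function names and the example host names are hypothetical, and passwordless `ssh` access to the remote systems is assumed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Names treated as "this machine" (an assumption made for this sketch).
LOCAL_NAMES = {"localhost", "127.0.0.1"}

# A short nvidia-smi query; any other per-host command could be substituted.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu",
    "--format=csv,noheader",
]


def build_command(host):
    """Return the argv that runs QUERY locally or, for other hosts, over ssh."""
    if host in LOCAL_NAMES:
        return list(QUERY)
    # BatchMode makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host] + QUERY


def query_host(host, timeout=30):
    """Run the query for one host and return (host, output or error text)."""
    try:
        result = subprocess.run(
            build_command(host),
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
        return host, result.stdout.strip()
    except (subprocess.SubprocessError, OSError) as error:
        return host, "ERROR: {}".format(error)


def query_hosts(hosts):
    """Query all hosts concurrently; results keep the order of the input list."""
    if not hosts:
        return []
    with ThreadPoolExecutor(max_workers=min(8, len(hosts))) as pool:
        return list(pool.map(query_host, hosts))
```

For example, `query_hosts(["localhost", "node01", "node02"])` would cover the local machine and two remote systems in one call. A host that cannot be reached produces an error entry instead of stopping the whole run.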
Questions and Price Inquiries
If you would like to learn more about Microway GPU-Checker, please contact one of our HPC experts.