Introducing the NVIDIA Tesla K80 GPU Accelerator (Kepler GK210)

NVIDIA has once again raised the bar on GPU computing with the release of the new Tesla K80 GPU accelerator. Delivering up to 8.74 TFLOPS of single-precision performance with GPU Boost, the Tesla K80 combines very high throughput with leading density.

NVIDIA Tesla K80

Here are the important performance specifications:

  • Two GK210 chips on a single PCB
  • 4992 total SMX CUDA cores: 2496 on each chip!
  • Total of 24GB GDDR5 memory; aggregate memory bandwidth of 480GB/sec
  • 5.6 TFLOPS single precision, 1.87 TFLOPS double precision at the default clock speed
  • 8.74 TFLOPS single precision, 2.91 TFLOPS double precision at the maximum GPU Boost clock
  • 300W TDP
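
As a sanity check on the TFLOPS figures above, here is a minimal sketch that derives theoretical peak throughput from the core count and the clock speeds quoted later in this post. It rests on two assumptions that the post does not state: each CUDA core retires one fused multiply-add (2 floating-point operations) per clock, and double precision runs at one third of the single-precision rate. The function and constant names are ours, for illustration only.

```python
# Assumptions (not stated in the post): 2 FLOPs per CUDA core per clock
# (one fused multiply-add) and a 1:3 double- to single-precision ratio.
FLOPS_PER_CORE_PER_CLOCK = 2
DP_TO_SP_RATIO = 1 / 3

TOTAL_CUDA_CORES = 4992   # both GK210 chips, from the spec list
BASE_MHZ = 560            # default clock quoted in the post
BOOST_MHZ = 875           # maximum GPU Boost clock quoted in the post


def peak_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical single-precision peak in TFLOPS."""
    flops = cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


for label, mhz in (("base", BASE_MHZ), ("boost", BOOST_MHZ)):
    sp = peak_tflops(TOTAL_CUDA_CORES, mhz)
    dp = sp * DP_TO_SP_RATIO
    print(f"{label:>5}: {sp:.2f} TFLOPS SP, {dp:.2f} TFLOPS DP")
```

Under these assumptions the results land within about 0.01 TFLOPS of the listed figures. The small gap at the default clock is consistent with the quoted 560MHz being a rounded value. These are theoretical peaks, not measured application performance.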

To achieve this performance, Tesla K80 is really two GPUs in one. This Tesla K80 block diagram illustrates how each GK210 GPU has its own dedicated memory and how both GPUs share the card's PCIe x16 connection to the host through an on-board PCIe switch:
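
Because each GK210 has its own dedicated memory, the totals in the spec list are not available to a single GPU, and an application has to divide its work across both GPUs to use the whole card. The sketch below illustrates that bookkeeping in plain Python. It assumes the 24GB and 480GB/sec totals are split evenly between the two GPUs; `GpuShare`, `per_gpu_shares`, `split_evenly`, and `fits_on_one_gpu` are hypothetical helper names, not part of any NVIDIA API.

```python
from dataclasses import dataclass


@dataclass
class GpuShare:
    device_index: int
    memory_gb: float
    bandwidth_gb_s: float


def per_gpu_shares(total_memory_gb=24.0, total_bandwidth_gb_s=480.0, num_gpus=2):
    """Split the board totals evenly across GPUs (assumed even split)."""
    return [
        GpuShare(i, total_memory_gb / num_gpus, total_bandwidth_gb_s / num_gpus)
        for i in range(num_gpus)
    ]


def split_evenly(num_items, num_gpus=2):
    """Return one (start, stop) index range per GPU, covering all items."""
    base, extra = divmod(num_items, num_gpus)
    ranges, start = [], 0
    for i in range(num_gpus):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def fits_on_one_gpu(working_set_gb, shares):
    """True if the working set fits in the memory of a single GPU."""
    return working_set_gb <= min(s.memory_gb for s in shares)


shares = per_gpu_shares()
for share, (start, stop) in zip(shares, split_evenly(1_000_000)):
    print(f"GPU {share.device_index}: {share.memory_gb:.0f}GB, "
          f"{share.bandwidth_gb_s:.0f}GB/sec, items [{start}, {stop})")
print("16GB working set fits on one GPU:", fits_on_one_gpu(16, shares))
```

The practical point is that a 16GB working set, while smaller than the 24GB total, would not fit on either GPU under this assumption and would need to be partitioned.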

Tesla K80 block diagram

In order to maintain a TDP rating of 300W, the default clock speed is 560MHz, rather than the 745MHz of Tesla K40. Both the K80 and the K40, however, have the same maximum boost speed of 875MHz. NVIDIA has also evolved the GPU Boost feature substantially: GPU Boost is now applied dynamically. Rather than three manually selected levels, more than ten levels of boost are available for every application run, whenever thermals permit.
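
To put those clock speeds in perspective, the short sketch below computes how much headroom GPU Boost offers over the default clock on each card, using only the figures quoted above. The function name `boost_headroom` is ours.

```python
def boost_headroom(base_mhz: float, boost_mhz: float) -> float:
    """Fractional clock increase from the default clock to the maximum boost clock."""
    return boost_mhz / base_mhz - 1.0


MAX_BOOST_MHZ = 875  # shared by Tesla K80 and Tesla K40, per the post

for name, base_mhz in (("Tesla K80", 560), ("Tesla K40", 745)):
    headroom = boost_headroom(base_mhz, MAX_BOOST_MHZ)
    print(f"{name}: up to {headroom:.0%} above the default clock")
```

The K80 has roughly 56% of clock headroom against roughly 17% for the K40, which suggests that how often your application can run at boosted clocks matters considerably more on the K80.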

Tesla K80 is also remarkable for its density. Packaging two GPUs on a single PCB enables existing or slightly modified server designs to support twice as many Tesla GPUs (2X the density). Balancing clock speeds and CUDA core count within the 300W TDP also delivers the best performance per watt of any Tesla GPU to date!
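
For a rough sense of the performance-per-watt claim, this sketch divides the theoretical single-precision peaks from the spec list by the 300W TDP. It assumes the card draws exactly its TDP, which is a simplification: actual power draw varies by workload, and boost clocks are only reached when thermals permit. The function name is ours.

```python
TDP_WATTS = 300  # from the spec list


def gflops_per_watt(peak_tflops: float, watts: float) -> float:
    """Theoretical GFLOPS per watt, assuming power draw equals `watts`."""
    return peak_tflops * 1000 / watts


for label, tflops in (("default clock", 5.6), ("GPU Boost", 8.74)):
    print(f"{label}: {gflops_per_watt(tflops, TDP_WATTS):.1f} GFLOPS/W single precision")
```

Treat the GPU Boost figure as an upper bound rather than a sustained value.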

The Tesla K80 is priced approximately 30% higher than the Tesla K40. With some applications achieving performance gains of up to 90% compared to K40, the K80 can offer good price-performance value. Keep in mind, though, that your application performance will vary. As you can see from the graph below, Caffe and CHROMA show substantial improvements with K80, while other applications like GROMACS and CP2K achieve more modest speedups over K40.

Tesla K80 vs. Tesla K40 comparison chart
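
The price-performance trade-off can be worked out directly. The sketch below takes the approximate 30% price premium and the "up to 90%" best-case gain quoted above, and adds a hypothetical 20% gain to show a more modest case. That 20% figure is an illustrative assumption, not a number read from the chart.

```python
PRICE_RATIO = 1.30  # K80 price relative to K40 (approximate, from the post)


def relative_perf_per_dollar(speedup: float, price_ratio: float = PRICE_RATIO) -> float:
    """Performance per dollar relative to K40 (values above 1.0 favor the K80)."""
    return speedup / price_ratio


scenarios = {
    "best case quoted (90% faster)": 1.90,
    "hypothetical modest gain (20% faster)": 1.20,  # illustrative assumption
    "break-even speedup": PRICE_RATIO,
}

for label, speedup in scenarios.items():
    ratio = relative_perf_per_dollar(speedup)
    print(f"{label}: {ratio:.2f}x performance per dollar vs. K40")
```

In other words, an application needs to run at least about 30% faster on the K80 for it to come out ahead on hardware cost alone. This ignores the density and power savings of housing two GPUs in one slot, which may shift the balance for your deployment.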

The Tesla K80 is available only as a passively cooled card, so it depends on chassis airflow and we can deliver it only in clusters and servers. A tower/4U system is also available and can serve as a workstation, provided your environment can tolerate the fan noise.

As always, the best way to determine what K80 can do for you is to try it yourself. We encourage you to sign up for a test drive and be one of the first to see how powerful this GPU can be with your code.

You can also contact us to discuss what platforms best suit your needs. From 1U servers with three Tesla K80s to 4U servers with a staggering eight Tesla K80s, we can help design the server or cluster that works best for you.


2 Responses to Introducing the NVIDIA Tesla K80 GPU Accelerator (Kepler GK210)

  1. Yimeng Zhang says:

    Hi, I’m just curious about the way NVIDIA people benchmark K40 and K80 on Caffe… To my understanding, Caffe can only utilize single GPU, and K80 is just two K40 with lower clock speed. Then how can K80 outperform K40 in Caffe? It would be greatly helpful if you know this, thanks!
