Tesla K80 Arrives + SC14 Highlights

Microway Unveils NVIDIA's Latest Tesla GPU: K80

Microway is proud to announce that NVIDIA's newest and fastest GPU, the Tesla K80, is now available in our fully-integrated servers and clusters. This new GPU provides unprecedented performance.

1U, 2U, and 4U chassis, in addition to non-quietized workstations, are all available and fully qualified by NVIDIA to support the K80. With room for up to eight Tesla K80s in a 4U footprint, it's possible to fit sixteen GPU chips (just under 40,000 CUDA cores) in a single server. For more information, be sure to read our Tesla K80 blog post or contact us with any questions. We also have K80s available on our benchmark cluster, so sign up to take a test drive!
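
As a rough check on the core count quoted above, here is a small Python sketch. The figure of 2,496 CUDA cores per GPU chip (two chips per K80 board) comes from NVIDIA's published K80 specifications rather than from this newsletter, and the function name is purely illustrative.

```python
# Assumption: each Tesla K80 board carries two GPU chips with 2,496 CUDA cores
# apiece (per NVIDIA's published K80 specifications; not stated in this newsletter).
CHIPS_PER_K80 = 2
CUDA_CORES_PER_CHIP = 2496


def server_gpu_totals(k80_boards):
    """Return (GPU chips, CUDA cores) for a server holding k80_boards Tesla K80s."""
    chips = k80_boards * CHIPS_PER_K80
    cores = chips * CUDA_CORES_PER_CHIP
    return chips, cores


chips, cores = server_gpu_totals(8)
print(f"{chips} GPU chips, {cores:,} CUDA cores")  # 16 GPU chips, 39,936 CUDA cores
```

Eight boards therefore work out to sixteen chips and 39,936 cores, which matches the "just under 40,000" figure.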

SC14 Highlights

Thanks to all of our customers and partners who visited Microway's booth at SC14 in New Orleans. For those who weren't able to attend, here is a quick recap of some exciting new technology from the conference.

NVLink Detailed: We've already covered NVIDIA's K80 announcement. NVIDIA also provided more information on its forthcoming GPU interconnect, NVLink, which is anticipated for 2016. With 5x the bandwidth of PCIe 3.0, NVLink enables much faster GPU-to-GPU and GPU-to-CPU connectivity than what is currently available. NVIDIA presents more information on its landing page.
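
To put the 5x figure in perspective, the sketch below estimates the theoretical bandwidth of a PCIe 3.0 x16 slot and multiplies it by five. It assumes the comparison is against a full x16 slot in one direction, which the announcement does not spell out, and real-world throughput on either interconnect will be lower than these theoretical numbers.

```python
# PCIe 3.0 signals at 8 GT/s per lane and uses 128b/130b encoding.
PCIE3_GT_PER_S = 8.0
ENCODING_EFFICIENCY = 128 / 130


def pcie3_bandwidth_gb_s(lanes=16):
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s for a given lane count."""
    return PCIE3_GT_PER_S * ENCODING_EFFICIENCY * lanes / 8  # 8 bits per byte


pcie_x16 = pcie3_bandwidth_gb_s(16)
# Assumption: the quoted "5x" is measured against a x16 slot in one direction.
nvlink_estimate = 5 * pcie_x16

print(f"PCIe 3.0 x16: {pcie_x16:.2f} GB/s per direction")  # 15.75 GB/s
print(f"5x that:      {nvlink_estimate:.2f} GB/s")         # 78.77 GB/s
```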

More Omni-Path Details: Intel unveiled additional information on its newest interconnect, Omni-Path (previously known as Omni Scale). Of particular note is the 48-port switch architecture, which means larger clusters will need fewer switches and will see lower latency. Expect availability in the second half of 2015.
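
A quick way to see why port count matters is to size a simple two-tier, non-blocking fat tree. The sketch below is a generic textbook calculation, not Intel's published topology guidance, and the 36-port comparison point is our assumption for illustration.

```python
def two_tier_fat_tree(radix):
    """Capacity of a full two-tier non-blocking fat tree built from radix-port switches.

    Each leaf switch uses half its ports for nodes and half for uplinks, so the
    fabric tops out at `radix` leaf switches and `radix // 2` spine switches.
    Returns (maximum nodes, total switches).
    """
    nodes = radix * (radix // 2)
    switches = radix + radix // 2
    return nodes, switches


# 36 ports is an assumed comparison point, not a figure from this newsletter.
for radix in (36, 48):
    nodes, switches = two_tier_fat_tree(radix)
    print(f"{radix}-port switches: up to {nodes} nodes using {switches} switches")
```

Under these assumptions a 1,000-node cluster fits within two switch tiers using 48-port switches (up to 1,152 nodes), while 36-port switches top out at 648 nodes and would need a third tier, which adds both switches and network hops.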

OpenPOWER Foundation Grows Substantially, Announces Massive System Deployments: The next major cluster deployments at two labs (LLNL and ORNL) will be powered by POWER CPUs, NVIDIA GPUs, and Mellanox EDR InfiniBand. These clusters look remarkable due to a very "flat" design: POWER CPUs with strong serial performance, connected to massively parallel NVIDIA GPUs via high-bandwidth NVLink and sharing a unified memory space, with a dual-rail (2x) 100Gb EDR InfiniBand interconnect between the nodes. The software builds upon the existing POWER ecosystem and includes LLVM for POWER CPUs and the new Clang frontend. These are likely to be Top 5 systems when deployed. Moreover, a very young ecosystem is emerging for OpenPOWER hardware infrastructure from non-IBM vendors. Look out for growth in 2015 and 2016.

New Knights Landing Specifics: In addition to Omni-Path specifics, Intel also released more information on the next-generation Xeon Phi architecture, Knights Landing. Details include performance numbers (3+ TFLOPS in a single package) and the amount of on-package memory (16GB initially). Like the regular Xeon processor, the socketed Xeon Phi will also have access to system memory.

Mellanox 100Gb/s InfiniBand: The final interconnect on this list, Mellanox's new 100Gb/s EDR technology, was on display as well. In addition to its low 90-nanosecond latency, EDR will be the first next-generation 100G interconnect available, with shipping expected in early 2015.

ARM for Compute Grows: News of long-promised 64-bit ARM platforms was more plentiful at SC14. For the first time, we saw motherboards on the show floor with 64-bit ARM CPUs. Products are moving beyond development kits in 2015. However, I/O (e.g., PCI-E connectivity) is still somewhat limited on these hosts, and overall throughput is still modest. Applied Micro was particularly active, and AMD is entering this space with Opteron A-Series Processors.

ArrayFire Now Open Source: ArrayFire's announcement that its software is now free and open source is great news for GPU programmers interested in this very useful library. They can now more easily take advantage of ArrayFire's many functions, including signal and image processing, statistics, and other mathematical calculations.

Software Defined Networking Continues to Grow: Specifically, we saw Quanta showcase their own SDN-ready network switches. Much as software defined storage has provided more options for the HPC industry, software defined networking provides administrators with additional control and the ability to further customize their networks. We've also noticed Intel 10GigE switches with OpenFlow deployed in blade chassis.

Altogether, it was an exciting show, although we're going to have to wait a little longer to get our hands on many of these new technologies.

Microway's QuietQuad WhisperStation

Microway's QuietQuad Xeon WhisperStation turned a lot of heads and was featured on the Intel Fellow Traveler Tour at SC14. Visitors to our booth didn't know whether to be more impressed by the 1TB of memory and 48 cores or the near-silence of the system. The QuietQuad is a unique product that's specially engineered by Microway for quiet operation. It offers levels of compute power, quantities of memory, and storage capacity normally available only in loud, rackmount servers.

The software running on our QuietQuad was an R demo showing how much faster the system performed than a standard R configuration when all 48 cores and Intel Software Tools were put to use. With speedups ranging from 30x to 200x, the QuietQuad's superior compute power and optimized software delivered truly remarkable performance. If you're interested in learning more about the QuietQuad or the WhisperStation R, please contact one of our experts.

AMBER and NAMD Benchmarking

Those of you who read our blog have noticed the flurry of GPU-accelerated molecular dynamics posts. With useful information on how to benchmark your code, as well as instructions on how to run and manage your jobs on a cluster, these posts are a helpful resource for molecular dynamics customers.

Our first post on the AMBER package of molecular simulation programs has step-by-step instructions on how to quickly and easily log in to a cluster, run your job, and compare the results. Our default benchmark suite demonstrates how Tesla K40s yield an improvement of up to 87x compared to the CPUs alone.

Similarly, our post on the molecular dynamics package NAMD is a great guide for those new to running their code on GPU clusters.

Tech Tidbits

We've collected a number of the latest HPC technical resources for your perusal:

Upcoming GPU Webinars

HPC Blog

Knowledge Center

Accelerate Your Research!

Take advantage of our expertise: we can recommend the most effective compute solutions, help you plan a painless installation, and deliver fully-tested/fully-integrated clusters, servers and WhisperStations.

Microway on Facebook

Eliot Eshelman at 508-732-5534
Brett Newman at 508-732-5542
Jan Smith at 508-732-5523
GSA Schedule GS-35F-0431N