NVIDIA held its GPU Technology Conference in San Jose. In addition to being an opportunity for us to meet up with our customers (and soon-to-be customers), it was an impressive display of GPU-infused technology for developers and computational scientists.
This year, the conference highlighted the revolution in deep learning and how it can affect nearly every aspect of computing. The keynote presentation by NVIDIA CEO Jen-Hsun Huang showcased recent developments in deep learning and how the latest GPUs advance this research. GTC 2015 was NVIDIA's largest ever, with over 35 sponsors, over 130 exhibitors, and more than 500 sessions in 30 categories.
Fortunately for those who missed the show, NVIDIA has posted most of these sessions, including the keynote presentations.
NVIDIA has launched a new NVIDIA Partner Network (NPN) program, and Microway has achieved Elite Solution Provider status. The Elite designation recognizes our achievement in the Accelerated Computing Competency, which includes technical expertise; customer service; and the ability to design, implement, and maintain best-in-class accelerated computing solutions from NVIDIA.
Tesla K80 was on display throughout the GTC show floor. We saw new Tesla K80 platforms and compatibility announcements from platform partners such as Supermicro, Asus, Tyan, Magma, and Cubix. The Microway booth featured our Octoputer 4U 8 GPU Server with 8 Tesla K80s: nearly 40,000 CUDA cores and 70 TFLOPS in 4U.
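Those aggregate figures are easy to verify from K80's published per-board specifications (4,992 CUDA cores across two GK210 GPUs and roughly 8.74 TFLOPS peak single-precision with GPU Boost; the per-board numbers are our addition, not stated above):

```python
# Aggregate compute of 8x Tesla K80, using assumed per-board specs:
# 4,992 CUDA cores and ~8.74 TFLOPS peak single-precision with boost.
CORES_PER_K80 = 4992
SP_TFLOPS_PER_K80 = 8.74
NUM_BOARDS = 8

total_cores = CORES_PER_K80 * NUM_BOARDS       # 39,936 -> "nearly 40,000"
total_tflops = SP_TFLOPS_PER_K80 * NUM_BOARDS  # ~69.9  -> "70 TFLOPS"

print(total_cores, round(total_tflops, 1))
```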
In case you missed them, the basics of Tesla K80 were covered at earlier events.
In his keynote, Jen-Hsun detailed the new GTX Titan X, billed as "The world's fastest GPU". Hinted at during GDC 2015, GTX Titan X was revealed to have: 8 billion transistors, 3,072 CUDA cores, 12GB of memory, and 7 TFLOPS of single-precision performance.
Double-precision floating-point performance is only 1/32 of single-precision: roughly 200 GFLOPS. While well suited for deep learning researchers and other single-precision-bound applications, GTX Titan X clearly isn't for everyone. The Titan X is priced at around $1,000, and according to Huang, "the card should pay for itself in a day" when deployed for machine learning algorithms.
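As a sanity check on that 1/32 ratio, peak throughput follows from cores × 2 FLOPs per clock (for fused multiply-add) × clock rate; the 1.075 GHz boost clock used here is our assumption:

```python
# Rough peak-throughput arithmetic for GTX Titan X (assumed boost
# clock of 1.075 GHz; 2 FLOPs per CUDA core per clock via FMA).
cores = 3072
boost_clock_ghz = 1.075

sp_tflops = cores * 2 * boost_clock_ghz / 1000  # ~6.6, marketed as 7 TFLOPS
dp_gflops = sp_tflops * 1000 / 32               # ~206, i.e. roughly 200 GFLOPS

print(round(sp_tflops, 2), round(dp_gflops))
```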
Supermicro® announced a new 1U 4x GPU SuperServer® platform at the conference, further pushing forward GPU compute density. The new SYS-1028GQ-TRT supports up to 4 NVIDIA Tesla K80 dual-GPU accelerators. This system will become our new NumberSmasher 1U Tesla GPU Server (4 GPUs).
The server also supports dual Intel Xeon E5-2600 v3 processors, up to 1TB DDR4 memory in 16 DIMM slots, 2x 2.5" hot-swap SATA drives, and includes dual 10GBase-T ports. It is powered by intelligent, cold redundant 2000W (1+1) Titanium Level high-efficiency power supplies. All of this in a 1U package!
To give deep learning researchers and developers an opportunity to put the Titan X through its paces, Huang announced the DIGITS (Deep Learning GPU Training System) DevBox platform. This direct-from-NVIDIA $15k box includes four Titan X boards along with NVIDIA's DIGITS multi-GPU training software and interface.
Jen-Hsun stressed that this is a box for developers, not end-users: NVIDIA doesn't expect to sell many, but rather hopes they will bootstrap further use of DIGITS. He described the DIGITS DevBox as providing "as much compute performance as you can get out of a single electrical outlet".
We're making our own WhisperStation for Deep Learning! Heavily influenced by the DIGITS DevBox, the WhisperStation–Deep Learning will also feature four GTX Titan X GPUs and DIGITS. Look out for this system, at an affordable price, in the near future.
Looking ahead, CEO Jen-Hsun Huang projected the future of NVIDIA's GPU evolution. Now that we've seen the Tesla, Fermi, Kepler, and Maxwell architectures, the keynote expounded upon Pascal, expected sometime in 2016, and Volta beyond that.
Pascal will be the first NVIDIA GPU with 3D memory and NVLink. It will also include mixed-precision mode, and it was claimed that it could provide up to 4X the throughput in mixed-precision workloads.
While deep learning programs don't need high levels of floating-point precision, they do tend to be limited by memory bandwidth. These improvements, combined with the NVLink GPU interconnect, might offer anywhere from five to ten times the performance of Maxwell GPUs on deep-learning tasks.
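The bandwidth argument is visible from storage widths alone: a half-precision value is half the size of a single-precision one, so a bandwidth-bound kernel can move twice as many values per second. A minimal illustration using Python's struct module (format codes 'e', 'f', and 'd' for IEEE 754 half, single, and double precision):

```python
import struct

# Byte widths of IEEE 754 half, single, and double precision.
fp16 = struct.calcsize('e')  # 2 bytes
fp32 = struct.calcsize('f')  # 4 bytes
fp64 = struct.calcsize('d')  # 8 bytes

# For a memory-bandwidth-bound workload, elements moved per second
# scale inversely with element size: FP16 fits 2x the values of FP32.
print(fp32 // fp16)  # 2
```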
Another special guest was Elon Musk of Tesla Motors and SpaceX. Jen-Hsun and Elon discussed the advent of self-driving cars and the technology behind them. NVIDIA's DRIVE PX self-driving car platform also debuted: powered by two NVIDIA Tegra X1 processors and including robust sensor processing and other functionality, it will be available to vehicle developers in May for $10k. Tegra X1-based platforms for other embedded applications, at entry-level price points, are likely to follow.
Jen-Hsun and Elon also discussed the pros and cons of AI and how well it could substitute for a human driver's judgment. Current technology, while impressive, seems to be limited to around 5-10 miles per hour. Government regulations notwithstanding, it appears that deep learning will be the key to making autonomous vehicles a reality in our future.
As a Summit-within-a-Conference, the OpenPOWER Foundation hosted its first annual OpenPOWER Summit at GTC. During the three-day event there was also a separate exhibitor pavilion for OpenPOWER members to demonstrate their latest advancements. If you are not familiar with this initiative, it centers on the ongoing evolution of IBM's POWER-based platform. Tyan will be Microway's first platform partner to deliver a commercially available third-party OpenPOWER server, in the second quarter. It's a 2U, single-socket system. This and other servers are initially aimed at large-scale customers such as internet service and cloud providers.
IBM says the OpenPOWER Foundation now has 113 members, up from 30 just nine months ago. Looking at the long term, IBM is also licensing the Power architecture itself, in a model similar to that used by ARM, allowing other companies to build derivatives of the Power chip.
The ecosystem is still young, but it rides on many years of IBM CPU, hardware design, and software innovation. Server selection is still limited, and Ubuntu 14.04.2 LTS and 15.04 are the supported OSes for now.
We're looking at bringing OpenPOWER based systems into the Microway product line. If you think such systems are a match to your needs, we'd love to hear from you:
Love knowing the low-level pieces of a CPU architecture? These are for you:
At Microway we design a lot of GPU computing systems. One of the strengths of GPU compute is the flexibility of the PCI-Express bus. PCI-Express was first introduced over ten years ago, but it is still often misunderstood. Assuming the server has appropriate power and thermals, it enables us to attach GPUs with no special interface modifications.
We address some of the most common misconceptions regarding the standard, involving such areas as CPU architecture, device-to-device transfers, and system design. Intrigued by PCI-E?
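One frequently misunderstood detail is raw link bandwidth. The sketch below computes per-direction bandwidth for an x16 slot from each generation's transfer rate and line encoding (the rates and encodings are the published PCI-Express figures, not drawn from this article):

```python
# Per-direction PCI-Express bandwidth for an x16 link.
# Gen2: 5 GT/s per lane with 8b/10b encoding (80% efficiency).
# Gen3: 8 GT/s per lane with 128b/130b encoding (~98.5% efficiency).
def x16_bandwidth_gbs(gt_per_s, encoding_efficiency, lanes=16):
    """Usable GB/s per direction: transfers/s x payload fraction / 8 bits per byte."""
    return gt_per_s * encoding_efficiency * lanes / 8

gen2 = x16_bandwidth_gbs(5, 8 / 10)     # 8.0 GB/s
gen3 = x16_bandwidth_gbs(8, 128 / 130)  # ~15.75 GB/s

print(round(gen2, 2), round(gen3, 2))
```

The jump from 8b/10b to 128b/130b encoding is why Gen3 nearly doubles Gen2's usable bandwidth despite the raw rate rising only from 5 to 8 GT/s.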
Eliot Eshelman at 508-732-5534
Brett Newman at 508-732-5542
Ed Hinkel at 508-732-5523
GSA Schedule GS-35F-0431N