Celebrate Summer with New Coprocessor Options and Improved GPU Tools
Summer has arrived, bringing new technologies that improve your productivity. We're introducing new solutions built upon the Intel Xeon Phi coprocessor (based on Intel's MIC architecture). Our blog features a series of posts on getting the best performance from your coprocessor or GPU. NVIDIA is preparing CUDA 5.5, an improved version of CUDA. Look out for a new Microway.com in July.
Intel Fills out the Xeon Phi Product Line
Alongside the announcement that Xeon Phi powers Tianhe-2, the most powerful supercomputer in the world, Intel has released the next models in the Phi product line. These provide a richer set of options both for those looking to get started with coprocessors (the actively-cooled Xeon Phi 3120A) and for those seeking the highest performance (Xeon Phi 7120P).
- Xeon Phi 3120P
- Xeon Phi 3120A
- Xeon Phi 5110P
- Xeon Phi 7120P
Xeon Phi coprocessors are now available in Microway's full line of products, including quiet WhisperStations and rackmount NumberSmasher servers. All are available with parallel compilers/analyzers and accelerated math libraries preconfigured.
New or Updated Xeon Phi Programming and Educational Resources
Microway HPC Tech Tip
Parallel Code: Maximizing Your Performance Potential
Abstract: No matter the purpose of your application, one thing is certain: you want to get the most bang for your buck. Research papers are published and presented claiming tremendous speed increases from running algorithms on GPUs (e.g., NVIDIA Tesla), in a cluster, or on a hardware accelerator (such as the Xeon Phi or Cell BE). These architectures allow for massively parallel execution of code that, done properly, can yield lofty performance gains.
Unlike most aspects of programming, the actual writing of the programs is (relatively) simple. Most hardware accelerators support (or are very similar to) C-based programming languages, which makes getting started with parallel coding entirely achievable. While mastering the development of massively parallel code is another matter, a basic understanding of the principles behind efficient parallel code can deliver substantial performance increases over traditional, serial execution of the same algorithms.
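The serial-versus-parallel contrast above can be sketched in plain Python. This is an illustrative example only (the function names are ours, not from the article): it uses the standard library's multiprocessing pool to map the same CPU-bound work across cores, and accelerator APIs such as CUDA apply the same map-the-work-out idea at far larger scale.

```python
import math
from multiprocessing import Pool

def work(n):
    """A CPU-bound kernel: sum of square roots from 0 to n-1."""
    return sum(math.sqrt(i) for i in range(n))

def run_serial(jobs):
    # Traditional approach: each job executes one after another on a single core.
    return [work(n) for n in jobs]

def run_parallel(jobs, workers=4):
    # Parallel approach: the same jobs are mapped across a pool of worker
    # processes, each able to run on its own core.
    with Pool(processes=workers) as pool:
        return pool.map(work, jobs)

if __name__ == "__main__":
    jobs = [200_000] * 8
    # Both paths compute identical results; the parallel path simply
    # distributes the work across processes.
    assert run_serial(jobs) == run_parallel(jobs)
```

On a multi-core machine the parallel path finishes in roughly 1/N of the serial time for N workers, at the cost of process-startup and data-movement overhead; the same trade-off governs offloading work to a coprocessor or GPU.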
New Partnerships with Panasas and Revolution Analytics
We are excited to announce Microway's partnership with Panasas for high-performance parallel storage. Panasas ActiveStor is the world's fastest parallel storage system, bringing plug-and-play simplicity to large-scale storage deployments. ActiveStor offers the performance of parallel storage without the headaches commonly associated with such systems.
Microway has also entered into a partnership with Revolution Analytics. Researchers throughout the world use R, but many require more performance than is available from the open-source package. Revolution R provides significant performance and scalability features without sacrificing compatibility with the community-developed packages researchers rely upon.
Further details will be announced soon.
NVIDIA Announces CUDA 5.5 Release-Candidate
NVIDIA has announced that CUDA 5.5 is ready for testing, with a production release to follow. Features of note include:
Optimized For MPI Applications
- Enhanced Hyper-Q support for multiple MPI processes via the new Multi-Process Service (MPS) on Linux systems
- MPI Workload Prioritization enabled by CUDA stream prioritization
- Multi-process MPI debugging and profiling
Guided Performance Analysis
- Step-by-step guidance helps you identify performance bottlenecks and apply optimizations in the NVIDIA Visual Profiler and Nsight Eclipse Edition
Support For ARM Platforms
- Native compilation for easy application porting
- Fast cross-compilation on x86 for large applications
Fast CUDA-Python is here with NumbaPro
Python has a massive user base and a robust community. If you've been waiting for fast CUDA-Python support, Continuum Analytics is delivering NumbaPro in partnership with NVIDIA. NumbaPro is part of the Anaconda Accelerate library and features:
- GPU targeting with single line vectorization commands
- Robust just-in-time compiler for CUDA GPUs, aimed at more complex codes
- Support for the new high-level Compute Unit (CU) abstraction (Experimental)
- Optional CUDA-based API for custom management of threads and blocks
New CUDA Handbook Available
This huge 500+ page volume for CUDA GPU programmers, written by one of the original architects of CUDA, includes critical information for improving the performance of your CUDA code and covers advanced techniques:
- Detailed discussion of programming for the Kepler GPU architecture and CUDA 5 features
- Discussion of host hardware architecture (CPU), PCI-E structure, NUMA/SMP and their effects on performance
- Detailed tips for programming multiple GPUs
- Microbenchmarks for memory bandwidth and performance
- New demo code optimized for Kepler GPU architecture
- Information on pre-ported libraries
Coming Soon: A New Microway.com!
Be on the lookout for a new microway.com in the coming weeks. We've been working hard on this for over 4 months, and we hope it will provide clearer navigation, clean new aesthetics, and a brand-new Knowledge Base packed with the latest technical information.