Thanks to all our visitors at NEPCON/OEM NE and EHPC in Boston!

Our next event will be High Performance on Wall Street, Sept. 22, 2008.

MICROWAY
Microway to Integrate NVIDIA
Double Precision Tesla
HPC Times   June-July 2008

With the world's first teraflop many-core processor, NVIDIA Tesla™-based computing systems deliver energy-efficient, powerful parallel computing. With 240 processor cores and a standard C compiler that simplifies application development, Tesla scales to solve your most important challenges. Tesla DP features include:


  • IEEE 754 Floating Point (single and double precision) - meets the precision requirements of your most demanding applications with 64-bit ALUs
  • 1 TFLOP Single and 100 GFLOP Double Precision Floating Point throughput
  • 4 GB High-Speed Memory per processor (512-bit wide)
  • 102 GB/s transfer speed to local memory
  • System Monitoring Features
  • Remote capabilities and status lights on the front and rear ensure easy administration
  • Asynchronous Transfer - Turbo charges system performance by transferring data while the computing cores are busy
  • High-Speed, PCI-Express x16 2.0 Data Transfer

NVIDIA Tesla C1060
C1060 Computing Processor - 1 Tesla GPU and 4 GB Memory



NVIDIA Tesla S1070
S1070 1U Computing System - 4 Tesla GPUs and 16 GB Memory in a 1U chassis. Connects to host via PCI-E x16 2.0 or x8.


Pack more processors into 1U
with double density nodes!

Microway is offering a new platform for our customers:
Two dual-socket nodes in 1U - saves space, energy and cost.

Microway 1U Twin

Analysis of the TOP500 Cluster list reveals that clusters with InfiniBand interconnects have 60% higher performance/watt than Gigabit Ethernet clusters. Microway's Navion-2X and NumberSmasher-2X feature low-cost, integrated ConnectX DDR InfiniBand. Ask about two AMD Opteron™ or Intel Xeon® nodes in 1U when planning your next cluster!

Large Shared Memory Systems

Over the last decade, distributed-memory parallel programming has become increasingly popular as the floating point throughput of PCs has increased dramatically. Providing coherent access to large-scale shared memory has proven to be a software challenge, so users have faced a choice: rewrite their codes using message passing (MPI), managing coherency with a shared-nothing architecture, or run their codes on large, expensive symmetric multiprocessing (SMP) machines with a single large memory space.

For large-memory applications, when all of the data a problem must access can be held coherently in memory, the shared-memory SMP model provides the best performance.

Now, Microway is partnering with ScaleMP to offer modular, scalable large-memory SMP machines. Your SMP applications will be able to use up to 32 processors (128 cores) and 1 TB of shared memory connected via a low-latency InfiniBand fabric.

  • Large memory resource, which enables larger workloads that cannot be run otherwise, and offers an alternative to costly, proprietary RISC systems
  • Shared memory coupled with high-core count, which allows threaded applications to scale
  • Shared I/O space presents all I/O devices as directly connected
  • Scale processor, memory and I/O resources as your needs grow
  • Ease of use with a single server and operating system to manage

If you have a problem that uses multiple threads or needs access to a large single addressable memory image, you owe it to yourself to evaluate a Microway Large Memory Machine enabled by ScaleMP.

How good is your MPI?
Try MPI Link-Checker™ and See

MPI Link-Checker

Is your interconnect performing optimally?
Get a free report on your cluster's MPI performance

Please call:
Bruce Schulman at 508-732-5520
Eliot Eshelman at 508-732-5534
for a quote on your next project.
Visit us online at www.microway.com