HPC Times - January 2003
Microway's Plans for 2003

In 2002, Microway's team met several challenging technical and business milestones including:
Products and services Microway will offer in 2003 include:

1) 64-bit solutions for HPC, based on Intel® Itanium® 2 and AMD Opteron™
2) Storage Solutions
3) InfiniBand Connectivity
4) Integration and Consulting Services
5) Integrated Solutions powered by Platform Computing products
6) Windows and Linux Based MPI Professional Products

Microway will be integrating MPI/Pro from MPI Software Technology. MPI/Pro is a commercial implementation of MPI and offers a significant improvement in performance over public domain versions of MPI. MPI/Pro is currently available for Linux and Windows® based clusters running Myrinet. Support for additional high-bandwidth, low-latency interconnects is expected soon. Watch this space for more information.

New Life Sciences Customers

ALTANA Research Institute, the new technology-driven US research center of ALTANA Pharma, has purchased an 80-node dual Intel® Xeon™ cluster that will run BLAST for genomics and proteomics research. Microway's custom integration includes Platform LSF® to optimize the use of resources on the cluster. Ge Zhang of ALTANA commented: “We selected Microway because of their reputation for integrating high quality hardware and their ability to deliver the LSF software at an attractive price.”

GlaxoSmithKline, a research-based pharmaceutical company, purchased 25 dual Xeon 2.8 GHz nodes. They will run in-house applications for numerical simulation and optimization of mathematical models for biological systems. Valeriu Damian of GlaxoSmithKline commented: “It was a pleasure working with Microway. They offer competitive prices and a wide selection of configuration options. We wanted a dedicated 'computational appliance' i.e., a fast parallel machine with lots of memory and high speed interconnection for CPU bound applications with a competitive price. Microway was one of very few companies that quoted us the system we needed without having to pay a premium for features that we did not care about.”
BPROC/Linux BIOS Research at LANL

One of the most interesting projects highlighted at SC2002 was the development work being done at LANL to simplify the management of clusters. This is basically a throwback to the past. This article discusses the overall concept of BPROC and then discusses BIOSes in detail. It will be followed by another article on BPROC next month.

Alien File Servers

At SC2002, the folks at LANL (Ron Minnich and crew) demonstrated a miniature machine composed of several single board computers, which would boot and start up an application in several seconds. Their demonstration used a combination of tools, including their Linux BIOS and BPROC (Beowulf Distributed Process Space). BPROC is basically a throwback to the AFS (Alien File Server) of Inmos, while the Linux BIOS is a very fast booting BIOS. For those of you unfamiliar with BIOSes, I’ll discuss this bit of code next.

BIOS stands for Basic I/O System; it is very basic. The first one was written by PC pioneer Gary Kildall in the mid-70’s for Intel 8080 development platforms. It was nothing more than another type of application called a monitor program. When your ancient Z80 or 8080 8-bit program crashed, it put you back into the monitor, from which you could probe memory to figure out what had happened.

Kildall's BIOS was an afterthought. In 1975, he wrote a compiler for Intel called PL/M, and as part of the project, he added a monitor, so you could run the code produced by the compiler. Intel customers liked the monitor code, and Intel included it with its early development systems. Eventually, Kildall discovered other uses for his monitor, and he marketed it as CP/M (which stands for Control Program and Monitor). The I/O routines in CP/M were extracted from it and built into a BIOS, so that it would be easy to port.

Eventually IBM came along and licensed a clone of CP/M that was written by Seattle Computer Products and acquired by Microsoft; IBM shipped it as PC-DOS. The BIOS came along as part of the deal. IBM published the sources to their BIOS as part of their new PC business model, which included making the IBM PC an open architecture machine. IBM's BIOS was cloned by companies like Compaq, Phoenix Technologies and AMI (to name a few).

Every time you turn on your computer, the first thing it does is run the BIOS initialization and test routines, which turn on the devices on your motherboard and initialize your system's memory and buses. (If you don’t initialize memory at least once, the parity and ECC systems will not be able to function.) At that point, if you run MS-DOS, your machine will use the BIOS routines for any I/O it does. If you run Linux or NT, the BIOS code is used to read a bootstrap loader into memory and then jump to it. Once the bootstrap loader is up and running, it has all of the I/O resources it needs to do things like load the image of the Linux kernel. Next, your machine basically wipes clean the image of the BIOS and starts to run using the device driver routines built into Linux.

You might ask, why doesn’t Linux just use the BIOS routines? It turns out that at the time IBM wrote its BIOS, the peripheral chips used in the PC were buggy and IBM was forced to avoid using interrupts. Without interrupts it's difficult to manage more than one task at a time. This meant that the original PCs were single threaded, which was also the case for the BIOS and MS-DOS. Linux is a multi-tasking OS, in which the BIOS gets replaced by the kernel and its device drivers.

The bottom line is that you don't need a BIOS in a PC, except for the first few seconds, and if you replace it with code that you have the sources for, you can do all sorts of interesting things. What Ron Minnich of LANL did was to use his BIOS (which is basically a few lines of assembly language along with routines extracted from the Linux kernel) to make it possible to boot a PC in a few seconds, instead of the minutes that it typically takes. With this accomplished, he was now ready to run his computer farm, using techniques similar to those employed by Inmos that made it possible to boot a Transputer farm in a few seconds.
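For readers who prefer to see the handoff laid out step by step, the boot sequence described in this article can be sketched as a small table of stages. This is only an illustrative model in Python, not code from the Linux BIOS project: the stage names, the `IO_PROVIDER` table, and both helper functions are made up for this sketch, and the assumption that the bootstrap loader still relies on BIOS I/O while it loads the kernel is our reading of the description above.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names; the values paraphrase the article's description.
    FIRMWARE_INIT = "BIOS tests devices and initializes memory and buses"
    BOOTSTRAP_LOAD = "BIOS reads the bootstrap loader into memory and jumps to it"
    KERNEL_LOAD = "bootstrap loader reads the kernel image into memory"
    KERNEL_RUNNING = "kernel discards the BIOS and uses its own device drivers"


# Which component services I/O requests during each stage (an assumption
# drawn from the article's description, not from any firmware source code).
IO_PROVIDER = {
    Stage.FIRMWARE_INIT: "BIOS",
    Stage.BOOTSTRAP_LOAD: "BIOS",
    Stage.KERNEL_LOAD: "BIOS",
    Stage.KERNEL_RUNNING: "kernel drivers",
}


def boot_trace():
    """Return the stages in boot order, each paired with its I/O provider."""
    return [(stage, IO_PROVIDER[stage]) for stage in Stage]


def bios_needed_after(stage):
    """Return True if any stage after `stage` still relies on BIOS I/O."""
    stages = list(Stage)
    later = stages[stages.index(stage) + 1:]
    return any(IO_PROVIDER[s] == "BIOS" for s in later)


if __name__ == "__main__":
    for stage, provider in boot_trace():
        print(f"{stage.name:15} I/O via {provider:15} {stage.value}")
```

In this model nothing after the kernel-load stage depends on the BIOS, which is the point of the article: the firmware only has to get the kernel into memory, so a small replacement that does just that can be very fast.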
Next time we'll talk about BPROC, the code LANL used to replace the Inmos AFS.
Commercial Compiler Options for Clusters

Among the features available within most commercial compilers are OpenMP support (parallel threads), debuggers, Fortran 95 support, vectorization, GNU C library compatibility, and processor-specific optimizations (such as MMX, SSE, SSE2, 3DNow). Many have cluster packaging available for additional tools related to Beowulf clusters. Pricing varies.

Intel
Portland Group Inc.
Absoft
Lahey Computer Systems, Inc.
Compaq
Customizing Your Microway Cluster – How It Works

Message passing:
Batch Queuing, Scheduling, Load Balancing:
Cluster Management and Administration:
Parallel Virtual File System (PVFS): parlweb.parl.clemson.edu/pvfs/

Upon receipt of a cluster order, Microway provides you with a “Cluster Questionnaire.” Completing the questionnaire provides Microway with information about your specific needs and requests for the software build.
Commercial vs. Open Source Applications – Tradeoffs

The argument for Open PBS centers largely around the acquisition cost: it is zero. Open PBS can be customized by end users familiar with scripting languages. If the user does not have this expertise in-house, experienced developers and administrators are available in the market. However, once the user contracts services, costs have been incurred. Even if in-house expertise exists, there is a “soft cost” associated with that staff time. There are other considerations relating to growth and scale that might make the Open PBS route more expensive.

Another issue is ongoing maintenance. Without the proper resources on-site, configuration and maintenance are difficult to accomplish. In critical cluster applications, predictable resolution of problems is essential. Companies like Platform stake their existence on providing reliable products with predictable, effective support. Support in the public domain is unpredictable. Inordinately long “down time” translates into higher opportunity costs, that is, the lost value associated with a scientist unable to work.

Platform LSF 5 provides scalability at the single cluster level. Platform also offers supported add-in products for special user needs. Just a few of them provide features such as:

LSF Multicluster or the Platform Globus Toolkits are the Platform products that provide scaling from a managed single cluster to a grid environment. These point offerings are all supported by a single vendor, again ensuring that a migration is predictable from cost, technology and timing perspectives. If the trend at the installation is to build larger clusters or a pervasive computing environment that spans multiple clusters, then using Open PBS at the outset may complicate future growth or administrative requirements.

Some features uniquely available in Platform LSF are job prioritization and the ability to add fault tolerance and recovery capabilities to the cluster. Numerous mechanical design and life sciences applications have hooks that allow integration into a cluster with LSF.

If the user's anticipated needs are fairly stable, experienced technical personnel are available at little or no cost (opportunity cost or actual out-of-pocket expenses) and the desired functionality is basic and small in scale, then an Open PBS solution may make sense. However, if the user anticipates upgrading the system and needs to minimize the need for technical expertise, purchasing commercial software can result in the lowest total cost of ownership compared to the lifetime incremental cost of additional service and support.
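The tradeoff described above comes down to simple arithmetic: an up-front acquisition cost plus recurring costs for support, in-house administration time, and downtime. The Python sketch below is one way to frame that comparison, not a pricing model. The `SchedulerOption` fields, the function name, and every number in the usage example are hypothetical placeholders; substitute your own quotes and labor rates.

```python
from dataclasses import dataclass


@dataclass
class SchedulerOption:
    """Cost inputs for one scheduling product (all fields are illustrative)."""
    name: str
    license_cost: float             # up-front acquisition cost
    annual_support_cost: float      # support contract or contracted services per year
    admin_hours_per_year: float     # in-house configuration and maintenance time
    downtime_hours_per_year: float  # hours during which scientists cannot work


def total_cost_of_ownership(option, years, admin_hourly_rate, downtime_hourly_cost):
    """Acquisition cost plus recurring hard and soft costs over `years`."""
    recurring_per_year = (
        option.annual_support_cost
        + option.admin_hours_per_year * admin_hourly_rate        # "soft cost" of staff time
        + option.downtime_hours_per_year * downtime_hourly_cost  # opportunity cost
    )
    return option.license_cost + years * recurring_per_year


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the formula; they are not
    # prices or measurements for any real product.
    open_source = SchedulerOption("open source scheduler", 0.0, 0.0, 200.0, 40.0)
    commercial = SchedulerOption("commercial scheduler", 20000.0, 4000.0, 50.0, 10.0)

    for option in (open_source, commercial):
        cost = total_cost_of_ownership(
            option, years=3, admin_hourly_rate=60.0, downtime_hourly_cost=150.0
        )
        print(f"{option.name}: {cost:,.0f} over 3 years")
```

Setting `admin_hourly_rate` and `downtime_hourly_cost` close to zero models the case where experienced staff are available at little or no cost; the zero acquisition cost then dominates, which matches the situation in which the open source route makes sense.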
What do you predict will be the most promising new technology in the HPC industry in 2003? Share your thoughts with us by emailing hpc@microway.com and we’ll post some of the answers in next month’s newsletter. Creating a community of questions and answers.