MMDS: Microway MPI Diagnostic Suite
Microway's MPI Diagnostic Suite consists of two applications: MPI
Link-Checker and MPI Fast-Check. The two applications
perform complementary functions. MPI Fast-Check performs a fast one-time
test of all nodes in your cluster and reports any nodes that are clearly
underperforming. MPI Link-Checker performs a much more extensive set of
latency, bandwidth, and data integrity tests between all pairs of nodes.
MPI Link-Checker measures the bandwidth and latency between all pairs of nodes in a cluster, then summarizes
the data graphically so you can easily spot problem nodes. The tool detects issues with processor caching,
motherboards, PCI buses, BIOSes, riser cards, and PCI interconnects. It can even detect intermittent cables
and crossbar switches!
MPI Link-Checker initiates an MPI
application on each node and then uses it to collect data. It then displays a pair of screen plots
that show the latency and bandwidth between all pairs of nodes in the system. It has a number of features
that make it easy to discover interesting data about a cluster. For example, if a particular node has some
problem with its PCI bus that is impacting the performance of that node, the problem might manifest itself
as a small reduction in bandwidth or increase in latency. Problems this subtle are very difficult to spot
without a tool like this one.
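As a rough illustration of what "latency" and "bandwidth" mean for a single pair of nodes, the sketch below
derives both from round-trip timings of a ping-pong exchange. This is a common benchmarking convention, not a
description of MPI Link-Checker's internals. The function name, parameters, and timing values are all
hypothetical.

```python
def latency_and_bandwidth(small_rtt_s, large_rtt_s, large_bytes):
    """Estimate one-way latency and bandwidth from ping-pong round-trip times.

    small_rtt_s: round-trip time of a tiny message (seconds), dominated by latency.
    large_rtt_s: round-trip time of a large message (seconds), dominated by transfer time.
    large_bytes: size of the large message in bytes.
    All names here are illustrative assumptions.
    """
    latency_s = small_rtt_s / 2            # half the round trip approximates one-way latency
    one_way_s = large_rtt_s / 2            # one-way transfer time of the large message
    bandwidth_bytes_per_s = large_bytes / one_way_s
    return latency_s, bandwidth_bytes_per_s


# Made-up timings for one node pair: 4 microseconds and 2 milliseconds round trip.
lat, bw = latency_and_bandwidth(small_rtt_s=4e-6, large_rtt_s=2e-3, large_bytes=1_000_000)
print(f"latency ~ {lat * 1e6:.1f} us, bandwidth ~ {bw / 1e6:.0f} MB/s")
```

A node with a misbehaving PCI bus would show a slightly larger latency or slightly smaller bandwidth on every
pair it takes part in, which is the kind of small shift described above.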
MPI Link-Checker makes it possible to identify an underperforming node by using different
statistical methods to accumulate data. The problem node shows up as a large dark cross on the image.
If there is more than one such node, a number of crosses will appear. The tool picks up gross problems,
such as communications links that are not working or that are experiencing data errors, and subtle problems,
such as links that appear to be working but are not working reliably. At the end of the day, the tool doesn't
tell you precisely what the cause of the problem is, just where the problem is.
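To see why a single bad node appears as a cross, consider a pairwise bandwidth matrix: every link that touches
the bad node is degraded, so both its row and its column stand out. The sketch below flags such nodes with a
simple median comparison. It is a minimal illustration under assumed data, not the statistical method the tool
actually uses. The function name and the 0.8 threshold are hypothetical.

```python
from statistics import median


def flag_slow_nodes(bandwidth, threshold=0.8):
    """Return indices of nodes whose typical link bandwidth is well below the cluster's.

    bandwidth[i][j] is the measured bandwidth from node i to node j; the diagonal
    is ignored. threshold is an assumed cutoff relative to the cluster-wide median.
    """
    n = len(bandwidth)
    per_node = []
    for i in range(n):
        # Collect every link that touches node i: its row and its column.
        links = [bandwidth[i][j] for j in range(n) if j != i]
        links += [bandwidth[j][i] for j in range(n) if j != i]
        per_node.append(median(links))
    cluster_typical = median(per_node)
    return [i for i, value in enumerate(per_node) if value < threshold * cluster_typical]


# Synthetic 6-node cluster: all links run at 100 units except those touching node 3.
n = 6
matrix = [[0.0 if i == j else (60.0 if 3 in (i, j) else 100.0) for j in range(n)]
          for i in range(n)]
slow = flag_slow_nodes(matrix)
print("suspect nodes:", slow)
```

Using the median rather than the mean keeps one degraded link from dragging down the score of an otherwise
healthy node.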
In Microway's production environment, MPI Link-Checker picks up issues about 50% of the time on clusters
that have already been burned in and validated. Many of these issues are simple ones that do not show up
when simply running MPI validation suites. For example, all that it takes to slow down a PCI bus
is for a BIOS to set up a particular slot incorrectly or for a manufacturer to ship a riser that works
at 66 MHz but not at 133 MHz. These minor mishaps occur all the time and are very difficult
to spot. Often all it takes to slow down an entire cluster running an MPI "fork and join" style application
is for a single node to run slower than the rest. This is also the reason why veteran cluster users don't
add new nodes to old clusters: the resulting cluster runs only as fast as its slowest nodes.
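The effect of one slow node on a fork-and-join application can be shown with a few lines of arithmetic: each
step ends only when every node has finished, so the step time is the maximum of the per-node times. The
numbers below are invented for illustration.

```python
def fork_join_step_time(node_times):
    """Wall-clock time of one fork-and-join step: the join waits for the slowest node."""
    return max(node_times)


uniform = [1.0] * 16             # 16 nodes, each finishing its share in 1.0 s
one_slow = [1.0] * 15 + [1.5]    # same cluster, but one node takes 1.5 s

slowdown = fork_join_step_time(one_slow) / fork_join_step_time(uniform)
print(f"slowdown caused by a single slow node: {slowdown:.2f}x")
```

In this made-up case, fifteen healthy nodes sit idle for a third of every step while waiting on the
sixteenth.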
MPI Link-Checker can also be used to gather interesting statistics about your cluster that will help you
to design algorithms that run efficiently on it. Self-connected regions of a cluster show up on the screen
in a uniform color. Regions that are disconnected from other regions by connectivity barriers are also easy
to spot. These barriers arise because, in many clusters, nearby nodes are connected by single
or double hops through the switch. As a rule, the latency between two nodes increases with the number of hops
through which they are connected. Regions connected by a single hop will show up lighter on the screen than
regions connected by two hops. While the tool colors the screen with different colored boxes to represent
the observed latencies and bandwidths, it can also be set up to display the observed numbers on the graphic
and to show detailed data for any connection on the screen below the graphic. MPI Link-Checker does this
in real time. On large clusters, it takes a while with the current version to collect the n-squared pieces of
information, and the representation loses detail. However, our tool makes it possible to drill down into
any section of a cluster and recover the information for specific nodes or specific regions.
Fast-Check provides a simple text-based check of the cluster. When running on
very large clusters, the standard Link-Checker may take minutes per pass.
Fast-Check makes fewer total connections yet still tests every node. In as
little as 5 to 10 seconds, you can get bandwidth and latency information as
well as alerts about any problematic nodes.
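The difference in cost between the two approaches can be sketched by counting connections. An all-pairs test
grows with the square of the node count, while a pattern that only needs to touch every node can grow
linearly. The text does not say how Fast-Check chooses its connections, so the ring pattern below is purely
an assumed example of "fewer connections, every node tested". Both function names are hypothetical.

```python
def all_pairs(n):
    """Every unordered pair of nodes, as an exhaustive pairwise test would visit."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def ring_pairs(n):
    """Assumed illustration only: each node exchanges with its next neighbor in a ring."""
    return [(i, (i + 1) % n) for i in range(n)]


n = 64
exhaustive = all_pairs(n)
ring = ring_pairs(n)
covered = {node for pair in ring for node in pair}

print(f"all-pairs connections: {len(exhaustive)}")
print(f"ring connections:      {len(ring)}")
print(f"every node tested:     {covered == set(range(n))}")
```

A linear pattern like this cannot localize a fault as precisely as the full pairwise picture, which is
consistent with the two tools being described as complementary.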