|
OptimalGrid is a project in the Healthcare Information Management
department at the IBM Almaden Research Center. It is designed to solve
the next generation of large-scale parallel problems on a large number
of network-attached, heterogeneous compute nodes (i.e., "The Grid").
The first generation of large parallel problems consisted of
unconnected problems, sometimes referred to as "independently parallel"
or even "embarrassingly parallel" problems because they are relatively
straightforward to solve in a grid environment.
Unconnected problems, such as SETI@Home, can be split up into arbitrary
pieces and computed independently. Issues of job management, problem
piece deployment, and system load balancing are either solved
individually (per node) or not addressed at all.
Connected problems, on the other hand, require much more complex
management in virtually every area: problem definition, problem
partitioning, problem piece deployment, problem piece
repartitioning/redeployment, compute node management, and overall
system orchestration. Because the pieces of a connected problem depend
on one another, if any compute node in a connected problem computation
slows down or fails, the entire computation slows or stops. Thus, the
problem management component must be able to address all failures or
problems in the compute landscape.
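
The difference between the two problem classes can be illustrated with
a small sketch. It is not OptimalGrid code: the function names (split,
unconnected_sum_of_squares, smooth, connected_smooth) and the
three-point averaging step are assumptions chosen only for
illustration. The unconnected computation processes each piece in
isolation, while the connected one needs a boundary value from each
neighbouring piece before it can advance.

```python
from typing import List


def split(values: List[float], pieces: int) -> List[List[float]]:
    """Cut a list into `pieces` contiguous chunks of near-equal size.

    Assumes 1 <= pieces <= len(values), so no chunk is empty.
    """
    size, extra = divmod(len(values), pieces)
    chunks, start = [], 0
    for i in range(pieces):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def unconnected_sum_of_squares(values: List[float], pieces: int) -> float:
    """Unconnected: every chunk is processed with no knowledge of the others."""
    partials = [sum(v * v for v in chunk) for chunk in split(values, pieces)]
    return sum(partials)


def smooth(values: List[float]) -> List[float]:
    """One three-point averaging step; the two end values stay fixed."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out


def connected_smooth(values: List[float], pieces: int) -> List[float]:
    """Connected: each chunk needs one boundary cell from each neighbour."""
    chunks = split(values, pieces)
    result: List[float] = []
    for k, chunk in enumerate(chunks):
        # Boundary values that would travel over the network between nodes.
        left = chunks[k - 1][-1:] if k > 0 else []
        right = chunks[k + 1][:1] if k < len(chunks) - 1 else []
        stepped = smooth(left + chunk + right)
        # Keep only the cells this chunk owns; discard the borrowed ones.
        result.extend(stepped[len(left):len(left) + len(chunk)])
    return result
```

In the connected case, a chunk cannot take its next step until its
neighbours have delivered their boundary values, which is why one slow
or failed node holds up every other node.
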
OptimalGrid automates most aspects of solving a large-scale connected
problem on a computing grid, freeing a scientist (or problem owner) to
concentrate on the problem at hand (e.g., computing finite difference
time domain solutions, or perhaps finding a missing baryon). With
OptimalGrid, the problem owner does not have to concern herself with
partitioning the problem, deploying it, enlisting the compute nodes,
delivering the code for the various parts of the distributed
computation, managing the overall problem at runtime, dynamically
rebalancing it (e.g., repartitioning and reapportioning problem pieces,
as well as dynamic node replacement), or employing additional utilities
for report generation, visualization, and data aggregation.
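
As a rough illustration of what reapportioning problem pieces can mean,
the sketch below redistributes a fixed number of pieces in proportion
to each node's measured speed. It is a hypothetical example, not
OptimalGrid's actual algorithm: the function name apportion, the node
names, and the speed figures are all assumptions, and the
largest-remainder rounding rule is just one simple choice.

```python
from typing import Dict


def apportion(piece_count: int, node_speeds: Dict[str, float]) -> Dict[str, int]:
    """Assign pieces to nodes in proportion to speed (largest-remainder rule).

    `node_speeds` maps a hypothetical node name to a positive relative speed.
    """
    total = sum(node_speeds.values())
    quotas = {n: piece_count * s / total for n, s in node_speeds.items()}
    # Start from the whole-number part of each node's ideal share.
    shares = {n: int(q) for n, q in quotas.items()}
    leftover = piece_count - sum(shares.values())
    # Hand the remaining pieces to the nodes with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - shares[n], reverse=True)
    for n in by_remainder[:leftover]:
        shares[n] += 1
    return shares


# Initial deployment, then a rebalance after node "c" slows down,
# then another after node "c" fails and is dropped from the pool.
initial = apportion(12, {"a": 1.0, "b": 2.0, "c": 3.0})
after_slowdown = apportion(12, {"a": 1.0, "b": 2.0, "c": 1.0})
after_failure = apportion(12, {"a": 1.0, "b": 2.0})
```

Calling the function again whenever measured speeds change, or when a
node leaves or joins, yields a new assignment that always covers every
piece exactly once.
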
|