General Parallel File System (GPFS) is a scalable, parallel cluster file product that originated as
Almaden's Tiger Shark file system. It now supports IBM® Blue Gene® and IBM® Cluster systems, including
the Linux (Cluster 1350) and the AIX (Cluster 1600) systems.
Tiger Shark was originally developed for large-scale multimedia, but in its GPFS incarnation
has been extended to support the additional requirements of parallel computing. GPFS supports single
cluster file systems of multiple petabytes and has run at I/O rates of more than 100 gigabytes per
second. It has recently evolved to be used in multi-cluster grid systems with high-bandwidth access
to data from multiple storage clusters across wide geographic areas.
GPFS is the file system for the ASC Purple Supercomputer. ASC (the Advanced Simulation and Computing
program) is a Department
of Energy initiative to use computer simulation rather than nuclear testing to ensure the safety,
reliability and performance of the nuclear stockpile. This requires computational, storage and I/O
capabilities far beyond what existed before. ASC Purple is the current-generation computing platform
at Lawrence Livermore, featuring 12,000 processors, a data store of 2 petabytes, and I/O rates over
130 GB/sec to a single file or multiple files.
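To put the ASC Purple figures in perspective, the short calculation below estimates how long one full pass over the data store would take at the quoted I/O rate, and what that rate amounts to per processor. It is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), a sustained rate of exactly 130 GB/sec, and an even split across all 12,000 processors, none of which the text above states.

```python
# Figures quoted for ASC Purple; decimal (SI) units are an assumption.
STORE_BYTES = 2 * 10**15          # data store of 2 petabytes
RATE_BYTES_PER_SEC = 130 * 10**9  # aggregate I/O rate of 130 GB/sec
PROCESSORS = 12_000


def full_pass_seconds(store_bytes: int, rate_bytes_per_sec: int) -> float:
    """Time to stream the whole store once at a constant aggregate rate."""
    return store_bytes / rate_bytes_per_sec


seconds = full_pass_seconds(STORE_BYTES, RATE_BYTES_PER_SEC)
hours = seconds / 3600

# Aggregate rate divided evenly over all processors (an idealisation).
per_processor_mb = RATE_BYTES_PER_SEC / PROCESSORS / 10**6

print(f"full pass: {seconds:.0f} s ({hours:.2f} h)")
print(f"per-processor share: {per_processor_mb:.1f} MB/sec")
```

Under these assumptions a complete pass over the store takes a little over four hours, and each processor's share of the aggregate bandwidth is roughly 10 MB/sec.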
Recently, the scope of GPFS was extended to include the new Blue Gene machines. In this environment,
GPFS provides high-bandwidth I/O to the Blue Gene compute nodes using daemons that relay I/O requests
to the designated I/O nodes. The I/O nodes form a GPFS cluster that communicates in parallel with another
(typically Linux) cluster outside the Blue Gene machine. The external cluster holds the physical
connections to the disk volumes and acts as a set of remote disk servers for the cluster within the Blue Gene.
These systems can comprise thousands of I/O nodes and tens of thousands of compute nodes.
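The three-tier arrangement described above (compute nodes, I/O nodes, external disk servers) can be pictured with a small toy model. This is not GPFS code and uses no GPFS API: the class names, the static assignment of compute nodes to I/O nodes, and the round-robin placement of blocks across disk servers are all hypothetical, chosen only to show how a request is relayed from tier to tier.

```python
from dataclasses import dataclass, field


@dataclass
class DiskServer:
    """Node in the external cluster; the only tier attached to disk volumes."""
    name: str
    blocks: dict = field(default_factory=dict)

    def write(self, block_id: int, data: bytes) -> None:
        self.blocks[block_id] = data

    def read(self, block_id: int) -> bytes:
        return self.blocks[block_id]


@dataclass
class IONode:
    """Blue Gene I/O node: has no local disks and forwards to disk servers."""
    name: str
    servers: list

    def _server_for(self, block_id: int) -> DiskServer:
        # Hypothetical placement rule: round-robin by block number.
        return self.servers[block_id % len(self.servers)]

    def write(self, block_id: int, data: bytes) -> None:
        self._server_for(block_id).write(block_id, data)

    def read(self, block_id: int) -> bytes:
        return self._server_for(block_id).read(block_id)


@dataclass
class ComputeNode:
    """Compute node: relays every request to its designated I/O node."""
    rank: int
    io_node: IONode

    def write(self, block_id: int, data: bytes) -> None:
        self.io_node.write(block_id, data)

    def read(self, block_id: int) -> bytes:
        return self.io_node.read(block_id)


def build_system(n_compute: int, n_io: int, n_servers: int):
    """Wire up the three tiers; contiguous compute ranks share an I/O node."""
    servers = [DiskServer(f"srv{i}") for i in range(n_servers)]
    io_nodes = [IONode(f"io{i}", servers) for i in range(n_io)]
    per_io = -(-n_compute // n_io)  # ceiling division
    compute = [ComputeNode(r, io_nodes[r // per_io]) for r in range(n_compute)]
    return compute, io_nodes, servers


compute, io_nodes, servers = build_system(n_compute=8, n_io=2, n_servers=4)
compute[0].write(5, b"checkpoint")   # travels via io0 to srv1
print(compute[7].read(5))            # read back via io1 from the same server
```

Because every I/O node reaches the same set of disk servers, data written through one I/O node is visible to compute nodes attached to another, which is the property the shared cluster file system provides in the real configuration.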
In addition to high-speed parallel file access, GPFS provides fault tolerance, including automatic recovery
from disk and node failures. Its robust design and multi-node access have made GPFS the chosen file system
for a number of commercial applications such as large Web servers, data mining, digital libraries, file servers
and online databases.
IBM Almaden Research - File Systems