RAID algorithms have provided excellent reliability for storage systems over the last 20 years. However,
they rely on the assumption that writes to disks are atomic and that a write issued to a disk actually
completes. Write failures are rare, but they do occur, and in large storage systems they occur often
enough to create a small but non-trivial probability of data corruption. Because such corruption is
silent, it can be more dangerous than data loss, which the storage system would report.
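
To make the failure mode concrete, the following is a minimal sketch of a single XOR-parity stripe in which one write is acknowledged but never reaches the disk. It is an illustration only, not the project's algorithm: the `Stripe` class, the `lost` flag, and the `parity_consistent` check are hypothetical names introduced here, and the model assumes that ordinary reads return the data block without verifying parity.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Stripe:
    """Toy model of one parity stripe (hypothetical, for illustration only)."""

    def __init__(self, blocks):
        self.data = [bytes(b) for b in blocks]
        self.parity = xor_parity(self.data)

    def write(self, index, new_block, lost=False):
        # Read-modify-write parity update: parity ^ old data ^ new data.
        old_block = self.data[index]
        self.parity = xor_parity([self.parity, old_block, bytes(new_block)])
        # A lost write is acknowledged, but the data block keeps its old contents.
        if not lost:
            self.data[index] = bytes(new_block)

    def read(self, index):
        # Assumption: a normal read returns the block without checking parity.
        return self.data[index]

    def parity_consistent(self):
        # An explicit check (as a scrub might do) recomputes and compares parity.
        return xor_parity(self.data) == self.parity


stripe = Stripe([b"AAAA", b"BBBB", b"CCCC"])
stripe.write(1, b"XXXX", lost=True)   # acknowledged, never persisted
stale = stripe.read(1)                # stale data returned with no error
consistent = stripe.parity_consistent()
```

In this sketch the read succeeds and returns the old block, so nothing is reported to the caller; the mismatch becomes visible only when parity is explicitly recomputed.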
Our Data Integrity research project focuses on developing techniques and algorithms that report and correct
silent data corruption events while maintaining system performance. In 2007 we are prototyping an initial
algorithm that might be included in several IBM Storage products.
IBM Almaden Research - Advanced Storage Systems