IBM   Almaden Computer Science About Almaden Computer Science Press Careers Home
CS home
  About us
  Ease of use
  Delta Compression
  LH* Schemes
  RAID controllers
  Serverless Backup
search the web for java
patent server
Delta Compression
In-Place Reconstruction for Internet Software Distribution

Delta compression - can it save you time, space or money?

In this project, we study delta compression, the algorithmic task of finding common strings between versions of data and using the commonality between versions to compactly encode one version by describing it as a set of changes from its companion. Previous differencing algorithms run in O(n^2) time and O(n) space.

We have developed a new algorithm that runs in O(n) time and O(1) space that closely approximates the compression performance of the previous algorithms. We have also developed two additional algorithms based on heuristics which improve the compression performance of the O(n) algorithm and the runtime performance of the O(n^2) algorithm.

Our treatment includes theoretical results which limit the compression power of algorithms under natural performance assumptions. All our algorithms for differencing operate at a fine granularity and make no assumptions about the format or alignment of input data. We have also explored their performance properties experimentally.

We have also developed an algorithm for modifying delta compressed files so that the compressed versions may be reconstructed without scratch space. This allows network clients with limited resources to efficiently update software by retrieving delta compressed versions over a network.

Delta compression for binary files, compactly encoding a version of data with only the changed bytes from a previous version, may be used to efficiently distribute software over low bandwidth channels, such as the Internet. Traditional methods for rebuilding these delta files require memory or storage space on the target machine for both the old and new version of the file to be reconstructed.

With the advent of network computing and Internet-enabled devices, many of these network attached target machines have limited additional scratch space. We have developed an algorithm for modifying a delta compressed version file so that it may rebuild the new file version in the space that the current version occupies.


Almaden Home | IBM Research | Legal | Feedback