IBM
Skip to main content
 
Search IBM Research
     Home  |  Products & services  |  Support & downloads  |  My account
[an error occurred while processing this directive]
 Select a country
 IBM Research Home
Almaden Research
About Us
Visitor Information
Almaden Projects
IBM Research News
Career Opportunities
Feedback

 
 


IBM Research - Almaden
Delta Compression

Past Project

In this project, we study delta compression, the algorithmic task of finding common strings between versions of data and using the commonality between versions to compactly encode one version by describing it as a set of changes from its companion. Previous differencing algorithms run in O(n^2) time and O(n) space.

We have developed a new algorithm that runs in O(n) time and O(1) space that closely approximates the compression performance of the previous algorithms. We have also developed two additional algorithms based on heuristics which improve the compression performance of the O(n) algorithm and the runtime performance of the O(n^2) algorithm.

Our treatment includes theoretical results that limit the compression power of algorithms under natural performance assumptions. All of our algorithms for differencing operate at a fine granularity and make no assumptions about the format or alignment of input data. We have also explored their performance properties experimentally.

In addition, we have developed an algorithm for modifying delta compressed files so that the compressed versions may be reconstructed without scratch space. This allows network clients with limited resources to efficiently update software by retrieving delta compressed versions over a network.

Delta compression for binary files (compactly encoding a version of data with only the changed bytes from a previous version) may be used to efficiently distribute software over low bandwidth channels, such as the Internet. Traditional methods for rebuilding these delta files require memory or storage space on the target machine for both the old and new versions of the file to be reconstructed.

With the advent of network computing and Internet-enabled devices, many of these network attached target machines have limited additional scratch space. We have developed an algorithm for modifying a delta compressed version file so that it may rebuild the new file version in the space that the current version occupies.

   
Link to content List of Almaden Projects

Technical Papers

R. C. Burns and D. D. E. Long
Link to content in pdf format In-place Reconstruction of Delta Compressed Files
In Proceedings of the 1998 Conference on the Principles of Distributed Computing (PODC), ACM, 1998.

(Acrobat PDF, 90.6 KB)

R. C. Burns and D. D. E. Long
Link to content in pdf format Efficient Distributed Backup and Restore with Delta Compression
In Proceedings of the 1997 Workshop on I/O in Parallel and Distributed Systems (IOPADS), ACM, 1997.

(Acrobat PDF, 91.5 MB)

R. C. Burns and D. D. E. Long
Link to content in pdf format A Linear Time, Constant Space Differencing Algorithm
In Proceedings of the 1997 International Performance, Computing and Communications Conference (IPCCC), IEEE, 1997.

(Acrobat PDF, 84.3 KB)

R. C. Burns
Link to content in pdf format Differential Compression: A Generalized Solution for Binary Files
A thesis in completion of the Master's of Science degree, Department of Computer Science, University of California, Santa Cruz, December 1997.

(Acrobat PDF, 190 KB)

  About IBM  |  Privacy  |  Terms of use  |  Contact