Method and system for reflecting differences between two files

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

active

06233589

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to backup and synchronization of files, and in particular relates to a method and system for reflecting differences between two files.
BACKGROUND OF THE INVENTION
Copies of files are frequently transmitted over a network from one computer to another computer. One reason to copy a file is for backup purposes. If a file created on one computer has been backed up on another computer, it can be easily recovered in the event the hard drive of the first computer fails. Because the loss of a file could mean the loss of extremely important data, and/or result in significant effort to recreate the file, file backup processes are very common. However, file backup has at least two problems associated with it: first, it can require significant network bandwidth to transfer file data from the first computer to the backup computer, and second, it can require significant storage space to maintain copies of files. Both of these problems can be alleviated to some extent through the use of an incremental backup. An incremental backup copies only those files that have been changed since the previous backup. Incremental backups can significantly reduce the number of files that are backed up on a periodic basis.
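The selection step of an incremental backup can be sketched as follows. This is a minimal illustration only, assuming that a file's modification time is a reliable indicator of change; the function name `files_changed_since` and the timestamp parameter are hypothetical and are not taken from any particular backup product.

```python
import os

def files_changed_since(root, last_backup_time):
    """Return the files under root modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    Assumption: modification time alone identifies a changed file.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only files touched since the previous backup are selected.
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Note that each selected file is still copied in its entirety, however small the modification to it may have been.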
Typically, when a file is modified, only a small portion of the file is actually changed from the previous version. While an incremental backup can reduce network bandwidth and save storage space compared to a complete backup, it is still inefficient in that a complete file is backed up even though it is possible that only a small portion of the file was actually modified. In an attempt to improve upon incremental backups, backup processes exist that identify the differences between two versions of a file, and attempt to back up only those differences. This is referred to as a differencing process. Differencing processes can reduce network bandwidth and storage requirements because only the changed portions of the file are backed up.
Copies of files are also frequently made for purposes of synchronization or replication. A synchronized file exists in two different locations, such as on two different servers, and changes made to one file must be reflected in the other file. Synchronization usually occurs by periodically copying the file from one location to the other location.
U.S. Pat. No. 5,634,052 discloses a system for reducing storage requirements in a backup subsystem. The system includes creating a delta file reflecting the differences between a base file and a modified version of the base file, and transmitting the delta file to a server for backup purposes. One problem associated with this system is that the base file is necessary to create the delta file that reflects the differences between the base file and the revised file. Thus, if the delta file is to be created on another computer, such as the server, the base file must first be transmitted to the server where the differencing operation is carried out. Moreover, the '052 patent does not disclose optimal mechanisms for creating the delta file.
In a differencing backup system, the differencing mechanism used to create the delta file can be quite important. It is not uncommon for files to be many megabytes in size. A differencing mechanism that processes a file multiple times, or processes a file in an inefficient manner, can result in excessive backup times. Moreover, an inefficient differencing mechanism can result in more data being backed up than necessary. In other words, two differencing mechanisms can vary in their ability to efficiently recognize and reflect differences between two files. Also, it would be preferable for a differencing mechanism to be able to determine differences between a base file and a modified version of the base file without having to repeatedly process the base file, so that the differencing operation can be performed on a remote computer without the need to process the entire base file.
U.S. Pat. No. 5,574,906 discloses a system and method for reducing storage requirements in backup subsystems. The '906 patent discloses a system similar to that disclosed in the '052 patent, with the enhancement that the base file from which the differencing operation is derived can be compressed. For certain files, a compressed base file will utilize less bandwidth and less storage space on a computer than would an uncompressed base file. One problem with this approach is that the compressibility of files differs greatly. While compression can significantly reduce the size of some files, compression algorithms do not obtain significant reduction with other types of files. Additionally, the differencing mechanism of the '906 patent works by first compressing the revised version of the file, and upon determining that compressed portions of the base file and the revised file differ, both the base file and the revised file are uncompressed at those locations so that the differences between the two files can be determined. The overhead involved in such compression/decompression algorithms can be significant.
U.S. Pat. No. 5,479,654 discloses an apparatus and method for generating a set of representations of the changes made in a computer file during a period of time. The process disclosed in the '654 patent makes multiple passes through portions of the most recent version of the file to determine the differences between it and the previous version of the file.
Thus, it is apparent that a differencing system that reduces network traffic, quickly and efficiently determines and reflects differences between two files, and reduces storage requirements would be highly desirable.
SUMMARY OF THE INVENTION
It is one object of the present invention to provide a method and system for determining the differences between a base file and a modified file without the need for a copy of the base file.
It is another object of the present invention to provide a method and system for reflecting the differences between two files that is highly efficient and reduces network traffic.
It is yet another object of the present invention to provide a method and system that can determine the differences between a base file and a modified file either on a client computer or a server computer without a need for a copy of the base file.
It is still another object of the present invention to provide a method and system for creating a signature of a base file that can be used to determine the differences between a base file and a modified file.
Additional objects, advantages and other novel features of the invention will be set forth in part in the description that follows and, in part, will become apparent to those skilled in the art upon examination of the invention. To achieve the foregoing and other objects, and in accordance with the purposes of the present invention as described above, a method and system for reflecting differences between two versions of a file is provided. The method includes generating, from a base file, a base signature file that includes a plurality of base bit patterns. Each bit pattern is generated as a function of a portion of data in the base file. A revised version of the base file is created. A revised signature file, including a plurality of revised bit patterns, is generated from the revised file. Each revised bit pattern matches at least one of the base bit patterns. Based on differences between the base signature file and the revised signature file, the revised file is accessed and a delta file reflecting the differences between the base file and the revised file is generated.
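The general idea of deriving a delta from signatures alone can be sketched as follows. This is a simplified illustration under stated assumptions, not the mechanism of the invention: it assumes fixed-size, fixed-position blocks, uses a SHA-256 digest as the "bit pattern", and does not handle data that has shifted position because of an insertion or deletion. The names `make_signature`, `make_delta`, `apply_delta`, and `BLOCK_SIZE` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for illustration

def make_signature(data, block_size=BLOCK_SIZE):
    """Return one bit pattern (here a SHA-256 digest) per block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def make_delta(base_signature, revised, block_size=BLOCK_SIZE):
    """Build a delta from the base signature and the revised data only.

    The base file itself is never read here.
    """
    changed = {}
    for index, pattern in enumerate(make_signature(revised, block_size)):
        # A block is carried in the delta when it is new or its pattern differs.
        if index >= len(base_signature) or base_signature[index] != pattern:
            start = index * block_size
            changed[index] = revised[start:start + block_size]
    return {"length": len(revised), "blocks": changed}

def apply_delta(base, delta, block_size=BLOCK_SIZE):
    """Rebuild the revised data from the base data plus the delta."""
    out = bytearray()
    block_count = -(-delta["length"] // block_size)  # ceiling division
    for index in range(block_count):
        start = index * block_size
        if index in delta["blocks"]:
            out += delta["blocks"][index]
        else:
            out += base[start:start + block_size]
    return bytes(out[:delta["length"]])
```

In this sketch only `make_signature` reads the base data when the delta is produced, so the signature could be computed once and held wherever the revised file resides. The base data is needed again only by `apply_delta`, on the machine that reconstructs the revised file.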
Once the base signature file is generated from the base file, the base file need not be accessed to generate the delta file reflecting the differences between the base file and the revised file. Thus, according to the present invention, the base signature file, which is a small fraction in size of the original base file, can be transmitted over a network to another computer, such as a server, and the differencing operation to generate the delta file can be carri
