Apparatus and method for reconstructing a file from a...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C715S252000

Reexamination Certificate

active

06816872

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to techniques for representing file differences useful in computer file protect systems and other systems, and more particularly to file transfer techniques useful in an electronic data backup system wherein only changes in a file are periodically sent to the backup system and in other systems.
2. Discussion of Prior Information
It is well known to off-load computers at the end of a work day to secure the data file against computer failure. It is also known to transmit the file to an off-site location for additional file security.
What is not known is the generation of a set of representations of the changes in a file, and the periodic relocation of that set of representations and its use to update the previous version of the file.
SUMMARY OF THE INVENTION
Accordingly it is an object of the invention to generate a set of representations of the changes made in a computer file during a period of time.
Another object of the invention is to generate a set of representatives of the changes made in a computer file which can be used to update an earlier version of the file, or to create a previous version of an updated file.
Still another object of the invention is to generate and to use such a set of representations in a cost and time effective manner.
The objects of the invention are achieved through computer programs designed to run on a micro- and mini-computers. A first or SCAN program is designed to create a TOKEN Table (or file signature) of mathematical representations of segments of the file as it exists at the start of a period (earlier file (EF)). The TOKEN Table reflects the indices (ordinal numbers) for all of the segments in the earlier file, and the exclusive-or (XR) and cyclic redundancy check (CRC) products of the set of characters for each segment. Actually, two CRC products are generated for each segment; a sixteen bit one and a thirty-two bit one. The three products, XR and two CRC, are generated for speed in comparisons: the XR product is first compared because it is the fastest comparison; then the slower sixteen bit CRC one if necessary; and finally the still slower thirty-two bit CRC if necessary.
A second program is used at the end of the period to create a MATCH Table setting forth the location of segments in the current file that are identical to those in the earlier file. The MATCH Table lists the indices of all of the segments in the earlier file and the file offsets of the first character of the corresponding identical segment in the updated file. The second program calculates the mathematical representations of the first segment (window) in the updated, revised or current file, first calculating only the XR product and comparing it to the XR product for the first earlier-file segment in the TOKEN Table and noting whether a match exists. If so, it then calculates the sixteen bit CRC product and compares it to the sixteen bit early file CRC product and notes whether a match exists; if so, it finally calculates the more time consuming but more reliable thirty-two bit CRC product and compares it to the thirty-two early file CRC product and notes whether a match exists; and if so, makes an index and offset entry in the MATCH Table for the identical segments; the offset entry being the ordinal number of the first character in the current file segment string of characters. (The earlier file segments are numbered (indexed) sequentially.). If a segment match is obtained, the second program calculates one or more mathematical representations for the next segment in the current file, and compares them to the products associated with the next index in the TOKEN Table and representing the second segment of the earlier file. However, if a mismatch obtained, the window (which retains segment size) is bumped along one character, new product(s) calculated for the window characters and comparison(s) again made with the same representations of the earlier file segments in the TOKEN Table. This continues until a match obtains at which time the index for the earlier file segment and the offset of the first character in the nonmatching current file window (segment) are recorded in the MATCH Table. The process then continues as above to the end of the current file. Only the XR product is calculated in the event of an XR product mismatch; the sixteen bit and the thirty-two bit CRC products being generated respectively only in the event of earlier matches of the XR and sixteen bit CRC products.
A third program creates a TRANSITION Table that reflects what's in the current file that's not in the earlier table, and where. It scrolls through the list of indices and offsets in the MATCH Table, to see if each offset number differs from the previous one by the segment size. When such an offset differs from the previous one by more than the segment size, it adds the segment size to the first offset to determine the file ordinal number of the first character in the matching information, subtracts one from the second offset to determine the last character, goes to the current file and lifts therefrom that set of characters beginning with that ordinal number and stopping with the character preceding the extra-spaced offset, and adds them to the MATCH Table to create with the index a TRANSITION Table.
Thus creation of the Transition Table involves assuring that every character in current file is accounted for in the TRANSITION Table. The MATCH Table provides all of the information necessary for this accounting. Each entry in the beginning column represents a match in the early file of segment characters to the current file characters at location beginning. The matching segment in the early file is located at that offset, which is equal to the index times the segment size in early file.
Essentially the same process is followed for a deletion. The second program, if no match obtained for an earlier file segment by the end of the updated file (or over a predetermined number of segments as conditioned by the character of the file), would have proceeded to endeavor to match the next index mathematical representations in the TOKEN Table with a current file segment, with no offset entry having been made in the MATCH Table for the index of the segment that was unmatched. On proceeding with the index and representations of the next earlier-file segment, the window of the current file would be bumped along, and the index and offset number entered in the MATCH Table when the match of the mathematical representations occurred. The third program on scrolling through the MATCH Table offsets, notes the missing offset, notes the preceding offset, adds the segment size to the previous offset and copies from that number forward the reduced characters if any in the current file before the next offset, into the TRANSITION Table and in association with the index number of the unmatched segment.
The TRANSITION Table is used to update a copy of the earlier file. Typically, a fourth program and the earlier version of the file are on an off-site location and the TRANSITION Table representations are electronically transmitted thereto. The fourth program will examine the indexes and offsets of the TRANSITION Table, copying segments from the earlier file where the succeeding offset just differs by the segment size, into what is to be a duplicate version of the updated file, making additions where the offset numbers differ from the preceding ones by more than the segment size with the information provided in the TRANSITION Table, and substitutions from the TRANSITION Table where the offset numbers are missing.
As observed earlier, the TOKEN Table mathematical representations of file segments may be the products of exclusive-oring of the characters in successive earlier file segments and of generating two cyclic redundancy check (CRC) products for each earlier file segment. Corresponding XR products are most quickly generated, but do not detect character order differentiating; a sixteen bit CRC will catch most of these transpositio

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Apparatus and method for reconstructing a file from a... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Apparatus and method for reconstructing a file from a..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Apparatus and method for reconstructing a file from a... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3326457

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.