Data processing: software development – installation – and managem – Software program development tool – Code generation
Reexamination Certificate
2000-04-28
2002-12-17
Dam, Tuan Q. (Department: 2124)
Data processing: software development, installation, and managem
Software program development tool
Code generation
C717S122000, C717S169000, C707S793000, C710S002000, C341S051000
Reexamination Certificate
active
06496974
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to the updating of computer software by use of patch files, the generation of such patch files, and the normalization of files to which such patches are applied.
BACKGROUND AND SUMMARY OF THE INVENTION
Popular computer programs, including computer operating system software, are subject to near-constant revision. Their evolution is sometimes so rapid that, a month after installation, a newer version is available. The newer version may feature additional capabilities, bug fixes, and enhanced compatability with other programs. Accordingly, many users desire to update their programs each time a newer version is released.
A user wishing to update a computer program can either acquire a new copy of the program, or “patch” the old. Patching is growing in popularity, particularly with the growth of the internet. Patches for updating many popular computer programs are now commonly available from software vendor's web sites, allowing users to update their software programs without leaving home.
Patching is an old technology, going back decades. Generally, patch files include a series of instructions specifying how a new version of a file can be assembled from snippets of data from an old version of the file, together with insertions of new data. An exemplary series of patching instructions may look like the following:
1. Load old file ABC.EXE into memory;
2. Check that the file data at offset
16
reads “Version 2.04”; if not, fail;
3. Copy bytes
1
through
16
of the old file into a new file;
4. Next, insert the ASCII text “Version 3.02” into the new file;
5. Next, copy bytes
22
-
256
of the old file into the new file;
6. Next, insert the following hex data into the new file:
09
03
00
01
60
6B
F5
D5
3B
59
1A
10
B5
69
08
00
7. Next, copy bytes
289
-
496
of the old file into the new file;
8. Next, copy bytes
505
-
512
into the new file;
It will be recognized that the foregoing instructions result in a new version of file ABC.EXE in which:
the first
16
bytes are unchanged;
the version number stored at bytes
17
-
28
has been rewritten from “Version 2.04” to “Version 3.02”
bytes
22
-
256
are unchanged;
32 bytes of hex data at bytes
257
-
288
have been rewritten;
bytes
289
-
496
are unchanged;
bytes
497
-
504
have been omitted; and bytes
505
-
512
have been shifted to immediately follow byte
496
.
Due to the replication of long strings of data from the old file in the new file, the series of patching instructions is much shorter than the file being patched. This size economy is the reason patching is more popular than transferring an entire copy of the new file.
The process of generating patching instructions, like those reproduced above, is typically automated. The vendor inputs copies of the new and old program file to a pattern matching algorithm, which tries to locate where strings of data in the new file can be found in the old file. Where such matches are found, appropriate copy instructions are generated and added to the collection of instructions that will form the patch file. Where data in the new file has no counterpart in the old, the new data is literally specified in the patch file. When completed, the patch file—in conjunction with the old version of the file—contains all the information necessary to generate the new version of the file.
After the patching instructions have been specified in a patch file, the file is typically compressed to minimize its size and download time (assuming an internet or other network download). Many suitable compression processes are known. Various implementations of the popular LZ compression algorithms typically reduce file sizes on the order of 50%.
After the patch file is compressed, it is transferred from the vendor's computer to the user's computer—by internet in this example. On the user's computer a decompression process is first performed to restore the patching instructions to their original form. Then the various operations specified by the patching instructions are performed, transforming a copy of the user's old file into the latest version.
While the just-described process is a great improvement over transferring a new copy of the complete program file from the vendor to the user, it still suffers from certain drawbacks.
One is the size of the compressed patch file. As discussed below, patch file sizes considerably smaller than those resulting from prior art processes are possible, reducing internet download times (or reducing needed disk storage) commensurately.
Another problem is that the version of the old file on the user's computer may not precisely match the version distributed by the vendor. In particular, the file may have been tailored in certain respects—at the time of installation on the user's computer—to better conform to particular characteristics of the user's computer. Thus, for example, a program file as installed on a single-processor computer may be slightly different than the “same” program file as installed on a multi-processor computer. Unless the precise contents of the file as installed on the user's computer are known, patching is a risky business.
When a software vendor knows that there are several different versions of a file to be updated, the vendor may publish a multi-version patch file. Such a patch file can be a concatenation of several different sets of patching instructions, each one applicable to a different version of the file. The drawback of this approach is that half, or more, of the patch file is superfluous data—inapplicable to the file stored on a particular user's computer. Thus, its download time is far longer than is really necessary.
Another type of multi-version patch file has a general set of patching instructions (for code that is consistent through all versions of the old file), together one or more specialized sets of patching instructions (for code that is different between different versions of the old file). Branch instructions in the patching file examine particular characteristics of the old file, and apply the appropriate set of specialized patching instructions.
Again, this approach suffers by reason of more patching data than is needed for any given user.
In accordance with a preferred embodiment of the present invention, the foregoing and additional drawbacks of the prior art are overcome. The two distinct operations of pattern matching and compression (performed on the vendor's computer in prior art patch generation techniques) are replaced by a single operation that both compares old and new file versions, and produces a compressed output by which the latter can be generated from the former. Likewise, the two distinct operations of decompression and patch instruction application (performed on the user's computer in the prior art) are replaced by a single operation that both decompresses the patch file data and results in recreation of the new file. The patch file generated and used in these processes is of considerably reduced size—sometimes half the size of compressed patch files produced by prior art approaches.
In the preferred embodiment, these advantages are achieved by use of compression/decompression processes in which the compressor (and decompressor) is pre-initialized in accordance with the old version of the file being updated. In implementations using LZ77-type compression, this pre-initialization takes the form of preloading the respective compressor/decompressor history windows with the old version of the file. On the vendor side, the new file is applied to the pre-initialized compressor, yielding the patch file as output. The compressor both identifies redundancies between the new file and the old file (with which the compressor's history window has been preloaded), and provides a highly compressed output. On the user's side, the patch file is decompressed using a parallel process.
(LZ77 is a form of adaptive dictionary compression named after Lempel/Ziv's 1977 paper “A Universal Algorithm for Sequential Data
Forbes Jonathan A.
McGuire Thomas D.
Sliger Michael V.
Dam Tuan Q.
Klarquist & Sparkman, LLP
Microsoft Corporation
LandOfFree
File update performing comparison and compression as single... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with File update performing comparison and compression as single..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and File update performing comparison and compression as single... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2975376