Electrical computers and digital data processing systems: input/ – Input/output data processing – Peripheral adapting
Reexamination Certificate
1998-03-06
2001-06-26
Lee, Thomas (Department: 2182)
C710S065000, C341S051000, C341S107000
active
06253264
ABSTRACT:
BACKGROUND OF THE INVENTION
High performance data compression systems use models of the data to increase their ability to predict values, which in turn leads to greater compression. The best models can be achieved by building a compression system to support a specific data format. Instead of trying to deduce a crude model from the data within a specific file, a format-specific compression system can provide a precise pre-determined model. The model can take advantage not just of the file format structure, but also of statistical data gathered from sample databases.
Previous efforts at format-specific compression have focused on solutions to a few individual formats rather than on the development of a generalized method that could be adapted to many formats. The models which have been created typically involve a small number of components. This works adequately when most of the data is included in a few components, such as an image file having mostly red, blue, and green pixel values. But many formats are best modeled using a large number of components, and the previous systems are not designed to build or encode such models.
SUMMARY OF THE INVENTION
A preferred coding system in accordance with the invention solves both of these problems in the prior art: it provides a generalized solution which can be adapted to handle a wide range of formats, and it effectively handles models that use large numbers of components. The system involves new approaches at many levels: from the highest (interface) level to the lowest (core encoding algorithms) level. The coding network includes a compression network for encoding data and a decompression network for decoding data.
At the highest level, a preferred compression system in accordance with the invention uses an architecture called a Base-Filter-Resource (BFR) system. This approach integrates the advantages of format-specific compression into a general-purpose compression tool serving a wide range of data formats. The system includes filters which each support a specific data format, such as for Excel XLS worksheets or Word DOC files. The base includes the system control modules and a library used by all the filters. The resources include routines which are used by more than one filter, but which are not part of the base. If a filter is installed which matches the format of the data to be encoded, the advantages of format-specific compression can be realized for that data. Otherwise, a “generic” filter is used which achieves performance similar to other non-specific data compression systems (such as PKZip, Stacker, etc.).
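The filter lookup with a generic fallback described above can be illustrated with a minimal sketch. The `Filter` class, the `select_filter` function, the extension-based matching, and the use of `zlib` as a stand-in encoder are all assumptions made for illustration; the source does not specify how a filter recognizes its format or how filters are registered.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Filter:
    """Hypothetical filter record: a name, a format test, and an encoder."""
    name: str
    matches: Callable[[str, bytes], bool]
    encode: Callable[[bytes], bytes]


# Stand-in encoder: a real format-specific filter would parse the file
# and apply per-component models; zlib is used here only as a placeholder.
def placeholder_encode(data: bytes) -> bytes:
    return zlib.compress(data)


# The generic filter accepts anything and is used when nothing else matches.
GENERIC = Filter("generic", lambda name, data: True, placeholder_encode)

# Assumed set of installed filters, matched by file extension for simplicity.
INSTALLED: List[Filter] = [
    Filter("xls", lambda name, data: name.lower().endswith(".xls"), placeholder_encode),
    Filter("doc", lambda name, data: name.lower().endswith(".doc"), placeholder_encode),
]


def select_filter(filters: List[Filter], filename: str, data: bytes) -> Filter:
    """Return the first installed filter that matches, else the generic one."""
    for candidate in filters:
        if candidate.matches(filename, data):
            return candidate
    return GENERIC
```

With this layout, adding support for a new format only means appending another `Filter` to the installed list; the control logic in the base stays unchanged.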
At the next level, the system preferably includes a method for parsing source data into individual components. The basic approach, called “structure flipping,” provides a key to converting format information into compression models. Structure flipping reorganizes the information in a file so that similar components that are normally separated are grouped together.
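As a rough illustration of the regrouping idea, the sketch below treats a file as a list of records with identical layout and turns it into one list per field, so that values of the same kind sit next to each other. The function names `flip` and `unflip` and the tuple-based record layout are assumptions; the source does not describe the parsing at this level of detail.

```python
def flip(records):
    """Group the i-th field of every record into the i-th component list."""
    return [list(column) for column in zip(*records)]


def unflip(components):
    """Reverse the regrouping and rebuild the original record order."""
    return [tuple(row) for row in zip(*components)]


# Hypothetical records: (item id, name, price in cents).
records = [(1, "bolt", 25), (2, "nut", 10), (3, "washer", 5)]
components = flip(records)
# components[0] holds all ids, components[1] all names, components[2] all prices.
```

Each component list can then be handed to a model suited to that kind of data, and `unflip` restores the original order for decoding.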
Upon this foundation are a number of tools, such as:
a language to simplify the creation of parsing routines;
tools to parse the source data using this method into separate components; and
tools to generate models for individual components by automated analysis of sample data bases.
These tools can be applied to the filters for a wide range of file and data types.
The compression system preferably includes tools called customized array transforms for specific filters and for certain types of filters. These techniques handle a specific file type or certain data constructions used by several file types, such as encoding two dimensional arrays of data as found in many database formats.
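The source does not say which transforms are applied to two-dimensional arrays. One plausible example, shown here purely as an assumption, is a column-wise delta transform: each row of a table is replaced by its difference from the previous row, which tends to produce small, repetitive values when columns change slowly. The names `column_delta` and `column_undelta` are hypothetical.

```python
def column_delta(rows):
    """Replace each row by its per-column difference from the previous row."""
    previous = [0] * len(rows[0]) if rows else []
    deltas = []
    for row in rows:
        deltas.append([value - prev for value, prev in zip(row, previous)])
        previous = row
    return deltas


def column_undelta(deltas):
    """Invert column_delta by accumulating the differences column by column."""
    previous = [0] * len(deltas[0]) if deltas else []
    rows = []
    for delta in deltas:
        row = [value + prev for value, prev in zip(delta, previous)]
        rows.append(row)
        previous = row
    return rows


table = [[10, 100], [11, 102], [12, 104]]
transformed = column_delta(table)
```

The first row is stored unchanged (its difference from an all-zero row), so the transform is exactly invertible.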
At the low level, the system preferably includes a number of mechanisms for encoding data arrays, including:
new low-level encoding algorithms;
methods for integrating a large number of transforms and encoding algorithms;
methods for eliminating overhead so that small data blocks can be efficiently coded; and
a new method for integrating both static models (determined from statistical analysis of sample databases) and dynamic models (adapted to the data within a particular array) into the encoding of each component.
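The last item, combining a static model with a dynamic one, can be sketched in a simple way by blending symbol counts. This is only one common approach and is not claimed to be the method of the patent: the class name `BlendedModel`, the `static_weight` parameter, and the count-mixing rule are all assumptions. The static counts are assumed to be non-empty.

```python
from collections import Counter


class BlendedModel:
    """Hypothetical blend of static prior counts and adaptive counts."""

    def __init__(self, static_counts, static_weight=1.0):
        # Static counts stand in for statistics gathered from sample databases.
        self.static = dict(static_counts)
        self.static_total = sum(self.static.values())
        self.weight = static_weight
        # Dynamic counts adapt to the array currently being encoded.
        self.dynamic = Counter()
        self.dynamic_total = 0

    def probability(self, symbol):
        """Estimate the probability of a symbol from both sets of counts."""
        numerator = self.weight * self.static.get(symbol, 0) + self.dynamic[symbol]
        denominator = self.weight * self.static_total + self.dynamic_total
        return numerator / denominator

    def update(self, symbol):
        """Record one observed symbol in the dynamic part of the model."""
        self.dynamic[symbol] += 1
        self.dynamic_total += 1


model = BlendedModel({"a": 3, "b": 1})
before = model.probability("a")
for _ in range(4):
    model.update("b")
after = model.probability("b")
```

Early in an array the static counts dominate, so short arrays still get a usable estimate; as more symbols are seen, the dynamic counts take over. A symbol absent from both sets of counts gets probability zero here, so a real coder would need an escape mechanism for it.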
A preferred method of encoding source data comprises parsing the source data into a plurality of blocks. The parsed blocks typically have a different format than the source data format. In particular, similar data from the source data are collected and grouped into respective blocks.
For each block, a compression algorithm is selected from a plurality of candidate compression algorithms and applied to the block. The compression algorithm can be selected based on the amount of data in the respective block. Furthermore, the compression algorithm can be adapted to the respective block, including the use of a customized transform. The selection of an algorithm can also be based on a compression model, which is derived from the format of the source data. The compressed data from the plurality of blocks are then combined into encoded data.
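A minimal sketch of per-block selection follows, assuming the candidates are the Python standard-library codecs `zlib`, `bz2`, and `lzma` plus a "store" option that leaves the block unchanged. The selection rule (try every candidate, keep the smallest output) and the one-byte codec tag are assumptions for illustration; the patent's own candidate algorithms and selection criteria are not given in this text.

```python
import bz2
import lzma
import zlib

# Codec id -> (name, compress, decompress). Id 0 stores the block unchanged,
# which usually wins for very small blocks where codec overhead dominates.
CODECS = {
    0: ("store", lambda block: block, lambda block: block),
    1: ("zlib", zlib.compress, zlib.decompress),
    2: ("bz2", bz2.compress, bz2.decompress),
    3: ("lzma", lzma.compress, lzma.decompress),
}


def encode_block(block: bytes) -> bytes:
    """Try every candidate and keep the smallest output, tagged with its id."""
    best_id, best_output = 0, block
    for codec_id, (_, compress, _) in CODECS.items():
        output = compress(block)
        if len(output) < len(best_output):
            best_id, best_output = codec_id, output
    return bytes([best_id]) + best_output


def decode_block(encoded: bytes) -> bytes:
    """Read the codec id from the first byte and undo that codec."""
    _, _, decompress = CODECS[encoded[0]]
    return decompress(encoded[1:])
```

Trying every candidate is the simplest possible rule; a system that picks by block size or by a format-derived model, as the text describes, would avoid the cost of running all codecs.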
The coding network can also include a decompression network to convert the encoded data back into the source data. First the data is decoded and then the parsing is reversed. In a lossless system, the resulting data is identical to the source data.
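The lossless round trip (parse, encode, combine, then decode and reverse the parsing) can be sketched end to end under strong simplifying assumptions: the source is a sequence of fixed-size byte records, each byte position forms one block, every block is compressed with `zlib`, and blocks are combined with 4-byte length prefixes. The container layout and the function names `encode` and `decode` are hypothetical.

```python
import struct
import zlib


def encode(source: bytes, record_size: int) -> bytes:
    """Parse fixed-size records into per-position blocks and compress each."""
    if len(source) % record_size:
        raise ValueError("source length must be a multiple of record_size")
    # Block i collects byte i of every record.
    blocks = [source[i::record_size] for i in range(record_size)]
    parts = [struct.pack("<I", record_size)]
    for block in blocks:
        compressed = zlib.compress(block)
        parts.append(struct.pack("<I", len(compressed)) + compressed)
    return b"".join(parts)


def decode(encoded: bytes) -> bytes:
    """Decode every block, then reverse the parsing by interleaving bytes."""
    (record_size,) = struct.unpack_from("<I", encoded, 0)
    position = 4
    blocks = []
    for _ in range(record_size):
        (length,) = struct.unpack_from("<I", encoded, position)
        position += 4
        blocks.append(zlib.decompress(encoded[position:position + length]))
        position += length
    output = bytearray(sum(len(block) for block in blocks))
    for i, block in enumerate(blocks):
        output[i::record_size] = block  # put byte i back into every record
    return bytes(output)


# Hypothetical source: 100 records of (16-bit counter, two constant bytes).
source = b"".join(struct.pack("<HBB", n, 7, 0) for n in range(100))
restored = decode(encode(source, 4))
```

Decoding runs the two encoding stages in reverse order, which is what makes the result byte-for-byte identical to the source.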
Embodiments of the invention preferably take the form of machine executable instructions embedded in a machine-readable format on a CD-ROM, floppy disk or hard disk, or another machine-readable distribution medium. These instructions are executed by one or more processing units to implement the compression and decompression networks.
REFERENCES:
patent: 5293379 (1994-03-01), Carr
patent: 5467087 (1995-11-01), Chu
patent: 5617541 (1997-04-01), Albanese et al.
patent: 5632009 (1997-05-01), Rao et al.
patent: 5638498 (1997-06-01), Tyler et al.
patent: 5684478 (1997-11-01), Panaoussis
patent: 5708828 (1998-01-01), Coleman
patent: 5710562 (1998-01-01), Gormish et al.
patent: 5838964 (1998-11-01), Gubser
patent: 5864860 (1999-01-01), Holmes
patent: 5867114 (1999-02-01), Barbir
patent: 5881380 (1999-03-01), Mochizuki et al.
patent: 5983236 (1999-11-01), Yager et al.
patent: 5991515 (1999-11-01), Fall et al.
patent: 0 729 237 A2 (1996-08-01), None
Rice, R.F., et al., “VLSI Universal Noiseless Coder,” NTIS Tech Notes 2301, p. 234 (Mar. 1, 1990).
Diner, D.B., et al., “Competitive Parallel Processing for Compression of Data,” NTIS Tech Notes 2301, p. 379 (May 1, 1990).
Hamilton Brook Smith & Reynolds P.C.
Intelligent Compression Technologies
Lee Thomas
Nguyen Tanh