Image analysis – Histogram processing – With pattern recognition or classification
Reexamination Certificate
1997-11-07
2001-10-23
Johnson, Timothy M. (Department: 2623)
C382S224000, C382S237000
active
06307962
ABSTRACT:
FIELD OF INVENTION
The present invention relates to a system (method and apparatus) for document data compression, and particularly, to a system for document data compression which automatically segments a document into segments, classifies these segments as different types of image information, and then compresses the document based upon the segment classification to generate a compressed document (referred to herein as a smart document since the compressed data generally reflects knowledge of the character of the image on the document). The system is further capable of rendering a reproduction of the document from data representing the smart document.
BACKGROUND AND ADVANTAGES OF THE INVENTION
Digital documents are generated every time a printed page or film is received by a facsimile machine, scanner, digital photocopier, or other similar digital input devices. These digital documents are composed of an array of pixels with values representing gray scale. Generally, these digital documents contain different types of image information, such as text having different background and foreground gray scale values, continuous tone images, graphics, and halftone images, which may be mixed on a document page.
Conventional facsimile machines operate on the data of digital documents to provide a representation of the document suitable for transmission and subsequent rendition by a receiving facsimile machine. These operations are often referred to as rendition methods and include, for example, ordered dithering, error diffusion, and binarization (bi-level quantization). Applying these rendition methods forms a bit map representation of the document. Typically, facsimile machines apply a single rendition method to the entire image. This fails to adequately reproduce documents having mixed types of image information because not all image types can be properly reproduced by the same rendition method. For example, binarizing may be proper for text images, but when applied to continuous tone images, the gray scale transitions of the image are lost. Conversely, ordered dithering or error diffusion can halftone a continuous tone image, but applying such methods to text images blurs the edges of the text, which sometimes makes the text illegible. Thus, applying an improper rendition method to different image components of a document produces distortions which degrade reproduction quality.
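The contrast between binarization and error diffusion can be illustrated with a minimal sketch. The text above does not specify a threshold or a diffusion kernel, so the mid-scale threshold of 128 and the Floyd-Steinberg weights used here are assumptions chosen only for illustration, and the function names `binarize` and `error_diffuse` are hypothetical.

```python
# Illustrative sketch only: the threshold (128) and the Floyd-Steinberg
# weights are assumptions, not values taken from the patent text.

def binarize(img, threshold=128):
    """Bi-level quantization: each pixel becomes 0 or 255 independently."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


def error_diffuse(img, threshold=128):
    """Error diffusion with Floyd-Steinberg weights (one common kernel).

    The quantization error of each pixel is pushed onto unprocessed
    neighbors, so the local average gray level is roughly preserved.
    """
    h, w = len(img), len(img[0])
    work = [[float(p) for p in row] for row in img]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 255 if old >= threshold else 0
            out[y][x] = new
            err = old - new
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    # A flat mid-gray patch standing in for a continuous tone region.
    flat_gray = [[100] * 8 for _ in range(8)]
    for row in binarize(flat_gray):
        print(row)
    print()
    for row in error_diffuse(flat_gray):
        print(row)
```

On a flat gray patch, binarization collapses every pixel to a single level, while error diffusion produces a mix of black and white pixels that approximates the original tone. On an image that is already pure black and white, both functions return the input unchanged.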
In addition, facsimile machines may compress and decompress a rendered bit map representation of documents by Group 3 or Group 4 standards. Examples of Group 3 and Group 4 standards are described in: CCITT, “Recommendation T.4, Standardization of Group 3 facsimile apparatus for document transmission,” Vol. VII-Fascicle VII.3, 21-47; and, CCITT, “Recommendation T.6, Facsimile coding schemes and coding control functions for Group 4 facsimile apparatus,” Vol. VII-Fascicle VII.3, 48-57. However, although these standards compress the data, they do not improve the poor reproduction of documents containing mixed image types.
To improve reproduction quality, digital documents can be segmented into their image components. The resulting segments can then be classified as to image type, and different rendition methods applied to segments based on their type. Many of the proposals for segmenting a document heretofore presented are oriented towards analyzing different information in a mixed document, such as for optical character recognition (OCR) purposes.
These approaches include such methods as recursive X-Y cut (RXYC) and the constrained run-length algorithm (CRLA), which is also referred to as the run length smoothing algorithm (RLSA). The following literature describes RXYC: G. Nagy, S. Seth, and S. D. Stoddard, “Document analysis with an expert system,” Proc. Pattern Recog. in Practice, Amsterdam, Jun. 19-21, 1985, Vol. II; and P. J. Bones, T. C. Griffin, C. M. Carey-Smith, “Segmentation of document images,” SPIE Vol. 1258, Image Communications and Workstations, 78-88, 1990. CRLA is described in: F. M. Wahl, K. Y. Wong, and R. G. Casey, “Block segmentation and text extraction in mixed text/image documents,” Comput. Vision Graphics Image Process., Vol. 20, 375-390, 1982; B. S. Chien, B. S. Jeng, S. W. Sun, G. H. Chang, K. H. Shyu, and C. S. Shih, “A novel block segmentation and processing for Chinese-English document,” SPIE Vol. 1606, Visual Communications and Image Processing '91: Image Processing, 588-598, 1991; T. Pavlidis and J. Zhou, “Page segmentation and classification,” CVGIP: Graphical Models and Image Processing, Vol. 54, No. 6, 484-496, November 1992; and P. Chauvet, J. Lopez-Krahe, E. Taflin, and H. Maitre, “System for an intelligent office document analysis, recognition and description,” Signal Processing, Vol. 32, 161-190, 1993.
RXYC and CRLA both assume that digital documents are aligned and that segments are rectangular. Accordingly, these methods have strong directional preferences, and require additional processing to correct improper document segmentation caused by non-rectangular segments and by skewing of segments from the assumed alignment. Moreover, tilting of image components from their assumed alignment in the document may result in segments having mixed image types. It would therefore be desirable to perform document segmentation which is not subject to the above limitations of document alignment or rectangular segments.
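The directional preference noted above can be seen in a minimal sketch of horizontal run-length smoothing, the core step of RLSA as it is commonly described. This is not taken from the patent text: the function name `smooth_row`, the parameter `max_gap`, and the choice to fill only gaps bounded by black pixels on both sides are assumptions made for illustration.

```python
# Illustrative sketch only: a common formulation of horizontal run-length
# smoothing. Names and the border handling are assumptions.

def smooth_row(row, max_gap):
    """Fill white gaps (0s) of length <= max_gap lying between black pixels (1s)."""
    out = list(row)
    last_black = None
    for x, p in enumerate(row):
        if p == 1:
            if last_black is not None:
                gap = x - last_black - 1
                if 0 < gap <= max_gap:
                    for k in range(last_black + 1, x):
                        out[k] = 1
            last_black = x
    return out


def smooth_horizontal(bitmap, max_gap):
    """Apply the smoothing independently to every row of a binary bitmap."""
    return [smooth_row(row, max_gap) for row in bitmap]


if __name__ == "__main__":
    line = [1, 0, 0, 1, 0, 0, 0, 0, 1]
    print(smooth_row(line, max_gap=2))
```

Because each row is processed independently along one axis, black pixels merge into blocks only when they line up with that axis, which is why skewed or non-rectangular content tends to be segmented improperly.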
Several other segmenting proposals have been oriented towards document rendition, such as performed in facsimile machines, rather than document analysis. Examples of these segmentation proposals are contained in the following publications: Y. Chen, F. C. Mintzer, and K. S. Pennington, “A binary representation of mixed documents (text/graphic/image) that compresses,” ICASSP 86, 537-540, 1986; M. Yoshida, T. Takahashi, T. Semasa, and F. Ono, “Bi-level rendition of images containing text, screened halftone and continuous tone,” Globecom '91, 104-109, 1991; and S. Ohuchi, K. Imao, and W. Yamada, “A segmentation method for composite text/graphics (halftone and continuous tone photographs) documents,” Systems and Computers in Japan, Vol. 24, No. 2, 35-44, 1993.
In Ohuchi et al., a digital document is first subdivided into non-overlapping 4×4 pixel blocks. A block is considered a halftone block if gray level peaks appear in pixels of blocks neighboring the block. A first mask is created for the document by combining the halftone blocks to detect halftone areas. A second mask is then generated by quantizing the pixels of the document into three levels, detecting continuous black and white pixels by pattern matching of a 5×5 pixel block, and activating the block as an edge area once a desired pattern is detected. The two masks determine the classification of pixels. Text areas of the document are based on edge areas of the second mask and the non-halftone areas of the first mask. All areas which are not text are considered graphics. Graphics are halftoned by dithering or error diffusion, and then the document is binarized.
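The final mask combination described for Ohuchi et al. can be sketched as follows. The text says only that text areas are "based on" the edge areas and the non-halftone areas, so treating text as the logical AND of the two conditions is an assumption, as are the function name `classify_from_masks` and the boolean mask representation. How the masks themselves are computed is not shown.

```python
# Illustrative sketch only: combines two precomputed boolean masks.
# Interpreting "text" as (edge AND NOT halftone) is an assumption.

def classify_from_masks(halftone_mask, edge_mask):
    """Label each position "text" or "graphics" from two same-sized masks."""
    labels = []
    for h_row, e_row in zip(halftone_mask, edge_mask):
        labels.append([
            "text" if (is_edge and not is_halftone) else "graphics"
            for is_halftone, is_edge in zip(h_row, e_row)
        ])
    return labels


if __name__ == "__main__":
    halftone = [[False, False, True],
                [False, True, True]]
    edge = [[True, False, True],
            [True, True, False]]
    for row in classify_from_masks(halftone, edge):
        print(row)
```

Everything not labeled text falls through to graphics, matching the statement that all non-text areas are considered graphics.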
In Chen et al., a digital document is first subdivided into non-overlapping 4×4 pixel blocks. Each block is classified as text or image as follows: Two sets of four pixels are selected from a block. If any of the four pixels in each set has a gray level value above a white threshold, the block is text. If two selected pixels from each set are below a black threshold, the block is also text. Blocks not classified as text are classified as image. Runs of horizontal image blocks shorter than 12 blocks are reclassified as text blocks. Pixels in text blocks are binarized into a first bit map, and pixels in image blocks are halftoned by error diffusion.
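The run-length reclassification step described for Chen et al. can be sketched as follows. Only the 12-block limit comes from the text above; the function name `reclassify_short_image_runs`, the string labels, and the per-row list representation are assumptions made for illustration. The initial text/image classification of each block is taken as given, since the text does not say which pixels are selected.

```python
# Illustrative sketch only: operates on one horizontal row of block labels.
# Only the 12-block limit is taken from the description.

def reclassify_short_image_runs(block_row, min_run=12):
    """Relabel horizontal runs of "image" blocks shorter than min_run as "text"."""
    out = list(block_row)
    n = len(block_row)
    x = 0
    while x < n:
        if block_row[x] == "image":
            start = x
            while x < n and block_row[x] == "image":
                x += 1
            if x - start < min_run:
                for k in range(start, x):
                    out[k] = "text"
        else:
            x += 1
    return out


if __name__ == "__main__":
    row = ["text"] * 2 + ["image"] * 5 + ["text"] + ["image"] * 12
    print(reclassify_short_image_runs(row))
```

In the example row, the isolated run of 5 image blocks is relabeled as text, while the run of 12 image blocks is kept because it is not shorter than the limit.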
Further, in Yoshida et al., a digital document is segmented by first classifying each pixel as a screened or unscreened halftone pixel. The middle pixel of a 5×3 pixel block is classified by binarizing the pixels in the block based upon a threshold value of the average of the central 3×3 pixels, counting the number of transitions in both horizontal and ve
Fung Hei Tao
Parker Kevin J.
Johnson Timothy M.
Lukacher Kenneth J.
The University of Rochester