Image analysis – Pattern recognition – Context analysis or word recognition
Reexamination Certificate
1998-07-09
2001-06-26
Au, Amelia M. (Department: 2623)
Image analysis
Pattern recognition
Context analysis or word recognition
C382S226000, C382S161000, C707S793000
Reexamination Certificate
active
06252988
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to document image processing, and more particularly, to recognizing and enhancing the images from an image source, for example, a printed document.
BACKGROUND OF THE INVENTION
A fundamental problem in the art of automatic document image processing relates to image defects, that is, imperfections in the image as compared to the original ideal artwork used to create the image. The sources of image defects are numerous and well-known. For example, the original printed document (e.g., paper document) which was the source of the image may be defective (e.g., the paper has spots of dirt, folds, or was printed from a faulty printing device.) Further, when the paper document was scanned, the paper may have been skewed while being placed in the scanner, resulting in a distortion of the image. In addition, the optics of the scanning process itself can produce defects due to, for example, vibration, pixel sensor sensitivity or noise.
The above-mentioned image defects result in poor display quality of the image and are a particular problem in document image processing because of the character recognition accuracy required in the automatic processing of documents. For example, optical character recognition (“OCR”) is often an integral part of an image processing system. OCR is the process of transforming a graphical bit image of a page of textual information into a text file which can be later edited, for example, using word processing software. As is well-known in the art, image classifiers are key components of most OCR systems used for analyzing a digital representation of an image. The accuracy of such classifiers significantly decreases when the quality of the image source is degraded even slightly.
Training classifiers to recognize images having a wide range of shape variations and/or image degradations is a well-known challenge in OCR. One technique, the so-called adaptive OCR strategy, trains the classifier only for the fonts and degradation conditions which are present in a given image, e.g., a printed text page. Thus, this adaptive OCR strategy requires some knowledge of the dominant font and defects in the given image. Some previously known adaptive OCR techniques represent such knowledge implicitly through character prototypes extracted directly from the image. For example, G. Nagy et al., “Automatic Prototype Extraction for Adaptive OCR”, Proceedings of the Fourth International Conference on Document Analysis and Recognition, Ulm, Germany, Aug. 18-20, 1997, pp. 278-282 (hereinafter “Nagy”), and A. L. Spitz, “An OCR Based on Character Shape Codes and Lexical Information”, Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, Canada, Aug. 14-18, 1995, pp. 723-728, describe two such character prototyping techniques. Nagy's character prototype technique employs truth labels, the so-called “ground truth”, as input, which are derived from a small segment of the actual image to be recognized. The ground truth selected from the image, e.g., text, in accordance with Nagy's technique is actually keyed in to the system by a user. Using the ground truth, a matching occurs between pairs of words from the image and the ground truth to determine matching characters and to estimate the position of each character within each word (see, e.g., Nagy, supra, p. 278).
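By way of illustration only, the sketch below shows the kind of word-to-ground-truth pairing described above. The matching criterion used here (pairing words in reading order by character count) and the function names are assumptions introduced for the example; they are not Nagy's actual procedure.

def pair_words_with_ground_truth(word_images, ground_truth_words):
    """word_images: list of (image, estimated_char_count) tuples in reading order.
    ground_truth_words: the user-keyed transcription, split into words.
    Returns (image, word) pairs whose character counts agree; such pairs would
    then drive per-character position estimation within each word."""
    pairs = []
    for (image, n_chars), word in zip(word_images, ground_truth_words):
        if n_chars == len(word):  # crude matching criterion (an assumption)
            pairs.append((image, word))
    return pairs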
While the above-described adaptive OCR techniques are useful in character recognition, the reliance on ground truth, and the derivation of such ground truth directly from the image to be recognized, present certain disadvantages. In particular, prior to any classification of the image, the ground truth must be selected, processed and input into the OCR system for each image to be recognized. Thus, certain preprocessing overhead is inherently associated with these ground-truth-based adaptive OCR techniques.
Therefore, a need exists for an adaptive OCR technique for character recognition without reliance on ground truth derived from the image itself and provided as input to the OCR system prior to classification and recognition.
SUMMARY OF THE INVENTION
The present invention provides an adaptive OCR technique for character classification and recognition without the input and use of ground truth derived from the image itself. In accordance with the invention, a set of so-called stop words is employed for classifying symbols, e.g., characters, from any image. The stop words are identified independent of any particular image and are used for classification purposes across any set of images of the same language, e.g., English. Advantageously, in accordance with the invention, an adaptive OCR method is realized without requiring the selection and input of ground truth from each individual image to be recognized.
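As a minimal sketch only: the preferred embodiment described below constructs the stop word classifier as a function of a decision forest; in the example that follows, a generic random forest from scikit-learn stands in for that classifier, and the simple word-shape features are assumptions introduced for illustration rather than the feature set of the invention.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Language-level stop words, chosen once per language and independent of any
# particular image (this specific list is only an example).
STOP_WORDS = ["the", "of", "and", "to", "in", "that", "is", "for"]

def word_shape_features(word_image):
    """Crude shape features for a binary word image (2-D numpy array of 0/1):
    aspect ratio, ink density, and column-profile statistics."""
    h, w = word_image.shape
    col_profile = word_image.sum(axis=0)
    return [w / h, word_image.sum() / (h * w), col_profile.mean(), col_profile.std()]

def train_stop_word_classifier(labeled_samples):
    """labeled_samples: (word_image, stop_word_label) pairs, e.g. collected from
    synthetic renderings of the stop words; no ground truth is taken from the
    image that is to be recognized."""
    X = np.array([word_shape_features(img) for img, _ in labeled_samples])
    y = np.array([label for _, label in labeled_samples])
    return RandomForestClassifier(n_estimators=50).fit(X, y)

def recognize_stop_words(classifier, page_word_images):
    """Classify each word image on the page; words matched to a stop word form
    the set of recognized words used in the later extraction steps."""
    X = np.array([word_shape_features(img) for img in page_word_images])
    return classifier.predict(X)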
More particularly, in accordance with the preferred embodiment of the invention, adaptive image recognition is initiated by comparing the image, e.g., a text page image, to the set of stop words to determine a matching and the identification of a set of recognized words. In accordance with the preferred embodiment of the invention, the classification between the stop words and the image is facilitated by a stop word classifier constructed as a function of a decision forest. The set of recognized words are then aligned for the extraction of character prototypes. In accordance with the preferred embodiment, the extraction of character prototypes comprises four steps: (1) character width estimation; (2) word shifting; (3) common character extraction; and (4) bitmap averaging. After obtaining the character prototypes from the extraction operations of the preferred embodiment of the invention, a recursive segmentation operation is applied to completely segment the recognized words. The character prototypes obtained as a function of the stop words are then used to train a classifier for use by an OCR system for recognizing the subject image.
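The following is a minimal sketch, under assumptions introduced here (uniform character widths within a word, hypothetical helper names), of the four extraction steps enumerated above; it is not the precise procedure of the preferred embodiment.

import numpy as np

def estimate_char_width(word_image, text):
    """Step 1 (character width estimation): assume roughly uniform character
    widths and divide the word width by the number of characters."""
    return word_image.shape[1] / len(text)

def shift_to_char(word_image, index, char_width):
    """Step 2 (word shifting): shift/crop the word image so that the character
    at `index` occupies the leading columns."""
    start = int(round(index * char_width))
    end = int(round((index + 1) * char_width))
    return word_image[:, start:end]

def extract_common_characters(recognized_words):
    """Step 3 (common character extraction): gather every occurrence of each
    character across the set of recognized (stop) words."""
    occurrences = {}
    for word_image, text in recognized_words:
        width = estimate_char_width(word_image, text)
        for i, ch in enumerate(text):
            occurrences.setdefault(ch, []).append(shift_to_char(word_image, i, width))
    return occurrences

def average_bitmaps(crops):
    """Step 4 (bitmap averaging): average the aligned crops into a single
    grey-level character prototype."""
    h = min(c.shape[0] for c in crops)
    w = min(c.shape[1] for c in crops)
    return np.mean([c[:h, :w].astype(float) for c in crops], axis=0)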
REFERENCES:
patent: 4975975 (1990-12-01), Filipski
patent: 5375176 (1994-12-01), Spitz
patent: 5513304 (1996-04-01), Spitz et al.
patent: 5638543 (1997-06-01), Pederson et al.
patent: 5825925 (1998-10-01), Baird et al.
patent: 5862259 (1999-01-01), Bokser et al.
patent: 5909510 (1999-06-01), Nakayama et al.
Lebourgeois et al, “An Evolutive OCR System Based on Continuous Learning”; IEEE Proceedings on Application of Computer Vision, ISBN: 0-8186-7620-5; pp. 272-277, Dec. 1996.*
G. Nagy, “At the Frontiers of OCR”; IEEE Proceedings, ISSN: 0018-9219; vol. 80, Issue 7, pp. 1093-1100, Jul. 1992.*
George W. Hart, “To Decode Short Cryptograms”; Communications of the ACM, vol. 37, No. 9, pp. 102-107, Sep. 1994.*
G. Nagy, Y. Xu, “Automatic Prototype Extraction For Adaptive OCR,” Proceedings of the Fourth International Conference on Document Analysis and Recognition, Ulm, Germany, Aug. 18-20, 1997, pp. 278-282.
A. L. Spitz, “An OCR Based On Character Shape Codes and Lexical Information,” Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, Canada, Aug. 14-18, 1995, pp. 723-728.
G. Nagy, Y. Xu, “Priming the Recognizer,” Proceedings of the IAPR Workshop on Document Analysis Systems, Malvern, PA, Oct. 14-16, 1996, pp. 263-281.
T. Hong, “Integration Of Visual Inter-Word Constraints and Linguistic Knowledge In Degraded Text Recognition,” Proceedings of 32nd Annual Meeting of Association for Computational Linguistics, student session, Las Cruces, New Mexico, Jun. 1994, pp. 328-330.
T. Hong, J. J. Hull, “Improving OCR Performance With Word Image Equivalence,” Fourth Annual Symposium on Document Analysis and Information Retrieval (DAIR95), Las Vegas, Nevada, Apr. 1995, pp. 1-21.
G. Nagy, Y. Xu, “Bayesian Subsequence Matching And Segmentation,” Elsevier Science, Nov. 4, 1997, pp. 1-8.
G. E. Kopec, M. Lomelin, “Document-Specific Character Template Estimation,” Proceedings of SPIE, vol. 2660, San Jose, California, 1996, pp. 14-26.
F.R. Chen et al., “Extraction of Thematically Relevant Text,” Proc. of the 5th Ann. Symp. on Document Analysis and Information Retrieval.
Au Amelia M.
Dastouri Mehrdad
Dinella Donald P.
Lucent Technologies Inc.