Classification-driven thresholding of a normalized grayscale...

Image analysis – Pattern recognition – Template matching

Reexamination Certificate

Details

C382S217000, C382S220000, C382S229000

active

06266445

ABSTRACT:

TECHNICAL FIELD
The present invention relates generally to image recognition, and more particularly, to a method and system for optical character recognition by classification-driven thresholding of a normalized grayscale image.
BACKGROUND ART
In the art of optical character recognition, an image classifier is a functional unit that attempts to match sample images against a set of referent images or templates. Although most character images are sampled in grayscale, which results in multiple data bits per image pixel, image classifiers are generally limited to binary (bi-level) input data. Analyzing grayscale data is substantially more complicated, and requires time-consuming, sophisticated techniques. Thus, although some grayscale classifiers exist, most readily-available image classifiers accept only binary input data. A variety of binary classifiers for optical character recognition are known in the art, such as the system described in U.S. Pat. No. 5,539,840 to Krtolica et al. for “Multifont Optical Character Recognition Using a Box Connectivity Approach,” which is incorporated herein by reference.
Because the sample image comprises grayscale data, but the image classifier accepts only binary data, the sample image must first be converted from grayscale into black and white. This step normally requires a process called thresholding or binarization, which includes selecting an intermediate gray level (usually called a “binarization threshold” or “threshold”) and changing the value of each image pixel to either zero or one, depending on whether the original gray level of the pixel was greater or less than the threshold. In conventional systems, binarization of the sample image is generally performed once, using a single threshold, after which the binary output data is provided to the image classifier.
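The single-threshold binarization described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `binarize`, the sample values, and the convention that a pixel at or above the threshold becomes foreground ("1") are all assumptions made for the example.

```python
def binarize(image, threshold):
    """Map each gray level to 1 (foreground) or 0 (background).

    Assumed convention: a pixel whose gray level is at or above the
    threshold is treated as foreground.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Hypothetical 2x3 grayscale sample, 8 bits per pixel.
gray = [
    [12, 200, 130],
    [90, 128, 127],
]

binary = binarize(gray, 128)
print(binary)  # [[0, 1, 1], [0, 1, 0]]
```

Once this step has run, the classifier sees only the zeros and ones; the original gray levels are no longer available to it.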
As conventionally implemented, however, thresholding often dramatically reduces recognition accuracy. When an image is thresholded, much useful information about the image is lost. For example, an eight-bit grayscale image contains eight times as much data as the same image after thresholding, since each pixel is reduced from eight bits to one. Such data assist the human eye in recognizing the image, but are lost to conventional image recognition systems because of thresholding.
In addition, thresholding introduces harmful noise into the image. Slight deviations in the image's gray levels are often manifest after thresholding in the form of jagged edges, stray pixels, gaps, and other artifacts that reduce recognition accuracy. Moreover, after thresholding, the sample image is typically normalized to the size of the referent images. However, normalizing binary data generally compounds the noise, reducing recognition accuracy to an even greater degree. What is needed, then, is a method and system for providing binary data to a binary image classifier while retaining as much information as possible about the original grayscale image and reducing the noise associated with the processes of thresholding and normalization.
As noted earlier, in conventional systems, thresholding is normally performed as a separate step from image classification. Thus, in such systems, thresholding is merely a simplification or quantizing step. However, as shown in FIG. 1, thresholding is central to classification and is not so easily separable therefrom. For example, matrix (a) of FIG. 1 represents a grayscale image sampled at eight bits (256 gray levels) per pixel. If the binarization threshold (“T”) is selected to be 128, matrix (b) illustrates the resulting binary image, which would be interpreted by a binary image classifier as the letter “U.” If, however, the threshold is selected to be 140, matrix (c) illustrates the resulting binary image, which would be interpreted to be the letter “L.” Both interpretations are valid. However, in each case, the selection of the binarization threshold determines which pixels are in the foreground (“1”) and which pixels are in the background (“0”). Thus, the thresholding step effectively determines the classification of the image.
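The effect can be reproduced with a small made-up image. The matrix below is not the one in FIG. 1 (which is not reproduced in this text); it is a hypothetical 5x5 image whose right-hand stroke has gray level 135, between the two thresholds 128 and 140. The helper names and the convention that a pixel at or above the threshold is foreground are likewise assumptions.

```python
def binarize(image, threshold):
    """Assumed convention: gray level >= threshold means foreground (1)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def render(binary):
    """Draw foreground as '#' and background as '.' for easy reading."""
    return "\n".join("".join("#" if b else "." for b in row) for row in binary)


# Hypothetical image: strong left stroke and base (200), faint right
# stroke (135), dark background (50).
GRAY = [
    [200,  50,  50,  50, 135],
    [200,  50,  50,  50, 135],
    [200,  50,  50,  50, 135],
    [200,  50,  50,  50, 135],
    [200, 200, 200, 200, 200],
]

as_u = binarize(GRAY, 128)  # faint stroke survives: reads as "U"
as_l = binarize(GRAY, 140)  # faint stroke is dropped: reads as "L"

print(render(as_u))
print()
print(render(as_l))
```

The two printouts differ only in the right-hand column, yet a template matcher would assign them to different characters.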
The foregoing situation often occurs where there is poor contrast between the foreground and background, and where the foreground or background gray levels are not uniform throughout the sampled image. The human eye can easily compensate for these anomalies. However, a conventional image recognition system that separately thresholds the image before classification will frequently produce inaccurate results. Indeed, as shown above, an arbitrary selection of either threshold will often eliminate valid, and possibly correct, interpretations of the character image.
Conventionally, a binary image classifier cannot detect such alternative interpretations based on different thresholds since the thresholding step is performed separately from classification. If thresholding could be performed with a foreknowledge of the referent images, then a number of possible interpretations of the sample image, based on different thresholds, could be determined. Moreover, only those interpretations having an acceptable “distance” from the binarized sample image could be selected.
What is needed, then, is a method and system for integrating the thresholding and classification steps such that a number of interpretations of the image are found using different thresholds. Moreover, what is needed is a method and system for selecting an interpretation wherein the distance between the binarized sample and the referent image is minimized. Hereafter, this process is called “classification-driven thresholding.” What is also needed is a method and system for performing classification-driven thresholding in an efficient manner, without having to resort to exhaustive comparison of all possible thresholded images with the set of referent images. Finally, what is needed is a method and system for disambiguating a candidate set by selecting a preferred interpretation of the character image.
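To make the idea concrete, the sketch below finds a candidate set by brute force: it binarizes the sample at each threshold that can change the result, measures the distance to every referent, and keeps the referents that come within an acceptable distance, together with the threshold that minimized that distance. This is exactly the exhaustive comparison the text says an efficient method should avoid, so it should be read as a baseline for understanding the goal, not as the patented technique. The Hamming distance, the names `candidate_set` and `max_distance`, the templates, and the sample image are all assumptions.

```python
def parse(rows):
    """Turn rows of '#'/'.' into a binary matrix (1 = foreground)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def binarize(image, threshold):
    """Assumed convention: gray level >= threshold means foreground (1)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def hamming(a, b):
    """Count the pixels at which two equal-sized binary images differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def candidate_set(gray, referents, max_distance):
    """Return {label: (distance, threshold)} for each referent that lies
    within max_distance of at least one binarization of the sample."""
    # Thresholds drawn from the gray levels present in the image cover
    # every distinct binarization except the all-background one.
    thresholds = sorted({p for row in gray for p in row})
    best = {}
    for t in thresholds:
        binary = binarize(gray, t)
        for label, template in referents.items():
            d = hamming(binary, template)
            if d <= max_distance and (label not in best or d < best[label][0]):
                best[label] = (d, t)
    return best


# Hypothetical referent templates and sample image.
REFERENTS = {
    "U": parse(["#...#", "#...#", "#...#", "#...#", "#####"]),
    "L": parse(["#....", "#....", "#....", "#....", "#####"]),
    "I": parse(["..#..", "..#..", "..#..", "..#..", "..#.."]),
}
GRAY = [
    [200,  50,  50,  50, 135],
    [200,  50,  50,  50, 135],
    [200,  50,  50,  50, 135],
    [200,  50,  50,  50, 135],
    [200, 200, 200, 200, 200],
]

print(candidate_set(GRAY, REFERENTS, max_distance=2))
# {'U': (0, 135), 'L': (0, 200)}
```

Both "U" and "L" survive as candidates, each at its own threshold, while "I" is rejected at every threshold. Choosing between the surviving candidates is the separate disambiguation step named above.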
DISCLOSURE OF INVENTION
The present invention addresses the aforementioned problems of conventional image recognition systems by providing a method and system for image recognition by classification-driven thresholding of a normalized grayscale image. In accordance with the present invention, a sample image (142) is recognized by normalizing (404) the size of the sample image (142) to the size of the referent images (146); and determining (406) a set of candidate images (147) from the set of referent images (146), wherein each of the candidate images (147) is within an acceptable distance from a different binarization (145) of the sample image (142).
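The normalizing step operates on the grayscale sample, before any threshold is applied. A minimal sketch of such a resize follows, assuming nearest-neighbor resampling; this section does not say which resampling method the invention uses, so the method, the function name `normalize`, and the sample values are illustrative assumptions only.

```python
def normalize(gray, out_h, out_w):
    """Resize a grayscale image to out_h x out_w by nearest-neighbor
    sampling (an assumed method, chosen for brevity). Gray levels are
    copied unchanged, so thresholding can still be deferred."""
    in_h, in_w = len(gray), len(gray[0])
    return [
        [gray[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical 4x4 sample reduced to a 2x2 referent size.
sample = [
    [10,  20,  30,  40],
    [50,  60,  70,  80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

print(normalize(sample, 2, 2))  # [[10, 30], [90, 110]]
```

Because the output still holds multi-bit gray levels, every candidate threshold remains available after resizing, which would not be the case if the image had been binarized first.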
In accordance with the present invention, a system (120) for image recognition includes a scanning device (126), a normalization unit (134), a distance calculation unit (136), a classification unit (138), a disambiguation unit (140), and a display device (128).

