Reexamination Certificate
1997-02-26
2001-02-13
Coles, Edward L. (Department: 2722)
Image analysis
Pattern recognition
Feature extraction
C382S177000, C382S185000, C382S217000, C382S274000
active
06188790
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method and apparatus for reading and recognizing characters, printed or handwritten, on a sheet of paper.
2. Description of the Related Art
Japanese Unexamined Patent Publication Nos. 64-78395 and 5-108882 disclose apparatuses for reading and recognizing characters printed or handwritten on a sheet of paper. To execute character recognition, image data of a character read by a CCD (Charge Coupled Device; a solid-state image sensing element) in a reading section is converted to binary image data by a binarization section, and a character area is extracted from this binary image. The extracted character area is then segmented into mesh areas in a matrix form (e.g., 8×8).
For each mesh area, the ratio of the area of black pixels to the area of the mesh area, or the density, is acquired. The density distribution of mesh areas in a character area represents the characteristics of the character pattern. Character recognition is performed by comparing the density distribution for a character area with the density distributions of character patterns in a previously prepared dictionary based on the characteristics.
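The mesh-density comparison described above can be sketched as follows. This is a minimal illustration, not the method of the cited publications: the function names, the list-of-lists image representation (1 for black, 0 for white), and the sum-of-squared-differences distance are all assumptions introduced here, since the text says only that density distributions are compared.

```python
def mesh_densities(image, rows=8, cols=8):
    """Return a rows x cols grid of black-pixel ratios for a binary image.

    image is a list of equal-length rows, with 1 for black and 0 for white.
    """
    height, width = len(image), len(image[0])
    grid = []
    for r in range(rows):
        # Integer boundaries so that the mesh areas tile the image exactly.
        y0, y1 = r * height // rows, (r + 1) * height // rows
        row = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            area = (y1 - y0) * (x1 - x0)
            black = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            # Density = black-pixel area divided by mesh area.
            row.append(black / area if area else 0.0)
        grid.append(row)
    return grid


def distance(a, b):
    """Sum of squared differences between two density grids (assumed metric)."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def recognize(image, dictionary, rows=8, cols=8):
    """Return the label of the dictionary pattern closest to the input image."""
    features = mesh_densities(image, rows, cols)
    return min(dictionary, key=lambda label: distance(features, dictionary[label]))
```

Here `dictionary` is a hypothetical mapping from a character label to a density grid prepared in advance with the same mesh size.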
Handwritten characters and printed characters vary in size and shape even if they are the same character. To facilitate comparison of a handwritten or printed character with character patterns in the dictionary, therefore, size and outline shaping processing, or normalization, is performed.
Normalization has been accomplished mainly in two ways. One is to normalize a rectangle circumscribing a character and the other is to normalize a square circumscribing a character. As shown in FIG. 9, the former method normalizes a character by forming a circumscribing rectangle F with respect to the character pattern L of a binary image and converting the circumscribing rectangle F to a specified area S of a predetermined square size. According to this method, even if the input characters differ in size and shape, their character patterns L, after normalization, become substantially the same in size and shape as shown at the lower portions in FIGS. 9(a) and 9(b), thus ensuring a constant aspect ratio (the ratio of the vertical size to the horizontal size). This scheme is advantageous in that the number of character patterns in the dictionary can be reduced.
In the case of a vertically elongated character such as “1” and a horizontally elongated character such as “−” as shown in FIGS. 9(c) and 9(d), however, the normalization based on a circumscribing rectangle fills most of the specified area S with black pixels as shown in the lower portions in those drawings. This makes character recognition difficult.
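The circumscribing-rectangle normalization and its weakness for elongated characters can be illustrated with a short sketch. The function names and the nearest-neighbour resampling are assumptions made for illustration; the text does not state how the rectangle F is converted to the specified area S. The input is assumed to contain at least one black pixel.

```python
def bounding_box(image):
    """Return (top, bottom, left, right) of the black pixels.

    bottom and right are exclusive. Assumes at least one black pixel.
    """
    ys = [y for y, row in enumerate(image) if any(row)]
    xs = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return ys[0], ys[-1] + 1, xs[0], xs[-1] + 1


def crop(image, box):
    """Cut the circumscribing rectangle out of the image."""
    top, bottom, left, right = box
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image, size):
    """Stretch an image to size x size by nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def normalize_rect(image, size=8):
    """Map the circumscribing rectangle onto a size x size specified area."""
    return resize_nearest(crop(image, bounding_box(image)), size)


# A thin vertical stroke, similar to the character "1".
stroke = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
normalized = normalize_rect(stroke, size=4)
```

Because the circumscribing rectangle of the stroke is one pixel wide, stretching it to the square fills every pixel of `normalized` with black, which is the failure described for FIGS. 9(c) and 9(d).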
By contrast, as shown in FIG. 10, the other method, which is based on a circumscribing square, normalizes a character by forming a circumscribing square “A” with respect to the character pattern L of a binary image and converting the circumscribing square “A” to a specified area S. Space, or white pixel area, is added to the sides of the character or above and below the character, depending on whether the character is elongated vertically or horizontally. This method therefore overcomes the aforementioned shortcoming of the normalization that is based on a circumscribing rectangle.
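The addition of white space in the circumscribing-square method can be sketched as below. The sketch takes a character pattern that has already been cropped to its circumscribing rectangle; the function name and the choice to centre the pattern in the square are assumptions, as the text says only that space is added at the sides or above and below.

```python
def pad_to_square(pattern):
    """Place a cropped binary pattern in a white square of side max(h, w).

    A vertically elongated pattern gains white columns at its sides;
    a horizontally elongated pattern gains white rows above and below.
    Centring is an assumption made for this sketch.
    """
    h, w = len(pattern), len(pattern[0])
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    square = [[0] * side for _ in range(side)]
    for y, row in enumerate(pattern):
        # Copy the pattern row into the white square at the offset position.
        square[top + y][left:left + w] = row
    return square
```

The padded square can then be scaled to the specified area S. The stroke keeps its thin shape instead of filling the area with black pixels.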
When a character like “7” or “9” is handwritten in a significantly elongated vertical form, however, adding space to the lateral sides of the character at the time of normalization may yield a deformed character that is difficult to recognize. The same applies to characters that are significantly horizontally elongated. Since the shape of a read character is directly reflected in the character pattern L after normalization in the circumscribing-square based normalization scheme, as shown at the lower portions in FIGS. 10(a) and 10(b), normalized character patterns L come in various sizes and shapes that cannot be classified. This necessitates the preparation of many character patterns in the dictionary.
A solution to this problem was proposed in, for example, Japanese Unexamined Patent Publication No. 5-108882, which discloses an apparatus for extracting a character area in the form of a circumscribing rectangle with respect to the character pattern of a binary image and changing the number of segments of the character area in accordance with the aspect ratio of the character area. With regard to vertically elongated characters, this method increases the number of segments in the vertical direction and reduces it in the horizontal direction. For horizontally elongated characters, the number of segments is reduced vertically and is increased horizontally.
With this apparatus, even a character written significantly long in either direction is normalized without being deformed, and it is easily recognized. This character recognition apparatus, however, needs a dedicated dictionary for each way of segmentation and thus suffers from an increased number of dictionaries and an increased number of character patterns to be stored in the dictionaries.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the present invention to provide a character recognition method and apparatus capable of substantially grouping character patterns after normalization irrespective of differences in size and shape between characters such as those apparent between vertically elongated characters and horizontally elongated characters, thus ensuring a reduction in the number of character patterns to be stored in a dictionary.
To achieve this object, this invention provides a novel method and apparatus for recognizing characters read by a reading unit. The circumscribing rectangle of a read character is formed and the degree of flatness of that circumscribing rectangle is acquired. A character whose degree of flatness is equal to or greater than a predetermined value is selected, and a blank area is added to the circumscribing rectangle of the selected character to yield a character area with a corrected degree of flatness. The character is normalized by converting the character area to a specified size and is recognized based on the normalized character. It is therefore possible to normalize even characters significantly elongated vertically or horizontally for easier recognition and to group their character patterns.
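The flatness correction summarized above might be sketched as follows. This section does not define the degree of flatness or give numeric values, so several points are assumptions introduced purely for illustration: flatness is taken as the ratio of the longer side to the shorter side of the circumscribing rectangle, `threshold` and `target` are hypothetical parameters (with `threshold` no smaller than `target`), and the blank area is split evenly on both sides of the pattern.

```python
import math


def flatness(height, width):
    """Assumed definition: longer side divided by shorter side."""
    return max(height, width) / min(height, width)


def correct_flatness(pattern, threshold=3.0, target=2.0):
    """Add blank area to an overly elongated circumscribing rectangle.

    pattern is the character cropped to its circumscribing rectangle
    (1 for black, 0 for white). Patterns below the threshold are returned
    unchanged. threshold and target are hypothetical values, and the sketch
    assumes threshold >= target.
    """
    h, w = len(pattern), len(pattern[0])
    if flatness(h, w) < threshold:
        return [row[:] for row in pattern]
    if h >= w:
        # Vertically elongated: widen the area with white columns.
        new_w = math.ceil(h / target)
        left = (new_w - w) // 2
        right = new_w - w - left
        return [[0] * left + row + [0] * right for row in pattern]
    # Horizontally elongated: heighten the area with white rows.
    new_h = math.ceil(w / target)
    top = (new_h - h) // 2
    bottom = new_h - h - top
    blank = [0] * w
    return ([blank[:] for _ in range(top)]
            + [row[:] for row in pattern]
            + [blank[:] for _ in range(bottom)])
```

The corrected character area, rather than the raw circumscribing rectangle, would then be converted to the specified size. Characters of ordinary proportions take the rectangle-based path unchanged, while strongly elongated ones are padded so that they are not stretched to fill the whole area.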
REFERENCES:
patent: 3605093 (1971-09-01), Parks et al.
patent: 3810093 (1974-05-01), Yasuda et al.
patent: 4259661 (1981-03-01), Todd
patent: 4379283 (1983-04-01), Ito et al.
patent: 5151951 (1992-09-01), Ueda et al.
patent: 5197107 (1993-03-01), Katsuyama et al.
patent: 5293256 (1994-03-01), Fukushima et al.
patent: 5621818 (1997-04-01), Tashiro
patent: 1-78395 (1989-03-01), None
patent: 4-177485 (1992-06-01), None
patent: 4-260987 (1992-09-01), None
patent: 5-108882 (1993-04-01), None
patent: 6-44409 (1994-02-01), None
Horii Hiroshi
Kawajiri Hiromitsu
Tanaka Junji
Yoshikawa Takatoshi
Coles Edward L.
Rosenblum David
Ross P.C. Sheridan
Tottori Sanyo Electric Ltd.