Image analysis – Pattern recognition – Classification
Reexamination Certificate
2000-06-09
2004-05-18
Dastouri, Mehrdad (Department: 2721)
C382S218000, C382S194000, C382S178000
active
06738519
ABSTRACT:
CROSS REFERENCE TO RELATED APPLICATIONS
The present invention claims priority from Japanese Patent Application No. 11-165358 filed Jun. 11, 1999, the contents of which are incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a character recognition apparatus for automatically reading a character string from input image data. Specifically, the present invention relates to an apparatus for reading items such as a residence address, name, or product number written on mail or a document, and to a pen-input apparatus for entering a character string with a stylus pen.
2. Description of Related Art
A conventional character recognition apparatus will first be explained with reference to FIGS. 9 and 10. FIG. 9 is a block diagram of the essential parts of the conventional character recognition apparatus. FIG. 10 is a view for explaining how a character is recognized by the conventional technique. The description here concerns, in particular, the conventional technique for a character recognition apparatus that aims to read a character string made up of a plurality of characters, i.e., a word or words.
As shown in FIG. 9, consider reading a character string made up of a plurality of characters, such as a word in a residence address or name, or a product code, using a character segmentation part 301 serving as character segmentation means for segmenting character by character and a single-character recognizing part 401 serving as character recognizing means for recognizing character by character. It is unlikely that all of the character patterns segmented at the character segmentation part 301 are precisely recognized at the single-character recognizing part 401. Accordingly, a “word” dictionary 602 is provided, serving as word data storing means for storing the words to be recognized. “Word-wise” recognition performance can then be improved by having a word verifying part 601 search the word dictionary 602 for the word that has the largest number of characters matching the recognition result, even if some characters of the recognized character string are not correctly recognized at the single-character recognizing part 401. Examples of character recognition apparatus having such a constitution are disclosed in Japanese Patent Application Laid-Open No. HEI-2-109187 titled “Post-Processing Method of Closely Written Addresses” (hereinafter called “reference 1”), and Japanese Patent Application Laid-Open No. HEI-5-114053 titled “Post-Character-Recognition Processing Method” (hereinafter called “reference 2”).
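The dictionary search described above can be illustrated with a minimal sketch. It is not the implementation of reference 1 or reference 2: the function names, the "?" marker for an unrecognized character, the position-by-position comparison, and the sample dictionary are all assumptions made for illustration.

```python
def count_matches(recognized: str, word: str, unknown: str = "?") -> int:
    """Count positions where a recognized character equals the dictionary character.

    Assumption: strings of different length are treated as non-matching (score 0).
    """
    if len(recognized) != len(word):
        return 0
    return sum(1 for r, w in zip(recognized, word) if r != unknown and r == w)


def verify_word(recognized: str, dictionary: list[str]) -> tuple[list[str], int]:
    """Return every dictionary word with the highest match count, plus that count."""
    scores = {word: count_matches(recognized, word) for word in dictionary}
    best = max(scores.values())
    return [word for word, score in scores.items() if score == best], best


# Hypothetical dictionary and recognition results, for illustration only.
dictionary = ["TOKYO", "KYOTO", "OSAKA", "GATE", "DATE"]
unique_result = verify_word("T?KYO", dictionary)  # one unrecognized character
tied_result = verify_word("?ATE", dictionary)     # unrecognized character is the only difference
```

In this sketch "T?KYO" resolves to a single dictionary word, whereas "?ATE" matches "GATE" and "DATE" equally well, because the only character that tells them apart is the one that was not recognized.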
However, in verification-based methods such as those disclosed in reference 1 and reference 2, erroneous correction occurs more often during verification as inter-character contact increases in the character string to be recognized, or as the number of words to be recognized increases. This is because the character segmentation part 301 does not hypothesize in advance the number of characters to be segmented. In particular, when the original character string includes many inter-character contacts, many candidates for the word to be recognized are enumerated from the recognition results for the character candidate patterns obtained by segmentation, which makes it difficult to narrow the candidates down to the correct answer.
In contrast, as shown in F. Kimura et al., “A Lexicon Directed Algorithm for Recognition of Unconstrained Handwritten Words”, IEICE Trans. INF. & SYST., Vol. E77-D, No. 7 (1994.7) (hereinafter called “reference 3”), there is a recognition method in which several words are hypothesized in advance for the original word to be recognized; character candidate patterns are generated for each hypothetical word by segmenting characters from the original word based on the number of characters in that hypothetical word; and character recognition is performed individually on each character candidate pattern. How close the recognition result is to the hypothetical word is then decided from the magnitude of a word certainty level, expressed as a sum or product of the reliability levels of the individual character recognition for each of the character candidate patterns. However, even this method has a defect: when there exist, for the word to be recognized, two similar hypothetical words that differ from each other in a single character, these hypothetical words may not be distinguished from each other. This is because the decision is based on an evaluation value for the whole character string, so that low certainty levels occurring within the recognition result for the character string are not suitably taken into account.
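The word certainty level described above, and the difficulty with similar hypothetical words, can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of reference 3: the function name, the two hypothetical words, and all reliability values are invented for the example, and the "weakest character" value is added here only to show what a whole-word score hides.

```python
import math


def word_certainty(reliabilities: list[float], mode: str = "sum") -> float:
    """Combine per-character reliability levels into one word certainty level."""
    if mode == "sum":
        return sum(reliabilities)
    if mode == "product":
        return math.prod(reliabilities)
    raise ValueError("mode must be 'sum' or 'product'")


# Hypothetical reliability levels for two hypothetical words that differ
# only in their second character (values invented for illustration).
hypotheses = {
    "CARE": [0.9, 0.5, 0.9, 0.9],
    "CORE": [0.9, 0.4, 0.9, 0.9],
}

sum_scores = {w: word_certainty(r, "sum") for w, r in hypotheses.items()}
product_scores = {w: word_certainty(r, "product") for w, r in hypotheses.items()}

# Not part of the described method: the lowest single-character reliability,
# shown only to make visible what the whole-word score does not report.
weakest = {w: min(r) for w, r in hypotheses.items()}
```

With these invented values the sum-based scores are roughly 3.2 and 3.1, a gap of about 3%, even though the one character that separates the two words is poorly recognized under both hypotheses. A decision based only on the whole-word score gives no indication that the deciding character is unreliable.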
In each of the word recognition methods disclosed in references 1 through 3, those portions of the character candidate patterns obtained by character segmentation that are unrecognizable or have low certainty levels are supplemented by word information derived from the suitably recognized portions. At this time, the portions that are unrecognizable or have a low certainty level, i.e., the portions that have been skipped in reading, are not checked as to whether the characters supplemented by the word verification are really correct, resulting in the aforementioned misrecognition of similar words.
Meanwhile, as disclosed in Japanese Patent No. 2734386 titled “Character String Recognition Apparatus” (hereinafter called “reference 4”), in order to avoid misrecognition due to the aforementioned skipping, the skipped portions of the character candidate patterns are re-recognized by a method different from the character recognition method used initially. In this way, character recognition results can be prepared for all of the characters constituting the word, which removes the ambiguous portions and reduces misreading. In this method, however, since the check of the skipped portions is performed by individual character recognition, the characters must already have been segmented character by character. For example, to perform the aforementioned re-recognition on a portion where two characters are in contact, the portion must have been correctly segmented into two character candidate patterns. Otherwise, i.e., when the image corresponding to two characters has not been correctly divided into two patterns, misrecognition will occur.
In the conventional examples described above with respect to references 1 through 4, re-recognition as a check cannot be performed when the portion left ambiguous while jointly using the word information, i.e., the skipped portion, includes a contact of two or more characters and has not been correctly segmented; the word verification then simply outputs the most “similar” word. As a result, misrecognition occurs, for example when there exists another word that differs only in the portion that happened to be skipped. Similar problems arise in the character recognition techniques disclosed in Japanese Patent Application Laid-Open Nos. HEI-3-48379, HEI-3-154985, HEI-5-290217, HEI-7-192094 and Japanese Patent No. 2619499, in addition to the conventional examples explained with respect to references 1 through 4.
FIG. 10 shows a concrete example thereof. In this figure, the correct answer is the word “Hundred”. However, when only the character segmentation means and the character recognizing means are used, the recognition result before word verification becomes “Th????d”. In case of performing the word verification based on this result when the recognition target is a num