Method and apparatus for recognizing a character

Image analysis – Pattern recognition – Context analysis or word recognition

Reexamination Certificate


Details

C382S229000

active

06212299

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a method and an apparatus for recognizing a character written in a document image.
2. Description of the Related Art
In recent years, demand has been increasing for the use of character recognition apparatuses as input units of computers. In particular, an apparatus that recognizes characters quickly and accurately is indispensable for improving computer performance.
2.1. Previously Proposed Art
A conventional apparatus for recognizing characters is described with reference to FIG. 5.
FIG. 5 shows an example of a binary document image obtained from an image scanner (not shown) for reading a document in which a plurality of characters are written.
A document, in which a plurality of characters are written or printed, is read by the image scanner as a binary document image, and the binary document image read by the image scanner is stored in an image storing unit. The binary document image is composed of a plurality of pieces of pixel data consisting of white and black pixels and position data of the pixels in X-Y co-ordinates.
In the specification and drawings, successive black pixels and successive white pixels are formed by connecting a plurality of black pixels and a plurality of white pixels, respectively. That is, each character is represented by one or more masses of successive black pixels.
Therefore, a black region in the document is represented by the successive black pixels. In other words, the black region is defined as a region where a region of the characters is excepted from all of the document region.
Also, a character rectangle circumscribed about successive black pixels is virtually obtained by a circumscribed rectangular detecting unit.
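The extraction of masses of successive black pixels and their circumscribed rectangles can be sketched as follows. This is a minimal illustration, not the circuit of the circumscribed rectangular detecting unit: the function name character_rectangles, the encoding (1 for black, 0 for white), and the use of 8-connectivity are all assumptions, since the text only states that the pixels are connected.

```python
from collections import deque


def character_rectangles(image):
    """Return one rectangle (left, top, right, bottom) per black-pixel mass.

    `image` is a list of rows; 1 marks a black pixel, 0 a white pixel.
    Coordinates are inclusive pixel indices in X-Y order.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    rectangles = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one mass of connected black pixels, tracking its extent.
            left = right = x
            top = bottom = y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                # Assumption: diagonal neighbours count as connected.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            rectangles.append((left, top, right, bottom))
    return rectangles


# A tiny "i": the dot and the stem are two separate masses.
LETTER_I = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]
print(character_rectangles(LETTER_I))
```

For the small "i" above, the sketch yields two rectangles, one for the dot and one for the stem, which is why a separating character needs a later unification step.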
Prior to the recognition of the binary document image, in cases where a plurality of character rectangles are located within a predetermined distance from each other, this conventional recognition apparatus unifies the character rectangles to form a unified character rectangle, and the unified character rectangle is regarded as a single character rectangle. Thereafter, the conventional apparatus recognizes one mass of successive black pixels in the character rectangle and one or more masses of successive black pixels in the unified character rectangle as one character, respectively.
Therefore, since a non-separating character such as “a”, “b”, “c”, “d”, “e”, “f”, “g”, “h” or the like is structured by a single mass of successive black pixels connected with each other, the conventional apparatus recognizes the non-separating character without the above unification of a plurality of the character rectangles.
On the other hand, since a separating character such as “i”, “j” or the like is structured by a plurality of masses of the successive black pixels, the conventional apparatus recognizes the separating character by the above unification.
Concretely, as shown in FIG. 5, since character rectangles C12 and C13 are located within a predetermined distance from each other, these character rectangles are unified together and the conventional apparatus recognizes masses of successive black pixels in the character rectangles C12 and C13 as a single character “i”.
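The conventional distance-based unification described above can be sketched as follows. The rectangle coordinates, the threshold value, and the distance measure (the larger of the horizontal and vertical coordinate gaps) are hypothetical; the text only speaks of "a predetermined distance" and does not define how it is measured.

```python
def gap(a, b):
    """Coordinate gap between rectangles (left, top, right, bottom).

    Returns 0 when the rectangles touch or overlap. Taking the larger of
    the horizontal and vertical gaps is an assumed distance measure.
    """
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def unify(a, b):
    """Rectangle circumscribed about both input rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def unify_nearby(rectangles, max_distance):
    """Merge rectangles until no two are within max_distance of each other."""
    rects = list(rectangles)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if gap(rects[i], rects[j]) <= max_distance:
                    rects[i] = unify(rects[i], rects[j])
                    del rects[j]
                    merged = True
                    break
            if merged:
                break
    return rects


# Hypothetical coordinates standing in for C12 (dot) and C13 (stem) of "i",
# plus one distant rectangle that should stay separate.
C12 = (10, 0, 11, 1)
C13 = (10, 4, 11, 12)
OTHER = (30, 4, 36, 12)
print(unify_nearby([C12, C13, OTHER], max_distance=3))
```

In this sketch the decision to merge depends on distance alone, with no check of what the black-pixel masses inside the rectangles look like.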
2.2. Problems to be Solved by the Invention
However, as shown in FIG. 5, in a case where noise such as “,” in a character rectangle C16 exists in the document or occurs in the document image read by the scanner, the conventional apparatus unifies a character rectangle C15 and the character rectangle C16. As a result, the mass of successive black pixels in the character rectangle C15 is not recognized as a character “e”.
Furthermore, it is well known that noise such as “,” often arises from several kinds of causes.
Therefore, there is a drawback that a character written in the document is not reliably recognized.
SUMMARY OF THE INVENTION
An object of the present invention is to provide, with due consideration to the drawbacks of such a conventional method and a conventional apparatus for recognizing a character, a method and an apparatus in which a character is accurately and reliably recognized even though noise exists in a position close to the character.
The object is achieved by the provision of an apparatus for recognizing a character written in a document, comprising:
image reading means for reading an image of the document to obtain a document image indicated by a plurality of black pixels and a plurality of white pixels;
character rectangle producing means for extracting a plurality of black-pixel masses, respectively composed of a plurality of black pixels connected with each other, from the document image obtained by the image reading means and producing a plurality of character rectangles respectively circumscribed about one black-pixel mass;
character pattern classifying means for comparing character images of the black-pixel masses, about which the character rectangles produced by the character rectangle producing means are circumscribed, with each other, and classifying one or more black-pixel masses, of which the character images have the same character pattern, into a character group for each character pattern to classify each of the black-pixel masses extracted by the character rectangle producing means into one of the character patterns;
representative character image determining means for determining a representative character image of a representative black-pixel mass representing the character images of the black-pixel masses classified into the same character group by the character pattern classifying means, for each of the character groups;
figure feature detecting means for detecting a figure feature of one representative character image of one representative black-pixel mass determined by the representative character image determining means, for each of the representative character images;
referential figure feature storing means for storing a plurality of referential figure features of a plurality of referential character patterns which each express a character;
character recognizing means for comparing one figure feature of one representative character image detected by the figure feature detecting means with each of the referential figure features stored in the referential figure feature storing means for each of the figure features of the representative character images, detecting a particular referential character pattern as a character pattern agreeing with one representative character image for each of the representative character images in cases where a particular referential figure feature of the particular referential character pattern agrees with the figure feature of the representative character image and recognizing each of the character images of the black-pixel masses classified into the character group, which corresponds to the representative character image agreeing with the particular referential character pattern, as a character expressed by the particular referential character pattern;
character rectangle unifying means for selecting a first character rectangle and a second character rectangle from a group of the character rectangles reproduced by the character rectangle producing means, on condition that either a first character image of a first black-pixel mass about which the first character rectangle is circumscribed or a second character image of a second black-pixel mass about which the second character rectangle is circumscribed is not recognized as any character by the character recognizing means and the first and second character rectangles are close to each other within a predetermined character distance, unifying the first and second character rectangles to a unified character rectangle circumscribed about the first and second black-pixel masses while maintaining positions of the first and second character rectangles composing the unified character rectangle, deleting the first and second character rectangles from the group of the character rectangles reproduced by the character rectangle producing means, and adding the uni
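The selection condition of the character rectangle unifying means can be sketched as follows. This is a hedged illustration only: the function name unify_unrecognized, the dictionary of recognition results, the coordinates, and the distance measure are hypothetical, and because the text above is cut off after the deletion step, appending the unified rectangle to the group is an assumption about how the sentence continues. Whatever processing follows that step is not shown.

```python
def gap(a, b):
    """Coordinate gap between rectangles (left, top, right, bottom); 0 if touching.

    The larger of the horizontal and vertical gaps is an assumed measure.
    """
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def unify_unrecognized(rectangles, recognized, max_distance):
    """Unify the first qualifying pair of rectangles and return the new group.

    A pair qualifies only if at least one of its two character images was
    not recognized (value None in `recognized`) and the rectangles lie
    within max_distance of each other. Only one pair is unified per call.
    """
    group = list(rectangles)
    for i in range(len(group)):
        for j in range(i + 1, len(group)):
            a, b = group[i], group[j]
            if recognized.get(a) is not None and recognized.get(b) is not None:
                continue  # both already recognized: leave them alone
            if gap(a, b) > max_distance:
                continue  # too far apart to be parts of one character
            unified = (min(a[0], b[0]), min(a[1], b[1]),
                       max(a[2], b[2]), max(a[3], b[3]))
            # Delete the two originals from the group, then add the unified
            # rectangle (the "add" step is assumed; the source is truncated).
            rest = [r for k, r in enumerate(group) if k not in (i, j)]
            return rest + [unified]
    return group


# Hypothetical recognition results: the dot and stem of an "i" were not
# recognized on their own, while "e" and "a" were.
DOT, STEM = (10, 0, 11, 1), (10, 4, 11, 12)
E, A = (14, 4, 20, 12), (22, 4, 28, 12)
RESULTS = {DOT: None, STEM: None, E: "e", A: "a"}
print(unify_unrecognized([DOT, STEM, E, A], RESULTS, max_distance=3))
```

Unlike unification by distance alone, two rectangles that were both recognized stay separate in this sketch even when they are close together.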
