Character recognition apparatus and method for recognizing...

Image analysis – Image segmentation

Reexamination Certificate

Details

C382S174000, C382S177000, C382S171000, C382S200000, C382S209000

active

06327384

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an apparatus for coding a binary stationary image, and more specifically to an apparatus that extracts a pattern from a binary image in a system for pattern matching coding.
2. Description of Related Art
A method for coding a binary stationary image using pattern matching is known in the prior art. In this coding method, an image is divided into patterns, each of which is a collection of continuously arranged black pixels, and matching is performed with respect to each pattern.
Then, in accordance with the results of this pattern matching, the bit maps for the patterns themselves and information which represents the pattern positions and sizes are coded.
In a binary stationary image coding method which uses pattern matching as described above, when extracting a pattern from an image, black pixels are detected by scanning along the image from the upper left part to the lower right part thereof.
Next, the outline contour of the collection of continuously arranged black pixels is traced, with a detected black pixel as the starting point, to determine the contour of the pattern. Finally, the contents of this contour are extracted as the pattern.
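The extraction step described above can be illustrated with a minimal sketch. It is not the method of the prior art itself: for brevity it replaces contour tracing with a breadth-first flood fill, which yields the same groups of connected black pixels. The function name extract_patterns, the 0/1 list-of-rows image format, the use of 8-connectivity, and the bounding-box output are all assumptions made for illustration.

```python
from collections import deque


def extract_patterns(image):
    """Return bounding boxes (top, left, bottom, right) of connected black-pixel groups.

    image: list of rows, each a list of 0 (white) or 1 (black).
    Boxes are listed in raster-scan order of each group's first detected pixel.
    Assumption: pixels touching diagonally belong to the same pattern (8-connectivity).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):          # scan from the upper left ...
        for x in range(width):       # ... to the lower right
            if image[y][x] != 1 or seen[y][x]:
                continue
            # A new pattern starts at this black pixel: flood-fill its neighbours.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Each returned box delimits one pattern; the bit map inside the box is what would be passed on to pattern matching.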
When performing the above operations, the absolute coordinates of the first appearing pattern are used as the reference in indicating the position of a pattern, with other patterns basically being expressed as a relative distance (offset) from the immediately previously appearing pattern.
Considering the above-noted point, in horizontally written text the offset values will be small, so that the coding efficiency is better than in the case in which all patterns are expressed in absolute coordinates.
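The offset representation described above can be sketched as follows. The function names to_offsets and from_offsets are hypothetical, and the choice of a single (x, y) reference point per pattern (for example its upper-left corner) is an assumption, since the text does not state which point on a pattern is used.

```python
def to_offsets(positions):
    """Code a list of absolute (x, y) positions, given in extraction order.

    The first position is kept in absolute coordinates; every later one is
    replaced by its (dx, dy) offset from the immediately preceding position.
    """
    if not positions:
        return []
    coded = [positions[0]]
    for (px, py), (x, y) in zip(positions, positions[1:]):
        coded.append((x - px, y - py))
    return coded


def from_offsets(coded):
    """Invert to_offsets: rebuild the absolute positions from the coded list."""
    positions = []
    for i, (a, b) in enumerate(coded):
        if i == 0:
            positions.append((a, b))
        else:
            px, py = positions[-1]
            positions.append((px + a, py + b))
    return positions
```

For three patterns on one horizontal line at (100, 50), (120, 50) and (141, 52), the coded list is [(100, 50), (20, 0), (21, 2)]: only the first entry carries large values, which is where the efficiency gain comes from.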
However, in the binary stationary image coding method of the past, there are cases in which a single character is divided into a plurality of patterns when patterns are extracted. This occurs, for example, when extracting patterns from Japanese-language or Chinese-language text, because such text is made up of many kanji ideographic characters that are complex and have a large number of strokes.
For example, the character “KAN” shown in FIG. 5, as used in the word kanji itself, has three patterns (1), (2) and (3) in the left “radical” part (A) of the character and one pattern (4) in the right “tsukuri” part (B) of the character, making a total of four patterns. Characters of this kind result in a very large total number of patterns.
As a result, there is an excess of data resulting from the need to express the pattern positions, sizes, and the results of pattern matching. This leads to the problem of reduced efficiency in coding.
Additionally, in the binary stationary image coding method of the past, if noise is mixed in with the image, the shape of one and the same character can differ slightly from one occurrence to another, so that even for one and the same character there can be differences in the number of divided patterns and/or the shapes of the patterns.
For this reason, the number of types of patterns increases, which leads to the problem of a reduction in coding efficiency.
Another problem is that, when extracting a pattern as done in the past, the offset is the distance to the immediately previously appearing pattern, which, because of the scanning direction, is the pattern positioned to the left of the pattern of interest. For vertically written text, the spaces between lines are therefore redundant. This leads to the problem of a reduction in coding efficiency.
An object of the present invention is to provide a pattern extraction apparatus that, in extracting a pattern using binary stationary image coding that uses pattern matching, is capable of reducing the number of patterns and number of pattern types, thereby improving the coding efficiency.
Another object of the present invention is to provide a pattern extraction apparatus that is capable of determining the offset for vertically written text as efficiently as for horizontally written text.
Note that documents written in English, German, French or the like are usually written in the horizontal direction, and thus only the writing direction of the row needs to be confirmed first. Thereafter, the reduction of the number of patterns and the improvement of the coding efficiency mentioned above are also required.
SUMMARY OF THE INVENTION
To achieve the above-noted objects, the present invention is a pattern extraction apparatus that is used in performing coding of a pattern in accordance with the results of dividing a binary stationary image into patterns and performing pattern matching with respect to each of the divided patterns. This pattern extraction apparatus includes: a means for projecting the black pixels in the above-noted binary stationary image and determining a histogram thereof; a document type determining means for determining, in accordance with the above-noted histogram, whether the above-noted binary stationary image is horizontally written text or vertically written text and for outputting the result of this determination; a means for extracting a block from the above-noted image in accordance with the above-noted determination result; a means for extracting a pattern in accordance with the above-noted block; and a means for calculating, as an offset, the relative distance between the currently extracted pattern and the immediately previously extracted pattern.
For example, the above-noted binary stationary image has both vertical and horizontal directions and the above-noted projection means projects each of the black pixels in the vertical and horizontal directions to generate the above-noted histogram.
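As a minimal sketch of such a projection, the following function counts the black pixels in every row and in every column of a binary image. The name projection_histograms and the 0/1 list-of-rows image format are assumptions made for illustration; how the apparatus actually stores the histogram is not stated here.

```python
def projection_histograms(image):
    """Project the black pixels of a binary image onto both axes.

    image: list of equal-length rows, each a list of 0 (white) or 1 (black).
    Returns (row_counts, column_counts), where row_counts[y] is the number of
    black pixels in row y and column_counts[x] is the number in column x.
    """
    row_counts = [sum(row) for row in image]
    column_counts = [sum(column) for column in zip(*image)]
    return row_counts, column_counts
```

Runs of zero counts in one of the two histograms correspond to blank bands between lines of text, which is the kind of information a document type determination could draw on.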
Additionally, if the above-noted determination results are that the text is disposed in the horizontal direction, the above-noted block extraction means extracts the above-noted block from the above-noted binary stationary image as a column, but if the above-noted determination results are that the text is disposed in the vertical direction, the above-noted block extraction means extracts the above-noted block from the above-noted binary stationary image as a row.
It is additionally possible, with respect to the above-noted extracted column or row, for the above-noted projection means to project in a first direction, which is perpendicular to the direction of the above-noted column or row, so as to determine a sub-histogram. In accordance with this sub-histogram, the above-noted block extraction means then divides the above-noted column or row into divided blocks and provides these divided blocks to the above-noted pattern extraction means as the above-noted block.
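One plausible way to divide a column or row in accordance with a sub-histogram is to cut it wherever the histogram is empty. The sketch below does this for a one-dimensional list of counts. The cutting rule (every bin with a zero count is a gap) and the name nonzero_runs are assumptions; the text does not specify the criterion the block extraction means applies.

```python
def nonzero_runs(histogram):
    """Split a 1-D histogram into maximal runs of non-zero bins.

    Returns a list of (start, end) index pairs with end exclusive, so that
    histogram[start:end] contains only positive counts.
    Assumption: any bin with a zero count separates two blocks.
    """
    runs = []
    start = None
    for i, count in enumerate(histogram):
        if count > 0 and start is None:
            start = i                      # a new block begins here
        elif count == 0 and start is not None:
            runs.append((start, i))        # the block ended at the previous bin
            start = None
    if start is not None:                  # block reaching the end of the histogram
        runs.append((start, len(histogram)))
    return runs
```

Applied to the sub-histogram [0, 3, 2, 0, 0, 4, 1], the function returns [(1, 3), (5, 7)], that is, two divided blocks separated by a two-bin gap.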
It is also further possible, with respect to the above-noted block, for the above-noted projection means to project in a second direction, which is perpendicular to the above-noted first direction, and to output a projection result. The above-noted pattern extraction means then extracts the above-noted extracted pattern from the above-noted block based on this projection result. In this case, a joining determining means determines, for each of the above-noted extracted patterns, whether the pattern has been extracted as a unit of character, and outputs the result of the determination.
Then a joining means, in response to the above-noted determination result, joins patterns together to form a joined pattern as one unit of character.
This joined pattern is given to the above-noted offset calculation means as the above-noted extracted pattern.
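The joining step can be sketched on bounding boxes as follows. The joining criterion is deliberately left to the caller, because the text above does not state how the joining determining means decides; the example criterion close_horizontally and its max_gap threshold are purely hypothetical placeholders, as are the function names and the (top, left, bottom, right) box format with inclusive coordinates.

```python
def union_box(a, b):
    """Smallest box (top, left, bottom, right) that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def join_patterns(boxes, should_join):
    """Merge consecutive boxes whenever should_join(previous, current) is true.

    boxes: bounding boxes in extraction order.
    should_join: caller-supplied predicate standing in for the joining
    determining means (its real criterion is not given in the text).
    """
    joined = []
    for box in boxes:
        if joined and should_join(joined[-1], box):
            joined[-1] = union_box(joined[-1], box)
        else:
            joined.append(box)
    return joined


def close_horizontally(a, b, max_gap=2):
    """Hypothetical criterion: at most max_gap white columns between a and b."""
    return b[1] - a[3] - 1 <= max_gap
```

With boxes (0, 0, 9, 3), (0, 5, 9, 9) and (0, 20, 9, 29) and the placeholder criterion, the first two boxes are joined into (0, 0, 9, 9) and the third stays separate, so offsets are then computed between two joined patterns instead of three.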
In a pattern extraction apparatus according to the present invention, the projection means generates a histogram of the black pixels in each of the vertical and horizontal directions of the input binary stationary image.
In accordance with the resulting output
