Format recognition method, apparatus and storage medium

Image analysis – Image segmentation – Separating document regions using preprinted guides or markings

Reexamination Certificate

Details

C382S176000, C382S177000, C382S306000

active

06567545

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a format recognition method, apparatus and storage medium for recognizing the format of a form in order to identify characters on the form, and in particular to a format recognition method, apparatus and storage medium for automatically analyzing the structure of a chart on a form and determining the attributes of the individual entries constituting the chart.
2. Related Arts
A character recognition method for identifying characters on a form is employed for the automatic input of data. According to this method, an image at a designated position on the form is obtained for character recognition.
To identify characters on the form, the attributes of each character field, such as its data name (field identifier) and its character type, must be defined.
FIG. 87 is a diagram for explaining the prior art.
In FIG. 87, shown as an example form is a money transfer application. The money transfer application includes entries for a “Transfer destination,” a “Bank name” and a “Branch office name.” In order for these entries to be automatically recognized, the attributes of each entry, such as a field ID (data name), the start position, the end position, the number of digits and a recognition category (character type), must be defined. Conventionally, for registration, an operator enters this definition information for each form.
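The definition information listed above can be pictured as one record per entry. The following is a minimal sketch only: the class name `FieldDefinition`, the pixel units and the example values are assumptions made for illustration and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """One entry's definition information (hypothetical layout)."""
    field_id: str   # data name, e.g. "Bank name"
    start: int      # start position on the form (pixels assumed)
    end: int        # end position on the form (pixels assumed)
    digits: int     # number of character positions in the entry
    category: str   # recognition category (character type)


# Example values are invented; an operator would enter these per form.
bank_name = FieldDefinition("Bank name", start=120, end=480,
                            digits=12, category="text")
```

In the prior art, one such record would have to be registered manually for every entry of every form format.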
However, in the prior art, the definition information must be registered before a form is read, so character recognition is available only for forms whose definition information has already been registered. For the automatic input of money transfer information at a financial institution, for example, the money transfer applications used by companies come in various formats. Therefore, a great deal of labor is required to prepare definition information for each form in advance.
Further, even when the definition information for a form has been registered, it must be changed if the format of the form is altered.
SUMMARY OF THE INVENTION
It is, therefore, one objective of the present invention to provide a format recognition method, an apparatus and a storage medium for automatically identifying definition information provided as individual entries on a form.
It is another objective of the present invention to provide a format recognition method, an apparatus and a storage medium for determining the attributes of the smallest rectangles on a form, and for identifying data recorded on the form.
To achieve the above objectives, according to one aspect of the present invention, a format recognition method for identifying the structure of a chart on a form comprises the steps of:
extracting vertical ruled lines and horizontal ruled lines on the form, and the smallest rectangles formed by the ruled lines employing an image of the form;
analyzing the chart structure of the form employing the physical arrangement of the smallest rectangles; and
determining attributes for the smallest rectangles employing the resultant chart structure.
According to this invention, the smallest rectangles for the chart structure are extracted and the physical arrangement of the smallest rectangles is detected. The chart structure is then analyzed by using the physical arrangement of the smallest rectangles, and the attributes of the smallest rectangles are determined. Since the attributes for the smallest rectangles are determined by detecting their physical arrangements, the analysis of the structure of the chart on the form can be performed automatically.
Therefore, the registration in advance of individual entries on a form is not required. Further, even when the format of the form is changed, the definition information need not be manually altered.
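The extraction of the smallest rectangles might be sketched as follows. This is an illustrative simplification, not the method of the specification: it assumes the ruled lines have already been extracted from the image as exact axis-aligned segments, and the function names `covers` and `smallest_rectangles` are hypothetical. A rectangle is accepted when all four sides lie on ruled lines and no ruled line crosses its interior.

```python
from itertools import combinations


def covers(segments, fixed, lo, hi):
    """True if one segment at coordinate `fixed` spans the whole range [lo, hi]."""
    return any(f == fixed and a <= lo and hi <= b for f, a, b in segments)


def smallest_rectangles(h_lines, v_lines):
    """h_lines: (y, x0, x1) segments; v_lines: (x, y0, y1) segments.

    Returns cells as (x0, y0, x1, y1), ordered top-to-bottom, left-to-right.
    """
    xs = sorted({x for x, _, _ in v_lines})
    ys = sorted({y for y, _, _ in h_lines})
    cells = []
    for x0, x1 in combinations(xs, 2):
        for y0, y1 in combinations(ys, 2):
            framed = (covers(h_lines, y0, x0, x1) and covers(h_lines, y1, x0, x1)
                      and covers(v_lines, x0, y0, y1) and covers(v_lines, x1, y0, y1))
            if not framed:
                continue
            # A frame with a ruled line passing through its interior is not "smallest".
            split = (any(x0 < x < x1 and a < y1 and b > y0 for x, a, b in v_lines)
                     or any(y0 < y < y1 and a < x1 and b > x0 for y, a, b in h_lines))
            if not split:
                cells.append((x0, y0, x1, y1))
    return sorted(cells, key=lambda c: (c[1], c[0]))


# A chart with one wide cell on top and two cells underneath.
h = [(0, 0, 20), (10, 0, 20), (20, 0, 20)]
v = [(0, 0, 20), (20, 0, 20), (10, 10, 20)]
cells = smallest_rectangles(h, v)
```

Real scanned forms would need tolerance for skew and broken lines, which this sketch leaves out.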
According to one more aspect of the present invention, the step of analyzing the chart structure includes the steps of:
extracting a relationship existing among the smallest rectangles in a row employing the positional relationship of the smallest rectangles; and
extracting a relationship existing among the smallest rectangles in a column employing the positional relationship of the smallest rectangles.
According to another aspect of the present invention, the step of extracting the relationships existing among the smallest rectangles in a row includes the steps of:
employing connection relationships to sort the smallest rectangles into individual rows, each of which is constituted by the smallest rectangles which share a connection relationship; and
collecting into blocks sequential rows having the same row structure. The step of extracting the relationship in a row includes a step of extracting a relationship existing among vertically connected blocks having the same row structure.
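The row and block steps above can be sketched under simplifying assumptions: cells are tuples (x0, y0, x1, y1), cells of one row share the same top and bottom edges, and the "row structure" is taken to be the sequence of column boundaries. The names `rows_of`, `row_structure` and `blocks_of` are hypothetical, and nested rows are not handled here.

```python
from itertools import groupby


def rows_of(cells):
    """Sort cells into rows: cells sharing the same top and bottom edge."""
    ordered = sorted(cells, key=lambda c: (c[1], c[3], c[0]))
    return [list(g) for _, g in groupby(ordered, key=lambda c: (c[1], c[3]))]


def row_structure(row):
    """Signature of a row: the horizontal extent of each of its cells."""
    return tuple((c[0], c[2]) for c in row)


def blocks_of(rows):
    """Collect sequential rows with an identical structure into blocks."""
    return [list(g) for _, g in groupby(rows, key=row_structure)]


# One two-cell row followed by three identical three-cell rows.
cells = [(0, 0, 10, 10), (10, 0, 30, 10)]
for top in (10, 20, 30):
    cells += [(0, top, 10, top + 10), (10, top, 20, top + 10), (20, top, 30, top + 10)]
blocks = blocks_of(rows_of(cells))
```

Because `groupby` only merges adjacent items, rows with the same structure that are separated by a different row end up in separate blocks, which matches the "sequential rows" wording.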
According to an additional aspect of the present invention, the step of determining the attribute includes a step of determining, as a data portion, a block having the maximum number of rows, and of determining, as a headline portion, blocks positioned above and below the block.
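As a rough illustration of this rule, the sketch below labels the block with the most rows as the data portion and its immediate neighbours as headline portions. The function name `classify_blocks`, the label strings, and the choice to label only directly adjacent blocks (and to pick the first block on a tie) are assumptions.

```python
def classify_blocks(row_counts):
    """row_counts: number of rows in each block, listed top to bottom."""
    data = max(range(len(row_counts)), key=row_counts.__getitem__)
    labels = ["other"] * len(row_counts)
    labels[data] = "data"
    for neighbour in (data - 1, data + 1):
        if 0 <= neighbour < len(row_counts):
            labels[neighbour] = "headline"
    return labels
```

For a chart with a one-row header, five data rows and a one-row footer, `classify_blocks([1, 5, 1])` labels the blocks as headline, data, headline.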
According to a further aspect of the present invention, the step of determining the attributes includes the steps of:
determining the attributes of the headline portion by identifying characters in the headline portions; and
determining the attributes for the data portion by using the attributes for the headline portion.
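One way to picture these two steps: the recognized headline text is looked up in a keyword table, and each data cell inherits the attribute of the headline cell it overlaps most horizontally. This is a sketch under assumptions; the table `HEADLINE_KEYWORDS`, its attribute values and both function names are hypothetical, and the overlap rule is one plausible reading rather than the procedure stated in the specification.

```python
# Hypothetical keyword table: headline text -> (field ID, recognition category).
HEADLINE_KEYWORDS = {
    "Bank name": ("bank_name", "text"),
    "Branch office name": ("branch_office_name", "text"),
}


def headline_attributes(headline_cells):
    """headline_cells: (x0, x1, recognized_text) for each headline cell."""
    return [(x0, x1, HEADLINE_KEYWORDS.get(text)) for x0, x1, text in headline_cells]


def data_attribute(data_cell, headlines):
    """Inherit the attribute of the headline with the largest horizontal overlap."""
    x0, x1 = data_cell
    best = max(headlines, key=lambda h: min(x1, h[1]) - max(x0, h[0]))
    overlap = min(x1, best[1]) - max(x0, best[0])
    return best[2] if overlap > 0 else None


headlines = headline_attributes([(0, 100, "Bank name"),
                                 (100, 160, "Branch office name")])
```

A data cell that overlaps no headline cell gets no attribute (`None`) in this sketch.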
According to one further aspect of the present invention, the step of analyzing the chart structure includes the steps of:
employing the connection relationships existing among the smallest rectangles to sort into groups the smallest rectangles which are connected;
sorting the groups for the individual elements of a chart; and
analyzing the relationships existing among rows and columns of the smallest rectangles for the individual elements in the chart.
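The grouping of connected rectangles could be implemented with a union-find structure, as in the sketch below. It assumes cells are tuples (x0, y0, x1, y1) and treats two cells as connected when their boundaries touch or overlap (corner contact included); both the connection test and the names `touching` and `connected_groups` are assumptions.

```python
def touching(a, b):
    """True if two rectangles share at least one boundary point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def connected_groups(cells):
    """Partition cells into groups of mutually connected rectangles."""
    parent = list(range(len(cells)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(cells)):
        for j in range(i + 1, len(cells)):
            if touching(cells[i], cells[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, cell in enumerate(cells):
        groups.setdefault(find(i), []).append(cell)
    return sorted(groups.values())


groups = connected_groups([(0, 0, 10, 10), (10, 0, 20, 10), (50, 50, 60, 60)])
```

Each resulting group would then correspond to one candidate chart element whose rows and columns are analyzed separately.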
According to yet one more aspect of the present invention, the step of analyzing the relationships existing along rows and columns of the smallest rectangles includes the steps of:
analyzing the smallest rectangles for relationships existing along the rows; and
analyzing the smallest rectangles for relationships existing along the columns.
According to yet another aspect of the present invention, the step of analyzing the smallest rectangles for the relationships existing along the rows includes a step of:
analyzing the nested structure of a row constituted by the smallest rectangles, and storing nesting information for the smallest rectangles.
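The specification does not spell out here what the nesting information contains. As one possible interpretation, the sketch below records a nesting level for each cell: the number of other vertical extents in the same group that enclose the cell's own vertical extent. The function name `nesting_levels` and this definition of the level are assumptions.

```python
def nesting_levels(cells):
    """Map each cell (x0, y0, x1, y1) to the number of enclosing row extents."""
    extents = {(c[1], c[3]) for c in cells}

    def level(cell):
        own = (cell[1], cell[3])
        return sum(1 for top, bottom in extents
                   if (top, bottom) != own and top <= own[0] and own[1] <= bottom)

    return {cell: level(cell) for cell in cells}


# A tall cell on the left with two stacked cells to its right.
levels = nesting_levels([(0, 0, 10, 20), (10, 0, 20, 10), (10, 10, 20, 20)])
```

Here the tall cell is at level 0 and the two stacked cells are at level 1, that is, they form sub-rows nested inside the outer row.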
According to yet an additional aspect of the present invention, the step of analyzing the chart structure includes the steps of:
extracting chart structures for the individual elements; and
combining elements having the same chart structure.
According to yet a further aspect of the present invention, the step of analyzing the chart structure includes the steps of:
extracting the chart structures for the elements; and
employing the chart structures for the elements to recover ruled lines in the elements.
According to yet one further aspect of the present invention, the step of analyzing the relationships existing along the rows includes the steps of:
employing the structures of individual rows to detect all strikeout lines entered in rows; and
deleting the strikeout lines to determine the relationships existing along the rows.
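How the row structure is used to detect strikeout lines is not detailed in this passage. The sketch below shows one heuristic that fits the description: within a block of regularly spaced rows, a horizontal line that cuts a row of typical height into two thinner parts is treated as a strikeout and dropped. The function name `remove_strikeouts`, the use of the median row height and the `tolerance` parameter are all assumptions.

```python
from statistics import median


def remove_strikeouts(ys, tolerance=0.2):
    """ys: sorted y-coordinates of the horizontal lines in one block (at least two)."""
    typical = median(b - a for a, b in zip(ys, ys[1:]))
    kept = [ys[0]]
    for i in range(1, len(ys)):
        is_last = i == len(ys) - 1
        if not is_last:
            # The row ending at ys[i] is too thin ...
            thin = ys[i] - kept[-1] < (1 - tolerance) * typical
            # ... and skipping ys[i] would restore a row of typical height.
            restores = abs(ys[i + 1] - kept[-1] - typical) <= tolerance * typical
            if thin and restores:
                continue  # treat ys[i] as a strikeout line and delete it
        kept.append(ys[i])
    return kept


row_lines = remove_strikeouts([0, 10, 20, 25, 30, 40, 50])
```

The heuristic breaks down if so many rows are struck out that the median height itself is distorted; a production system would need a more robust estimate.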
According to still one more aspect of the present invention, the step of analyzing the chart structure includes the steps of:
detecting whether the smallest rectangles are contiguous; and
combining the contiguous, smallest rectangles to form a single rectangle.
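A minimal sketch of the combining step, assuming that "contiguous" means two cells of equal height whose right and left edges coincide (as with a run of single-character boxes). The function name `merge_contiguous` and this definition of contiguity are assumptions; deciding which runs of cells should actually be combined is outside the sketch.

```python
def merge_contiguous(cells):
    """Merge horizontally adjacent cells (x0, y0, x1, y1) of identical height."""
    merged = []
    for cell in sorted(cells, key=lambda c: (c[1], c[3], c[0])):
        if merged:
            x0, y0, x1, y1 = merged[-1]
            if (y0, y1) == (cell[1], cell[3]) and x1 == cell[0]:
                merged[-1] = (x0, y0, cell[2], y1)  # extend the previous cell
                continue
        merged.append(cell)
    return merged


fields = merge_contiguous([(0, 0, 10, 10), (10, 0, 20, 10),
                           (20, 0, 30, 10), (40, 0, 50, 10)])
```

The three touching boxes become one rectangle, while the box separated by a gap stays on its own.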


REFERENCES:
patent: 5101448 (1992-03-01), Kawachiya et al.
patent: 5854854 (1998-12-01), Cullen et al.
patent: 6043823 (2000-03-01), Kodaira et al.
patent: 6125204 (2000-09-01), Nakatsuka et al.
patent: 6289120 (2001-09-01), Yamaai et al.
patent: 6320983 (2001-11-01), Matsuno et al.
patent: 7-282193 (1995-10-01), None
patent: 10-011531 (1998-01-01), None
patent: 10-040333 (1998-02-01), None
patent: 10-049602 (1998-02-01), None
Ser. No.: 08/809,594; Filed: Mar. 31, 1997; By:
