Method for analyzing structure of a treatise type of...

Image analysis – Image segmentation – Region labeling

Reexamination Certificate


Details

C382S176000


active

06728403

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a method for processing a document image and, more particularly, to a method for analyzing the structure of a treatise-type document image in order to detect the title, author and abstract regions and recognize the content of each region.
DESCRIPTION OF THE PRIOR ART
There are many techniques for processing a document image to construct a database system. One such technique is document image structure analysis (see ChunChen Lin, “Logical Structure Analysis of Book Document Image using Contents Information”, ICDAR 97, Vol. II, pp. 1048-1054, August 1997). In this technique, character recognition is performed on the table of contents of a book so that the entire logical structure of the book can be analyzed. However, the technique requires that a table of contents be provided, so it cannot be used to construct a database system of treatise-type document images, for which no table of contents is available.
In order to construct a database system that provides part or all of the treatises contained in each journal as document images or hypertext files, a table of contents holding title, author and abstract information has to be generated.
Hitherto, such a table of contents has been compiled by hand, for three reasons. First, multi-language recognition is very difficult, and the title and the author are generally given in two languages. Second, the positions of the title, the author and the abstract differ from journal to journal, so they are difficult to detect. Third, there is no distinct difference between the title and the author.
Therefore, a method is required that automatically detects the title, author and abstract regions and recognizes the content of each region so as to make a table of contents for the treatises in journals.
SUMMARY OF THE INVENTION
It is, therefore, a primary object of the invention to provide a method for automatically detecting title, author and abstract regions in a document image and recognizing the content of each region so as to make a table of contents for the treatises in journals.
In accordance with the present invention, there is provided a method for analyzing the structure of a treatise-type document image to make a table of contents having title, author and abstract information, comprising the steps of: dividing the document image into a number of regions and classifying the divided regions into text regions and non-text regions according to attributes of the regions; selecting candidate regions representing an abstract and an introduction, extracting word regions from the candidate regions, and determining an abstract content portion; separating the title and the author using the basic form and the type definition representing the arrangement of each journal; and recognizing the content of the separated regions to generate said table of contents.
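The first two steps above can be illustrated with a minimal Python sketch. It is not the patented procedure: the source does not say which region attributes are used or how the abstract content portion is bounded. The sketch assumes a single-column page, a hypothetical `Region` record with a dark-pixel density attribute, hypothetical density thresholds, and that recognized words are already attached to each region. Under those assumptions it takes the abstract content to be the text regions lying between an "abstract" heading and an "introduction" heading.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """Hypothetical region record; the attribute set is an assumption."""
    top: int                 # y coordinate of the upper edge
    bottom: int              # y coordinate of the lower edge
    black_density: float     # fraction of dark pixels inside the box
    text: str = ""           # words recognized in the region, if any


def is_text(region: Region, min_density: float = 0.05,
            max_density: float = 0.5) -> bool:
    """Classify a region as text using assumed density thresholds."""
    return min_density <= region.black_density <= max_density


def find_heading(regions: List[Region], keyword: str) -> Optional[Region]:
    """Return the first region whose leading words contain the keyword."""
    for region in regions:
        first_words = [w.strip(".:").lower() for w in region.text.split()[:3]]
        if keyword in first_words:
            return region
    return None


def abstract_content(regions: List[Region]) -> List[Region]:
    """Text regions below the 'abstract' heading and above 'introduction'."""
    text_regions = sorted((r for r in regions if is_text(r)),
                          key=lambda r: r.top)
    start = find_heading(text_regions, "abstract")
    if start is None:
        return []
    end = find_heading(text_regions, "introduction")
    lower = end.top if end is not None else float("inf")
    return [r for r in text_regions if start.bottom <= r.top < lower]
```

A region that holds both the heading word and the abstract body would be skipped by this sketch; handling that case would need the word-level regions the text mentions.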
In accordance with another aspect of the present invention, there is provided a computer-readable medium containing a program, the program having functions of: dividing the document image into a number of regions and classifying the divided regions into text regions and non-text regions according to attributes of the regions; selecting candidate regions representing an abstract and an introduction, and finding word regions from the candidate regions to determine the position of an abstract content portion; separating the title and the author using the basic form and the type definition representing the arrangement of each journal; and recognizing the content of the separated regions to generate said table of contents.
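The title and author separation step named in both aspects can be sketched in the same hedged way. The source does not describe what the basic form or a type definition contains, so everything here is an assumption: `BASIC_FORM` is treated as a default top-to-bottom ordering, `TYPE_DEFINITIONS` is a hypothetical per-journal table of orderings, and the input is a list of `(top, text)` pairs for the text blocks found above the abstract.

```python
from typing import Dict, List, Tuple

# Assumed default arrangement: title first, then author.
BASIC_FORM = ["title", "author"]

# Hypothetical per-journal arrangements, e.g. a journal that prints
# the title and author in two languages one after the other.
TYPE_DEFINITIONS = {
    "hypothetical_journal_a": ["title", "author", "title", "author"],
}


def separate_title_author(blocks: List[Tuple[int, str]],
                          journal: str) -> Dict[str, List[str]]:
    """Label blocks above the abstract by their vertical order.

    Falls back to BASIC_FORM when the journal has no type definition.
    Blocks beyond the length of the layout are left unlabeled.
    """
    layout = TYPE_DEFINITIONS.get(journal, BASIC_FORM)
    ordered = [text for _, text in sorted(blocks)]  # sort by top coordinate
    result: Dict[str, List[str]] = {"title": [], "author": []}
    for label, text in zip(layout, ordered):
        result[label].append(text)
    return result
```

Keeping the arrangement in a table rather than in code reflects the point made earlier that positions differ from journal to journal: supporting a new journal would then mean adding an entry, not changing the logic.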
These and other features of the present invention are more fully shown and described in the drawings and detailed description of this invention. It is to be understood, however, that the description and drawings are for the purpose of illustration and should not be read in a manner that would unduly limit the scope of this invention.


REFERENCES:
patent: 4907285 (1990-03-01), Nakano et al.
patent: 5073953 (1991-12-01), Westdijk
patent: 5185813 (1993-02-01), Tsujimoto
patent: 5335290 (1994-08-01), Cullen et al.
patent: 5379373 (1995-01-01), Hayashi et al.
patent: 5434962 (1995-07-01), Kyojima et al.
patent: 5555362 (1996-09-01), Yamashita et al.
patent: 5701500 (1997-12-01), Ikeo et al.
patent: 5848186 (1998-12-01), Wang et al.
patent: 5850490 (1998-12-01), Johnson
patent: 5999664 (1999-12-01), Mahoney et al.
patent: 6233353 (2001-05-01), Danisewicz
patent: 6598046 (2003-07-01), Goldberg et al.
patent: 11203285 (1999-07-01), None
patent: 11203305 (1999-07-01), None
patent: 97-17047 (1996-08-01), None
Lin, et al.; Logical Structure Analysis of Book Document Images Using Contents Information; Apr. 1997; pp. 1048-1054.
