Automatic method of generating thematic summaries from a documen

Image analysis – Pattern recognition – Context analysis or word recognition

Patent

Details

382/170, 382/180, 382/199, 382/206, 704/1, G06K 9/72, G06K 9/34, G06K 9/48, G06K 9/52

active

058481915

ABSTRACT:
A method of automatically generating a thematic summary from a document image without performing character recognition to generate an ASCII representation of the document text. The method begins by decomposing the document image into text blocks and text lines. The main body of text is then identified using the median x-height of the text blocks. Next, word image equivalence classes and sentence boundaries are determined within the blocks of the main body of text. The word image equivalence classes are used to identify thematic words. These, in turn, are used to score the sentences within the main body of text, and the highest-scoring sentences are selected for extraction.
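The selection steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the image processing has already been done, so each text block is reduced to a list of per-line x-heights and each sentence to a list of integer IDs standing for word image equivalence classes. The function names, the tolerance test on x-height, and the rule of skipping the most frequent class as a presumed function word are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import median


def main_body_blocks(blocks, tolerance=0.15):
    """Pick blocks whose median x-height is close to the body text height.

    blocks maps a block id to the x-heights (in pixels) of its text lines.
    """
    block_medians = {bid: median(heights) for bid, heights in blocks.items()}
    # Assumption: the body x-height is the median of the per-block medians.
    body_height = median(block_medians.values())
    return [bid for bid, h in block_medians.items()
            if abs(h - body_height) <= tolerance * body_height]


def thematic_classes(sentences, skip_top=1, keep=2):
    """Choose frequent equivalence classes as thematic words.

    Assumption: the skip_top most frequent classes are function words
    and are ignored; the next `keep` classes are treated as thematic.
    """
    counts = Counter(cls for sent in sentences for cls in sent)
    ranked = [cls for cls, _ in counts.most_common()]
    return set(ranked[skip_top:skip_top + keep])


def select_sentences(sentences, themes, n=2):
    """Score each sentence by its thematic word count and keep the top n."""
    scores = [sum(1 for cls in sent if cls in themes) for sent in sentences]
    best = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:n]
    return sorted(best)  # report the selected sentences in reading order


# Hypothetical input: x-heights per block and class IDs per sentence.
blocks = {"title": [30, 30], "body1": [12, 12, 13],
          "body2": [12, 11, 12], "footnote": [8, 8]}
sentences = [[0, 1, 2, 0], [0, 3, 1, 0], [0, 4, 5, 0], [0, 1, 2, 2, 0]]

print(main_body_blocks(blocks))             # ['body1', 'body2']
themes = thematic_classes(sentences)
print(select_sentences(sentences, themes))  # [0, 3]
```

Because the sentences are represented only by class IDs, nothing in the sketch depends on knowing what the words actually say, which mirrors the abstract's point that no character recognition is required.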

REFERENCES:
patent: 3930237 (1975-12-01), Villers
patent: 4194221 (1980-03-01), Stoffel
patent: 4610025 (1986-09-01), Blum et al.
patent: 4741045 (1988-04-01), Denning
patent: 4907283 (1990-03-01), Tanaka et al.
patent: 4965763 (1990-10-01), Zamora
patent: 5077668 (1991-12-01), Doi
patent: 5131049 (1992-07-01), Bloomberg et al.
patent: 5181255 (1993-01-01), Bloomberg
patent: 5202933 (1993-04-01), Bloomberg
patent: 5257186 (1993-10-01), Ukita et al.
patent: 5297027 (1994-03-01), Morimoto et al.
patent: 5315671 (1994-05-01), Higuchi
patent: 5321770 (1994-06-01), Huttenlocher
patent: 5325444 (1994-06-01), Cass et al.
patent: 5384864 (1995-01-01), Spitz
patent: 5390259 (1995-02-01), Withgott et al.
patent: 5396566 (1995-03-01), Bruce et al.
patent: 5410611 (1995-04-01), Huttenlocher et al.
patent: 5410612 (1995-04-01), Arai et al.
patent: 5442715 (1995-08-01), Gaborski et al.
patent: 5444797 (1995-08-01), Spitz et al.
patent: 5488719 (1996-01-01), Kaplan et al.
patent: 5491760 (1996-02-01), Withgott et al.
patent: 5495349 (1996-02-01), Ikeda
patent: 5526443 (1996-06-01), Nakayama
patent: 5544259 (1996-08-01), McCubbrey
patent: 5550934 (1996-08-01), Van Vliembergen et al.
patent: 5638543 (1997-06-01), Pedersen et al.
Cheong, Tong L. and Tan S. Lip. "A Statistical Approach to Automatic Text Extraction," Institute of Systems Science; Asian Library Journal, pp. 1-8.
Jones, Richard L. "AIDA the Artificially Intelligent Document Analyzer," McDonald, C., Weckert, J. ed., Proceedings of a Conference and Workshop on Libraries and Expert Systems, Riverina, Australia, Jul. 1990, pp. 49-57.
Jones, Richard L. and Dan Corbett. "Automatic Document Content Analysis: The AIDA Project," Library Hi Tech, vol. 10:1-2(1992), issue 37-38, pp. 111-117.
Luhn, H. P. "The Automatic Creation of Literature Abstracts," IBM Journal of Research and Development, vol. 2: No. 2, Apr., 1958, pp. 159-162.
Luhn, H. P. "A Business Intelligence System," IBM Journal of Research and Development, vol. 2: No. 4, Oct. 1958, pp. 314-319.
Paice, Chris D. and Paul A. Jones. "The Identification of Important Concepts in Highly Structured Technical Papers," Proceedings of the Sixteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Pittsburgh, PA, Jun. 27-Jul. 1, 1993, pp. 69-78.
European Search Report for EPO counterpart Application No. 96308997.4. 27 Nov. 1997.
Bloomberg, Dan S. and Luc Vincent. "Blur Hit-Miss Transform and Its Use in Document Image Pattern Detection," Proceedings SPIE Conference 2422, Document Recognition II, San Jose, CA, Feb. 6-7, 1995, pp. 278-292.
Bloomberg, Dan S. et al. "Measuring Document Image Skew and Orientation," Proceedings SPIE Conference 2422, Document Recognition II, San Jose, CA, Feb. 6-7, 1995, pp. 302-316.
Bloomberg, Dan S. "Multiresolution Morphological Analysis of Document Images," Proceedings SPIE Conference 1818, Visual Communications and Image Processing '92, Boston, MA, Nov. 18-20, 1992. pp. 648-662.
Chen, Francine R. et al. "Spotting Phrases in Lines of Imaged Text," Proceedings SPIE Conference 2422, Document Recognition II, San Jose, CA, Feb. 6-7, 1995, pp. 256-269.
Chen, Francine R. and Margaret Withgott. "The Use of Emphasis to Automatically Summarize a Spoken Discourse," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, San Francisco, CA, Mar. 23-26, 1992, pp. 229-232.
Jones, Karen S. and Brigitte Endres-Niggemeyer. "Automatic Summarizing," Information Processing & Management, vol. 31, No. 5, pp. 625-630, 1995.
Jones, Karen S. "What Might Be in a Summary?," Information Retrieval 93: Von der Modellierung zur Anwendung (ed. Knorz, Krause and Womser-Hacker), Universitatsverlag Konstanz, 1993, pp. 9-26.
Kupiec, Julian et al. "A Trainable Document Summarizer," Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, WA, Jul. 9-13, 1995, pp. 68-73.
Luhn, H. P. "The Automatic Creation of Literature Abstracts," IBM Journal of Research and Development, vol. 2, No. 2, Apr., 1958, pp. 159-165.
Paice, Chris D. "Constructing Literature Abstract by Computer: Techniques and Prospects," Information Processing & Management, vol. 26, No. 1, pp. 171-186, 1990.
Rath, G. J. et al. "The Formation of Abstracts by the Selection of Sentences: Part I. Sentence Selection by Men and Machines," American Documentation, Apr., 1961, pp. 139-143.
Salton, Gerard et al. "Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts," Science, vol. 264, Jun. 3, 1994, pp. 1421-1426.

