Reexamination Certificate
2006-09-26
2006-09-26
Phan, Thai (Department: 2128)
Data processing: structural design, modeling, simulation, and em
Modeling by mathematical expression
C704S009000, C706S045000, C707S793000, C707S793000
active
07113897
ABSTRACT:
The invention provides a text segmentation apparatus comprising means for analyzing an electronic text to determine, for each sentence end in the text, the likelihood that it is a segmentation point based on a coherent unit, and means for segmenting the text into text segments based on that likelihood. When the size of any resulting text segment exceeds a threshold value determined from the specified segment size, the apparatus is programmed to split that segment at the position within it that has the highest likelihood of being a segmentation point. In particular, the apparatus determines the similarity between the text parts contained in a pair of windows set up on the left and right sides of each sentence end position in the text, so as to obtain similarity curves. It then determines the likelihood of a segmentation point at each sentence end from the obtained similarity curves. The apparatus segments the text at the point with the highest likelihood, then at the point with the second-highest likelihood, and so on, until the size of every text segment is approximately equal to the specified segment size.
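The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything below is an assumption made for illustration: the function names (`cosine`, `boundary_scores`, `segment`), the bag-of-words cosine similarity, the fixed window of two sentences, scoring a boundary as one minus the window similarity, and measuring segment size in sentences with `max_size` used directly as the threshold.

```python
import math
import re
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def boundary_scores(sentences, window=2):
    """Score the end of each sentence i (the gap before sentence i + 1).

    Assumption: likelihood of a segmentation point is taken as
    1 - similarity between the left and right windows around the gap.
    """
    bags = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    scores = []
    for i in range(len(sentences) - 1):
        left = sum(bags[max(0, i - window + 1): i + 1], Counter())
        right = sum(bags[i + 1: i + 1 + window], Counter())
        scores.append(1.0 - cosine(left, right))
    return scores


def segment(sentences, max_size, window=2):
    """Split any segment longer than max_size sentences at its
    highest-scoring internal sentence end, repeating until all fit."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    scores = boundary_scores(sentences, window)
    cuts = set()

    def split(start, end):  # half-open range of sentence indices
        if end - start <= max_size:
            return
        # Boundary i lies after sentence i; only internal boundaries qualify.
        best = max(range(start, end - 1), key=lambda i: scores[i])
        cuts.add(best)
        split(start, best + 1)
        split(best + 1, end)

    split(0, len(sentences))
    bounds = [0] + [c + 1 for c in sorted(cuts)] + [len(sentences)]
    return [sentences[a:b] for a, b in zip(bounds, bounds[1:])]


sentences = [
    "Cats purr softly.",
    "Cats chase mice.",
    "Stocks fell today.",
    "Stocks rose later.",
]
parts = segment(sentences, max_size=2)
```

In this toy input the two windows around the second sentence end share no words, so that boundary receives the highest score and is chosen as the first cut. A real system would likely use a more robust similarity measure and a size unit such as words or characters.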
REFERENCES:
patent: 5577249 (1996-11-01), Califano
patent: 5761191 (1998-06-01), VanDervort et al.
patent: 6185524 (2001-02-01), Carus et al.
patent: 6317708 (2001-11-01), Witbrock et al.
patent: 6411962 (2002-06-01), Kupiec
patent: 6611825 (2003-08-01), Billheimer et al.
patent: 6675174 (2004-01-01), Bolle et al.
patent: 2004/0078188 (2004-04-01), Gibbon et al.
patent: 11-235574 (1999-08-01), None
patent: 11-242684 (1999-09-01), None
patent: 2000-235574 (2000-08-01), None
Yoshio Nakao, “Thematic Hierarchy detection of a text using lexical cohesion”: pp. 83-112, (Abstract).
Mochizuki et al., “Passage-Level Document Retrieval Using Lexical Chains”: pp. 101-126, (Abs).
Tamura et al., “Text Structuring by Composition and Decomposition of Segments”: pp. 59-78.
Mochidzuki et al., “Text Segmentation Used Combining Multiple Knowledge Sources”, (Abstract).
Nishizawa et al., “Bottom-Up Discourse Segmentation Based on Word Frequency”: pp. 145-152.
Hirao et al., “Text Segmentation Based on Word Importance and Lexical Cohesion”: pp. 41-48.
Jeffrey C. Reynar, “An Automatic Method of Finding Topic Boundaries”.
Litman et al., “Combining Multiple Knowledge Sources for Discourse Segmentation”.
Hearst et al., “Subtopic Structuring for Full-Length Document Access”.
Marti A. Hearst, “Multi-Paragraph Segmentation of Expository Text”.
Marti A. Hearst, “TextTiling: Segmenting Text Into Multi-Paragraph Subtopic Passages”: pp. 34-64.
Marti A. Hearst, “TextTiling: A Quantitative Approach to Discourse Segmentation”: pp. 1-10.
Salton et al., “Automatic Text Decomposition Using Text Segments and Text Themes”.
Mitra et al., “Automatic Text Summarization by Paragraph Extraction”: pp. 1-11.
Nakagawa Shinya
Shimizu Hiroyuki
Hewlett-Packard Company
Phan Thai