Patent
1998-08-14
2000-12-26
Edouard, Patrick N.
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
707/530, G06F 17/27, G06F 15/00
active
061673684
ABSTRACT:
A "domain-general" method for representing the "sense" of a document includes the steps of extracting a list of simplex noun phrases representing candidate significant topics in the document, clustering the simplex noun phrases by head, and ranking the simplex noun phrases according to a significance measure to indicate the relative importance of the simplex noun phrases as significant topics of the document. Furthermore, the output can be filtered in a variety of ways, both for automatic processing and for presentation to users.
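The three steps named in the abstract (extract, cluster by head, rank) can be illustrated with a minimal sketch. It is not the patented implementation. It rests on several assumptions that do not come from the record: input arrives already part-of-speech tagged with Penn Treebank style tags, a "simplex noun phrase" is approximated as a maximal run of adjectives and nouns ending in a noun, the head is taken to be the last word of the phrase, and the significance measure is simply the number of phrases sharing a head. The function names are hypothetical.

```python
from collections import defaultdict

# Assumed tag sets (Penn Treebank style); not specified in the patent record.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
PHRASE_TAGS = {"JJ"} | NOUN_TAGS


def extract_simplex_nps(tagged_tokens):
    """Approximate simplex noun phrases: maximal adjective/noun runs ending in a noun."""
    phrases, run = [], []
    # A sentinel token flushes the final run.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag in PHRASE_TAGS:
            run.append((word.lower(), tag))
            continue
        # Drop trailing adjectives so the phrase ends in its head noun.
        while run and run[-1][1] not in NOUN_TAGS:
            run.pop()
        if run:
            phrases.append(tuple(w for w, _ in run))
        run = []
    return phrases


def cluster_by_head(phrases):
    """Group phrases by head, assumed here to be the last word of the phrase."""
    clusters = defaultdict(list)
    for phrase in phrases:
        clusters[phrase[-1]].append(phrase)
    return clusters


def rank_heads(clusters):
    """Rank heads by cluster size (an assumed significance measure), ties alphabetical."""
    return sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    tagged = [
        ("The", "DT"), ("filing", "NN"), ("system", "NN"), ("stores", "VBZ"),
        ("old", "JJ"), ("files", "NNS"), ("and", "CC"), ("the", "DT"),
        ("operating", "NN"), ("system", "NN"), ("indexes", "VBZ"), ("them", "PRP"),
    ]
    ranked = rank_heads(cluster_by_head(extract_simplex_nps(tagged)))
    for head, members in ranked:
        print(head, len(members))
    # prints:
    # system 2
    # files 1
```

Ranking by cluster size is only one possible choice. Any other per-head score could replace the sort key without changing the extraction or clustering steps, and the ranked list can then be filtered (for example, by a minimum cluster size) before presentation.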
REFERENCES:
patent: 5519608 (1996-05-01), Kupiec
patent: 5521816 (1996-05-01), Roche et al.
patent: 5580561 (1996-12-01), Church et al.
patent: 5689716 (1997-11-01), Chen
patent: 5708825 (1998-01-01), Sotomayor
patent: 5715468 (1998-02-01), Lucius
patent: 5799269 (1998-08-01), Schabes et al.
patent: 5960383 (1999-09-01), Fleischer
patent: 6026388 (2000-02-01), Liddy et al.
patent: 6061675 (2000-05-01), Wical
H.P. Luhn, "The Automatic Creation of Literature Abstracts", IBM Journal of Research and Development, vol. 2(2), pp. 159-165 (1958).
G. Salton, "Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer", (Addison-Wesley, Reading, MA, 1989).
C.D. Paice, "Constructing Literature Abstracts by Computer: Techniques and Prospects", Information Processing & Management, vol. 26(1), pp. 171-186 (1990).
N. Wacholder, Y. Ravin and M. Choi, "Disambiguation of Proper Names in Text", Proceedings of the Applied Natural Language Processing Conference, pp. 202-208 (Washington, DC, Mar. 1997).
M. Kameyama, "Recognizing Referential Links: An Information Extraction Perspective", Computational Linguistics (Jul. 7, 1997).
B. Boguraev and C. Kennedy, "Technical Terminology for Domain Specification and Document Characterization", Information Extraction: A Multi-Disciplinary Approach to an Emerging Information Technology, pp. 73-96 (Lecture Notes in Computer Science Series, Springer-Verlag, Berlin, 1997).
J.S. Justeson and S.M. Katz, "Technical Terminology: Some Linguistic Properties and an Algorithm for Identification in Text", Natural Language Engineering, vol. 1(1), pp. 9-27 (1995).
T. Strzalkowski, "Building Effective Queries in Natural Language Information Retrieval", Proceedings of the Applied Natural Language Processing Conference, pp. 299-306 (Washington, DC, Mar. 1997).
C.-Y. Lin and E. Hovy, "Identifying Topics by Position", Proceedings of the Fifth Conference on Applied Natural Language Processing, Association for Computational Linguistics (Mar. 31-Apr. 3, 1997).
B. Boguraev and C. Kennedy, "Salience-Based Content Characterization of Text Documents", in I. Mani and M. Maybury (eds.), Intelligent Scalable Text Summarization, Proceedings of Workshop Sponsored by the Association for Computational Linguistics (Madrid, Spain, 1997).
Edouard Patrick N.
The Trustees of Columbia University in the City of New York
Method and system for identifying significant topics of a docum