Reexamination Certificate
2007-03-27
2007-03-27
Hudspeth, David (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S001000
active
10017408
ABSTRACT:
A method for entity name and jargon term recognition and extraction. An embodiment of the present invention uses a suffix tree data structure to determine frequently occurring phrases. In one embodiment, the text to be analyzed is preprocessed. The text is then separated into clauses, and a suffix tree is created for the text. The suffix tree is used to determine repetitious segments. Unrecognized text fragments that occur with high frequency have a comparably high probability of being a name entity or jargon term. The set of repetitious segments is then filtered to obtain a set of possible entity names and jargon terms.
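The pipeline described in the abstract (split into clauses, index the suffixes, collect repetitious segments, filter) can be illustrated with a minimal Python sketch. This is not the patented implementation. It substitutes a naive word-level suffix trie with a depth cap for a compact suffix tree, and every name in it (`split_clauses`, `build_suffix_trie`, `repeated_segments`, `filter_candidates`, `STOPWORDS`, `max_len`, `min_count`) is hypothetical. The filtering rules shown, dropping segments that start or end with a stopword and segments subsumed by a longer segment with the same count, are assumptions chosen for illustration; the abstract does not specify the filter.

```python
import re

# Hypothetical stopword list; the abstract does not define the filter criteria.
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "we"}


class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # occurrences of the phrase spelled out by the path to this node


def split_clauses(text):
    """Split text on punctuation into clauses, each a list of lowercase word tokens."""
    clauses = re.split(r"[.,;:!?\n]+", text)
    return [c.lower().split() for c in clauses if c.strip()]


def build_suffix_trie(clauses, max_len=5):
    """Insert every suffix of every clause, truncated to max_len words (an assumption)."""
    root = TrieNode()
    for tokens in clauses:
        for start in range(len(tokens)):
            node = root
            for tok in tokens[start:start + max_len]:
                node = node.children.setdefault(tok, TrieNode())
                node.count += 1
    return root


def repeated_segments(root, min_count=2):
    """Return {phrase tuple: count} for every phrase seen at least min_count times."""
    results = {}
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        for tok, child in node.children.items():
            if child.count >= min_count:
                phrase = path + (tok,)
                results[phrase] = child.count
                stack.append((child, phrase))
    return results


def is_subsegment(short, long):
    """True if `short` appears as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def filter_candidates(segments, stopwords=STOPWORDS):
    """Drop segments with a stopword at either edge, and segments that only
    ever occur inside a longer segment with the same count."""
    kept = {}
    for seg, count in segments.items():
        if seg[0] in stopwords or seg[-1] in stopwords:
            continue
        subsumed = any(
            len(other) > len(seg) and c == count and is_subsegment(seg, other)
            for other, c in segments.items()
        )
        if not subsumed:
            kept[seg] = count
    return kept


if __name__ == "__main__":
    sample = ("The suffix tree stores text. A suffix tree finds repeats. "
              "We build a suffix tree quickly.")
    trie = build_suffix_trie(split_clauses(sample))
    print(filter_candidates(repeated_segments(trie)))
```

A production system would more likely use a linear-time suffix tree construction, such as the Ukkonen or McCreight algorithms listed in the references, and would compare candidates against a lexicon to identify unrecognized fragments; the trie above only keeps the counting logic easy to follow.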
REFERENCES:
patent: 5384703 (1995-01-01), Withgott et al.
patent: 5638543 (1997-06-01), Pedersen et al.
patent: 6098034 (2000-08-01), Razin et al.
patent: 7020587 (2006-03-01), Di et al.
patent: 2003/0014448 (2003-01-01), Castellanos et al.
Chien, Lee-Feng. “PAT-Tree-Based Keyword Extraction for Chinese Information Retrieval”, Annual ACM Conference on Research and Development in Information Retrieval, 1997, pp. 50-58.
Yamamoto, M. Church, K. “Using suffix arrays to compute term frequency and document frequency for all substrings in a corpus” Computational Linguistics vol. 27, issue 1, Mar. 2001, pp. 1-30.
Jagadish, H. Ng, R. Srivastava, D. “Substring selectivity estimation” Symposium on Principles of Database Systems pp. 249-260, 1999.
Mark Nelson, Fast String Searching With Suffix Trees, Dr. Dobb's Journal, Aug. 1996.
Hsin-Hsi Chen, et al., Description Of The NTU System Used For MET2, National Taiwan University, Taipei, Taiwan.
Walter Daelemans, et al., Rapid Development Of NLP Modules With Memory-Based Learning, ILK Computational Linguistics, Tilburg University, Tilburg, The Netherlands.
Walter Daelemans, et al., TiMBL: Tilburg Memory-Based Learner, version 5.1, Reference Guide, ILK Technical Report—ILK 04-02, Tilburg University, Dec. 31, 2004, Tilburg, The Netherlands.
Shihong Yu, et al., Description of The Kent Ridge Digital Labs System Used For MUC-7, Kent Ridge Digital Labs, Singapore.
Esko Ukkonen, On-line Construction of Suffix Trees, Algorithmica, University of Helsinki, Finland.
Esko Ukkonen, Constructing Suffix Trees On-line In Linear Time, University of Helsinki, Helsinki, Finland.
Edward M. McCreight, A Space-Economical Suffix Tree Construction Algorithm, Journal of the Association for Computing Machinery, vol. 23, No. 2, Apr. 1976, pp. 262-272.
Hu Zengjian
Zhang Yimin
Zhou Joe F.
Hudspeth David
Intel Corporation
Sked Matthew J
Wu Racheol
Method for extracting name entities and jargon terms using a...