Method for extracting name entities and jargon terms using a...

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C704S001000

active

10017408

ABSTRACT:
A method for entity name and jargon term recognition and extraction. An embodiment of the present invention uses a suffix tree data structure to determine frequently occurring phrases. In one embodiment, the text to be analyzed is preprocessed. The text is then separated into clauses, and a suffix tree is created for the text. The suffix tree is used to determine repetitious segments. Unrecognized text fragments that occur with high frequency have a comparatively high probability of being a name entity or jargon term. The set of repetitious segments is then filtered to obtain a set of possible entity names and jargon terms.
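
The pipeline in the abstract (split into clauses, index suffixes, collect repetitious segments, filter) can be illustrated with a minimal Python sketch. This is not the patented implementation: it swaps the compressed suffix tree for a naive word-level suffix trie with a depth cap, and the clause-splitting rule, the `max_len` and `min_count` parameters, the `known_words` lexicon, and all function names are assumptions made for illustration only.

```python
import re


def split_clauses(text):
    # Assumption: a clause ends at punctuation; the abstract does not give the rule.
    parts = re.split(r"[.,;:!?]+", text.lower())
    return [part.split() for part in parts if part.strip()]


class Node:
    def __init__(self):
        self.children = {}
        self.count = 0  # number of times the phrase ending at this node occurs


def build_suffix_trie(clauses, max_len=4):
    # Naive stand-in for a suffix tree: insert every suffix of every clause,
    # truncated to max_len words (a hypothetical cap to bound memory use).
    root = Node()
    for words in clauses:
        for start in range(len(words)):
            node = root
            for word in words[start:start + max_len]:
                node = node.children.setdefault(word, Node())
                node.count += 1
    return root


def repeated_segments(root, min_count=2):
    # Collect every phrase that occurs at least min_count times.
    # Counts never increase with depth, so low-count branches can be skipped.
    found = {}
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for word, child in node.children.items():
            if child.count >= min_count:
                phrase = prefix + (word,)
                found[phrase] = child.count
                stack.append((phrase, child))
    return found


def filter_candidates(segments, known_words):
    # Keep only segments containing at least one word missing from the
    # lexicon, i.e. the "unrecognized" fragments the abstract refers to.
    return {
        " ".join(phrase): count
        for phrase, count in segments.items()
        if any(word not in known_words for word in phrase)
    }


text = ("The zorblat engine starts fast. We tested the zorblat engine today, "
        "and the zorblat engine passed.")
known_words = {"the", "engine", "we", "tested", "and",
               "starts", "fast", "today", "passed"}
trie = build_suffix_trie(split_clauses(text))
candidates = filter_candidates(repeated_segments(trie), known_words)
```

In this toy input, the made-up word "zorblat" is the only unrecognized token, so every surviving candidate contains it. A production system would presumably use a linear-time suffix tree construction such as those in the references below, together with a real lexicon and a stronger filter.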

REFERENCES:
patent: 5384703 (1995-01-01), Withgott et al.
patent: 5638543 (1997-06-01), Pedersen et al.
patent: 6098034 (2000-08-01), Razin et al.
patent: 7020587 (2006-03-01), Di et al.
patent: 2003/0014448 (2003-01-01), Castellanos et al.
Chien, Lee-Feng. “PAT-Tree-Based Keyword Extraction for Chinese Information Retrieval”, Annual ACM Conference on Research and Development in Information Retrieval, 1997, pp. 50-58.
Yamamoto, M. Church, K. “Using suffix arrays to compute term frequency and document frequency for all substrings in a corpus” Computational Linguistics vol. 27, issue 1, Mar. 2001, pp. 1-30.
Jagadish, H. Ng, R. Srivastava, D. “Substring selectivity estimation” Symposium on Principles of Database Systems pp. 249-260, 1999.
Mark Nelson, Fast String Searching With Suffix Trees, Dr. Dobb's Journal, Aug. 1996.
Hsin-Hsi Chen, et al., Description Of The NTU System Used For MET2, National Taiwan University, Taipei, Taiwan.
Walter Daelemans, et al., Rapid Development Of NLP Modules With Memory-Based Learning, ILK Computational Linguistics, Tilburg University, Tilburg, The Netherlands.
Walter Daelemans, et al., TiMBL: Tilburg Memory-Based Learner, version 5.1, Reference Guide, ILK Technical Report—ILK 04-02, Tilburg University, Dec. 31, 2004, Tilburg, The Netherlands.
Shihong Yu, et al., Description of The Kent Ridge Digital Labs System Used For MUC-7, Kent Ridge Digital Labs, Singapore.
Esko Ukkonen, On-line Construction of Suffix Trees, Algorithmica, University of Helsinki, Finland.
Esko Ukkonen, Constructing Suffix Trees On-line In Linear Time, University of Helsinki, Helsinki, Finland.
Edward M. McCreight, A Space-Economical Suffix Tree Construction Algorithm, Journal of the Association for Computing Machinery, vol. 23, No. 2, Apr. 1976, pp. 262-272.


Profile ID: LFUS-PAI-O-3750349
