Lexical association metric for knowledge-free extraction of...

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C704S001000, C704S010000

active

08078452

ABSTRACT:
A method and system for determining a lexical association of phrasal terms are described. A corpus having a plurality of words is received, and a plurality of contexts including one or more context words proximate to a word in the corpus is determined. An occurrence count for each context is determined, and a global rank is assigned based on the occurrence count. Similarly, a number of occurrences of a word being used in a context is determined, and a local rank is assigned to the word-context pair based on the number of occurrences. A rank ratio is then determined for each word-context pair. A rank ratio is equal to the global rank divided by the local rank for a word-context pair. A mutual rank ratio is determined by multiplying the rank ratios corresponding to a phrase. The mutual rank ratio is used to identify phrasal terms in the corpus.
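The steps in the abstract can be sketched for two-word phrases as follows. This is a minimal illustration, not the patented implementation. It rests on several assumptions: a context is taken to be the single adjacent word with a gap marker ("_ dog" for the word before "dog"), rank 1 goes to the highest count, tied counts share a rank, and the local rank is computed among the contexts seen with that particular word. The function names `dense_ranks` and `bigram_mutual_rank_ratios` are hypothetical.

```python
from collections import Counter, defaultdict


def dense_ranks(counts):
    """Rank keys by count: rank 1 = highest count, ties share a rank (assumption)."""
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of_count = {c: r for r, c in enumerate(distinct, start=1)}
    return {key: rank_of_count[c] for key, c in counts.items()}


def bigram_mutual_rank_ratios(tokens):
    """Return {(left, right): mutual rank ratio} for every adjacent word pair."""
    context_counts = Counter()          # context -> occurrences in the corpus
    pair_counts = defaultdict(Counter)  # word -> context -> occurrences
    bigrams = set()

    for left, right in zip(tokens, tokens[1:]):
        # Context of `left` is "_ right"; context of `right` is "left _".
        for word, ctx in ((left, ("_", right)), (right, (left, "_"))):
            context_counts[ctx] += 1
            pair_counts[word][ctx] += 1
        bigrams.add((left, right))

    global_rank = dense_ranks(context_counts)
    local_rank = {w: dense_ranks(c) for w, c in pair_counts.items()}

    def rank_ratio(word, ctx):
        # Rank ratio = global rank of the context / local rank of the pair.
        return global_rank[ctx] / local_rank[word][ctx]

    # Mutual rank ratio = product of the rank ratios belonging to the phrase.
    return {
        (l, r): rank_ratio(l, ("_", r)) * rank_ratio(r, (l, "_"))
        for (l, r) in bigrams
    }


if __name__ == "__main__":
    corpus = "the hot dog and the hot dog and the cat".split()
    scores = bigram_mutual_rank_ratios(corpus)
    for phrase, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(" ".join(phrase), score)
```

In this toy corpus "hot dog" scores higher than "the hot": the context "the _" is the most frequent context overall, so its global rank is low and it contributes a small rank ratio, whereas "_ dog" and "hot _" are less common globally yet top-ranked for their words.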

