Determining similarity between words

Data processing: speech signal processing – linguistics – language – Linguistics

Patent

Details

704/9, G06F 17/27

active

060980338

ABSTRACT:
The present invention provides a facility for determining the similarity between two input words, using the frequencies with which the path patterns found between those words also occur between words known to be synonyms. A preferred embodiment of the facility utilizes a training phase and a similarity determination phase. In the training phase, the facility first identifies, for a number of pairs of synonyms, the most salient semantic relation paths between each pair of synonyms. The facility then extracts from these semantic relation paths their path patterns, each of which comprises a series of directional relation types. The number of times that each path pattern occurs between pairs of synonyms, called the frequency of the path pattern, is counted. In the similarity determination phase, the facility identifies the most salient semantic relation paths between the input words and extracts their path patterns. The facility then averages the frequencies counted in the training phase for the path patterns extracted for the input words in order to obtain a quantitative measure of the similarity between the input words.
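
The two phases described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the most salient semantic relation paths have already been found by some other component, and it represents each path as a list of hypothetical (relation type, direction, next word) steps. The relation names, the example words, and the function names are all illustrative assumptions, and raw counts are used as the frequencies.

```python
from collections import Counter


def path_pattern(path):
    """Reduce a path to its pattern: the series of directional relation types,
    with the intermediate words dropped."""
    return tuple((relation, direction) for relation, direction, _word in path)


def train(paths_per_synonym_pair):
    """Training phase: count how often each path pattern occurs between
    pairs of words known to be synonyms."""
    frequencies = Counter()
    for paths in paths_per_synonym_pair:
        frequencies.update(path_pattern(p) for p in paths)
    return frequencies


def similarity(paths_between_inputs, frequencies):
    """Similarity determination phase: average the training frequencies of
    the path patterns found between the two input words."""
    patterns = [path_pattern(p) for p in paths_between_inputs]
    if not patterns:
        return 0.0
    # Counter returns 0 for patterns never seen between synonyms.
    return sum(frequencies[pat] for pat in patterns) / len(patterns)


# Hypothetical salient paths for two synonym pairs (toy data).
synonym_paths = [
    [  # car / automobile
        [("Hypernym", "->", "vehicle"), ("Hypernym", "<-", "automobile")],
    ],
    [  # sofa / couch
        [("Hypernym", "->", "furniture"), ("Hypernym", "<-", "couch")],
        [("Synonym", "->", "couch")],
    ],
]
frequencies = train(synonym_paths)

# Hypothetical salient paths between the input words dog / cat.
input_paths = [
    [("Hypernym", "->", "animal"), ("Hypernym", "<-", "cat")],
    [("TypicalObject", "<-", "chase"), ("TypicalSubject", "->", "cat")],
]
score = similarity(input_paths, frequencies)
```

In this toy run the first input pattern was seen between two synonym pairs and the second was never seen, so the average is (2 + 0) / 2. A higher score means the input words are linked by patterns that more often link known synonyms.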

REFERENCES:
patent: 5325298 (1994-06-01), Gallant
patent: 5424947 (1995-06-01), Nagao et al.
patent: 5675819 (1997-10-01), Schuetze
patent: 5724594 (1998-03-01), Pentheroudakis
Dagan et al., "Similarity-Based Estimation of Word Cooccurrence Probabilities," in Proceedings of the 32nd Annual Meeting of the ACL, 1994, pp. 272-278.
Dagan et al., "Contextual Word Similarity and Estimation From Sparse Data," in Proceedings of the 31st Annual Meeting of the Assoc. for Computational Linguistics, Columbus, OH, Jun. 22-26, 1993, pp. 164-171.
Resnik, Philip, "Disambiguating Noun Groups With Respect to WordNet Senses," in Proceedings of the Third Workshop on Very Large Corpora, MA, Jun. 30, 1995, pp. 1-16.
Salton, Gerard, and Michael J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill Publishing Co., New York, NY, 1983, entire book.
Sadler, Victor, Working With Analogical Semantics: Disambiguation Techniques in DLT, Foris Publications, Dordrecht, Holland, 1989, entire book.
Wilks et al., "Providing Machine Tractable Dictionary Tools," Machine Translation 5:99-154, 1990.
Hindle, Donald, "Noun Classification From Predicate-Argument Structures," in Proceedings of the 28th Annual Meeting of the ACL, Pittsburgh, PA, Jun. 6-9, 1990, pp. 268-275.
Sato, Satoshi, "Example-Based Machine Translation," in Proceedings of the International Workshop on Fundamental Research for the Future Generation of Natural Language Processing, Kyoto, Japan, Sep. 1991, pp. 1-16.
Sumita, Eiichiro, and Hitoshi Iida, "Experiments and Prospects of Example-Based Machine Translation," in Proceedings of the 29th Annual Meeting of the ACL, 1991, pp. 185-192.
Hearst, Marti A., and Gregory Grefenstette, "Refining Automatically-Discovered Lexical Relations: Combining Weak Techniques for Stronger Results," in Papers From the 1992 AAAI Workshop, Menlo Park, CA, 1992, pp. 64-72.
Furuse, Osamu, and Hitoshi Iida, "An Example-Based Method for Transfer-Driven Machine Translation," in Proc. of the 4th International Conference on Theoretical and Methodological Issues in Machine Translation, Montreal, Quebec, Canada, 1992, pp. 139-150.
Yarowsky, David, "Word-Sense Disambiguation Using Statistical Models of Roget's Categories Trained on Large Corpora," in Proceedings of the 15th Int'l. Conference on Computational Linguistics, Nantes, France, Aug. 23-28, 1992, pp. 454-460.
Brown et al., "Class-Based n-gram Models of Natural Language," Computational Linguistics 18(4):467-479, Dec. 1992.
Tsutsumi, Taijiro, Natural Language Processing: The PLNLP Approach, Kluwer Academic Publishers, Boston, MA, 1993, Chap. 20, "Word-Sense Disambiguation by Examples," pp. 263-272.
Pereira et al., "Distributional Clustering of English Words," in Proceedings of the 31st Annual Meeting of the Assoc. for Computational Linguistics, Columbus, OH, Jun. 22-26, 1993, pp. 183-190.
Kozima, Hideki, and Teiji Furugori, "Similarity Between Words Computed by Spreading Activation on an English Dictionary," in Proceedings of the 6th Conference of the European Chapter of the ACL, Utrecht, The Netherlands, 1993, pp. 232-240.
Braden-Harder, Lisa, Natural Language Processing: The PLNLP Approach, Kluwer Academic Publishers, Boston, MA, 1993, Chap. 19, "Sense Disambiguation Using Online Dictionaries," pp. 247-261.
Utsuro et al., "Thesaurus-Based Efficient Example Retrieval by Generating Retrieval Queries From Similarities," in Proceedings of the 15th International Conference on Computational Linguistics, Kyoto, Japan, Aug. 5-9, 1994, pp. 1044-1048.
Grishman, Ralph, and John Sterling, "Generalizing Automatically Generated Selectional Patterns," in Proceedings of the 15th International Conference on Computational Linguistics, Kyoto, Japan, Aug. 5-9, 1994, pp. 742-747.
Uramoto, Naohiko, "A Best-Match Algorithm for Broad-Coverage Example-Based Disambiguation," in Proceedings of the 15th International Conference on Computational Linguistics, Kyoto, Japan, Aug. 5-9, 1994, pp. 717-721.
Resnik, Philip, "Disambiguating Noun Groupings With Respect to WordNet Senses," in Proceedings of the 3rd Workshop on Very Large Corpora, Boston, MA, Jun. 30, 1995, pp. 1-16.
Agirre, Eneko, and German Rigau, "Word Sense Disambiguation Using Conceptual Density," in Proceedings of COLING 96, 1996, pp. 16-23.
