System and methods for determining semantic similarity of...

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

active

06810376

ABSTRACT:

COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
The present invention relates to automatic processing of human language. More specifically, the present invention relates to systems and methods for automatically determining the semantic similarity of different sentences to one another.
Consider a sentence-matching machine that can accept an input sentence of human language (for example, a sentence of Chinese-language text) and then automatically find one or more sentences, from a database of sentences, that most closely match the input sentence in some sense. Such a sentence-matching machine would be useful. A lighthearted example of a desired use for a sentence-matching machine is as follows. A person expresses an interesting thought in a sentence and then inputs that sentence into the sentence-matching machine. The person wishes the machine to compare the sentence with sentences from a book of famous quotations to see whether anyone famous has previously expressed the same or a similar thought in a sentence quoted in the book.
Certain types of sentence-matching machines do exist. Unfortunately, the conventional sentence-matching machine is not well suited to perform the task sought by the person in the above-mentioned example. The reason is that different people, or even the same person on different occasions, are likely to express any given idea using different sets of words. The conventional sentence-matching machine would not be good at recognizing semantically similar sentences if the sentences use sufficiently non-overlapping sets of words.
What is needed is a sentence-matching system and associated methods that can make good judgments of semantic similarity among word sets, e.g., sentences, even when word sets that are semantically similar do not necessarily have words in common. What is especially needed is such a sentence-matching system and associated methods that are operative on word sets that include words of Chinese or similar languages. The present invention satisfies these and other needs.
SUMMARY OF THE INVENTION
A system and associated methods determine the semantic similarity of different sentences to one another. A particularly appropriate application of the present invention is to automatic processing of Chinese-language text, for example, for document retrieval.
According to one aspect of the invention, a method for computing the similarity between a first and a second set of words includes identifying a word of the second set of words as being most similar to a word of the first set of words, wherein the word of the second set of words need not be identical to the word of the first set of words; and computing a score of the similarity between the first and second sets of words based at least in part on the word of the second set of words.
According to another aspect of the invention, a system for computing the similarity between a first set of words and a second set of words includes means for identifying a word of the second set of words as being most similar to a word of the first set of words, wherein the word of the second set of words need not be identical to the word of the first set of words; and means for computing a score of the similarity between the first and second sets of words based at least in part on the word of the second set of words.
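The two steps named in the summary (identify the most similar word in the second set, then compute a set-level score from it) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy similarity table, the function names `toy_word_similarity`, `best_match`, and `set_similarity`, and the choice to average the best-match scores over the first set. The summary does not say how word-to-word similarity is measured or how the per-word results are combined.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

WordSim = Callable[[str, str], float]

# Hypothetical word-to-word similarity values in [0, 1]. A real system
# would derive these from a thesaurus or corpus statistics; the summary
# does not specify the source.
TOY_SIMILARITY: Dict[FrozenSet[str], float] = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("fast", "quick")): 0.8,
}


def toy_word_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical words, a table value for known pairs, else 0.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def best_match(word: str, candidates: Sequence[str], sim: WordSim) -> Tuple[str, float]:
    """Identify the candidate most similar to `word`; it need not be identical."""
    return max(((c, sim(word, c)) for c in candidates), key=lambda pair: pair[1])


def set_similarity(first: Sequence[str], second: Sequence[str],
                   sim: WordSim = toy_word_similarity) -> float:
    """Score two word sets from the best match found for each word of `first`.

    Averaging the best-match scores is an illustrative choice only.
    """
    if not first or not second:
        return 0.0
    total = sum(best_match(w, second, sim)[1] for w in first)
    return total / len(first)


if __name__ == "__main__":
    # The two sentences share no words, yet the score is well above zero.
    print(set_similarity(["fast", "car"], ["quick", "automobile"]))
```

Note that this particular score is asymmetric: it measures how well the second set covers the words of the first. Whether the patented method is symmetric is not stated in the summary.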


REFERENCES:
patent: 5297039 (1994-03-01), Kanaegami et al.
patent: 5619709 (1997-04-01), Caid et al.
patent: 5675819 (1997-10-01), Schuetze
patent: 5680511 (1997-10-01), Baker et al.
patent: 5873056 (1999-02-01), Liddy et al.
patent: 6137911 (2000-10-01), Zhilyaev
patent: 6260008 (2001-07-01), Sanfilippo
patent: 6453315 (2002-09-01), Weissman et al.
