Natural language speech recognition using slot semantic...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S275000

active

06567778

ABSTRACT:

FIELD OF THE INVENTION
This invention relates to the field of interpreting natural language. More particularly, this invention relates to a method and apparatus for processing and interpreting natural language which uses semantic confidence values to improve efficiency.
BACKGROUND OF THE INVENTION
Definitions
The following definitions may be helpful in understanding the background of the invention as it relates to the invention and the discussion outlined below.
Confidence: a measure of a degree of certainty that a system has accurately identified input language. In the preferred embodiment, it is a measure of the degree of perceived acoustic similarity between input speech and an acoustic model of the speech.
Phrase: a sequence of words.
Example: “from Boston”
Grammar rule: a specification of a set of phrases, plus the meaning of those phrases.
Example: (from [(boston ? massachusetts)(dallas ? texas)])
Generates: “from boston”, “from boston massachusetts”, “from dallas”, “from dallas texas”
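The way such a rule generates its phrases can be illustrated with a minimal sketch. The nested-tuple encoding of the rule and the `expand` helper are assumptions made for illustration only; the text does not specify how the grammar is stored or expanded.

```python
from itertools import product

# Hypothetical encoding of the example rule shown above:
# a tuple is a sequence, a list is a set of alternatives,
# and ("?", x) marks x as optional.
RULE = ("from", [("boston", ("?", "massachusetts")),
                 ("dallas", ("?", "texas"))])


def expand(node):
    """Return every phrase (as a list of words) that the node can generate."""
    if isinstance(node, str):
        return [[node]]
    if isinstance(node, list):
        # Alternatives: collect the phrases of each branch in order.
        return [phrase for alt in node for phrase in expand(alt)]
    if len(node) == 2 and node[0] == "?":
        # Optional element: either nothing, or whatever the element generates.
        return [[]] + expand(node[1])
    # Sequence: concatenate one choice from each child, in every combination.
    parts = [expand(child) for child in node]
    return [sum(combo, []) for combo in product(*parts)]


phrases = [" ".join(words) for words in expand(RULE)]
for phrase in phrases:
    print(phrase)
```

Under this encoding the sketch yields the same four phrases listed in the definition, in the same order.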
Grammar: a set of grammar rules.
Edge: a match located by a parser of a grammar rule against a phrase contained in an input sentence.
Example: From the sentence “I want to fly from Boston to Dallas,” a parser could create an edge for the phrase “from Boston” using the grammar rule shown above.
Slot: a predetermined unit of information identified by a natural language interpreter from a portion of the natural language input. For example, from the phrase “from Boston” the natural language interpreter might determine that the “origin” slot is to be filled with the value “BOS” (the international airport code for Boston).
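A slot can be pictured as a named field that the interpreter fills from a matched phrase. The following is a minimal sketch under stated assumptions: the `AIRPORT_CODES` table and the `fill_origin_slot` function are hypothetical, and only the Boston code is taken from the definition above.

```python
# Hypothetical lookup table; only the Boston entry comes from the text.
AIRPORT_CODES = {"boston": "BOS"}


def fill_origin_slot(phrase):
    """Return {"origin": code} if the phrase is "from <known city>", else {}."""
    words = phrase.lower().split()
    if len(words) >= 2 and words[0] == "from":
        city = " ".join(words[1:])
        code = AIRPORT_CODES.get(city)
        if code is not None:
            return {"origin": code}
    # No slot can be filled from this phrase.
    return {}


print(fill_origin_slot("from Boston"))
```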
Parse tree: a set of edges used in constructing a meaning for an entire sentence.
Example:
Sentence
  Subject
    I
  VerbPhrase
    Verb
      want
    complement
      InfVerbPhrase
        to
        fly
        PP
          Preposition
            from
          NP
            Noun
              Boston
        PP
          Preposition
            to
          NP
            Noun
              Dallas
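One way to hold such a parse tree in memory is as nested (label, children) pairs. The sketch below is an assumption for illustration: the nesting reflects one reading of the example listing, and the `leaves` and `find` helpers are hypothetical names, not part of the described system.

```python
# Each node is (label, children); a plain string is a word of the sentence.
TREE = ("Sentence", [
    ("Subject", ["I"]),
    ("VerbPhrase", [
        ("Verb", ["want"]),
        ("complement", [
            ("InfVerbPhrase", [
                "to",
                "fly",
                ("PP", [("Preposition", ["from"]),
                        ("NP", [("Noun", ["Boston"])])]),
                ("PP", [("Preposition", ["to"]),
                        ("NP", [("Noun", ["Dallas"])])]),
            ]),
        ]),
    ]),
])


def leaves(node):
    """Return the words under a node, in sentence order."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def find(node, label):
    """Return every subtree whose label matches, in depth-first order."""
    if isinstance(node, str):
        return []
    node_label, children = node
    found = [node] if node_label == label else []
    for child in children:
        found.extend(find(child, label))
    return found


sentence_words = leaves(TREE)
pp_phrases = [" ".join(leaves(pp)) for pp in find(TREE, "PP")]
print(sentence_words)
print(pp_phrases)
```

Collecting the leaves recovers the words of the sentence, and collecting the PP subtrees recovers the two phrases that carry the origin and destination information.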
THE BACKGROUND DISCUSSION
Natural language interpreters are well known and used for a variety of applications. One common use is for an automated telephone system. It will be apparent to those of ordinary skill in the art that these techniques can be and have been applied to a variety of other uses. For example, one could use such a system to purchase travel tickets, to arrange hotel reservations, to trade stock, or to find a telephone number or extension, among many other useful applications.
As an example, consider a system for use in providing information about commercial passenger air flights. A caller to the system might say “I want to fly from Boston to San Francisco, tomorrow.” This exemplary system requires three pieces of information to identify relevant air flights: the origin city, the destination city and the time of travel. Other systems could require more or less information to complete these tasks depending upon the goals of the system. While the exemplary system uses a speech recognizer to understand the supplied spoken natural language, it could also receive the natural language via other means such as typed input or handwriting recognition.
Using a predetermined grammar with a set of grammar rules, such a system parses the sentence into edges. Each edge represents a particular needed piece or set of information. The sentence can be represented by a parse tree as shown in the definitions above.
In a parsing operation, the system matches grammar rules to the natural language input. For example, one grammar rule can specify that an origin expression is the word “from” or the phrase “out of” followed by a city name. If the natural language input is “I want to fly from Boston to Dallas,” the system will locate the phrase “from Boston” and create a record in its internal data structures that these words match the origin expression grammar rule. This record is sometimes referred to as an edge. Systems look for predetermined grammars within collections of natural language. The system performs the parsing operation in accordance with the grammar as a way of forming and filling the desired edges with information from a natural language input. For example, the natural language interpreter identifies the origin city by seeking any of several origin city words such as <‘from’, ‘starting’, ‘leaving’, ‘beginning’, . . . > related to a city name from a list of cities. If the natural language interpreter finds an origin city word and a city from the list, it will then fill the origin city edge. Similarly, the natural language interpreter identifies the destination city by seeking any of several destination city words such as <‘to’, ‘ending’, ‘arriving’, ‘finishing’, . . . > related to a city name from the list of cities. If the natural language interpreter finds a destination city word and a city from the list, it will then fill the destination city edge. The grammar for the natural language interpreter similarly identifies the desired time of the flight by seeking any of several time words such as <‘o'clock’, ‘morning’, ‘afternoon’, ‘a.m.’, ‘p.m.’, ‘January’, ‘February’, . . . , ‘Monday’, ‘Tuesday’, . . . > related to a number. Using this technique, the natural language interpreter can interpret spoken utterances if they contain the requisite information, regardless of the ordering of the sentence.
Thus, the sentence listed above as “I want to fly from Boston to San Francisco, tomorrow,” will provide the same result as the sentence, “Please book a flight to San Francisco, for flying tomorrow, from Boston.”
If the natural language interpreter is unable to identify the appropriate words, or a related city name, then the parsing will be terminated as unsuccessful. For example, if the caller says, “I want to fly to visit my mother,” the parsing will be unsuccessful. The sentence contains neither an origin city word nor an origin city. Further, even though the natural language interpreter finds a destination city word, it cannot find a city name that it recognizes.
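The matching behavior described in the preceding paragraphs can be sketched as follows. This is a simplified illustration, not the described system: the city list, the `interpret` and `match_city` names, and the treatment of time (only the single word “tomorrow” is recognized) are all assumptions.

```python
import re

# Trigger words taken from the examples in the text.
ORIGIN_WORDS = {"from", "starting", "leaving", "beginning"}
DESTINATION_WORDS = {"to", "ending", "arriving", "finishing"}
# Hypothetical city list and time vocabulary, for illustration only.
CITIES = ["san francisco", "boston", "dallas"]
TIME_WORDS = {"tomorrow"}


def match_city(words, start):
    """Return the city whose words begin at position start, or None."""
    for city in CITIES:
        tokens = city.split()
        if words[start:start + len(tokens)] == tokens:
            return city
    return None


def interpret(sentence):
    """Return the filled slots, or None if the parse is unsuccessful."""
    words = re.findall(r"[a-z']+", sentence.lower())
    slots = {}
    for i, word in enumerate(words):
        if word in ORIGIN_WORDS:
            role = "origin"
        elif word in DESTINATION_WORDS:
            role = "destination"
        else:
            continue
        # A trigger word only counts when a known city follows it.
        city = match_city(words, i + 1)
        if city is not None and role not in slots:
            slots[role] = city
    for word in words:
        if word in TIME_WORDS:
            slots["time"] = word
            break
    # All three pieces of information are required for a successful parse.
    if set(slots) != {"origin", "destination", "time"}:
        return None
    return slots


first = interpret("I want to fly from Boston to San Francisco, tomorrow.")
second = interpret(
    "Please book a flight to San Francisco, for flying tomorrow, from Boston.")
failed = interpret("I want to fly to visit my mother")
print(first, second, failed)
```

In this sketch the two differently ordered requests fill the same three slots, while the request that names no recognizable city returns no result.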
For a natural language interpreter used in conjunction with a speech recognition system, the natural language interpreter is provided the speech recognizer's best determination of each word resulting from the recognition operation. A speech recognizer ‘listens’ to a user's spoken words, determines what those words are and presents those words in a machine format to the natural language interpreter. As part of the recognition operation, each word is assigned a word confidence score, which represents the speech recognizer's confidence in the accuracy of its recognition of that word. Thus, it is generally considered useful to take into account the accent or speech patterns of a wide variety of users. Using the scores for each individual word is not entirely satisfactory because that collection of scores does not relate to the meaning the speaker intends to convey. If a single word has a very low word confidence score, the user may be required to re-enter the request.
In one prior approach, the scores for each of the words are combined into a single composite confidence score for the entire sentence. While this approach solves certain problems associated with using the scores for each word and provides a workable solution, it suffers from several drawbacks.
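A composite score of this kind can be sketched as follows. The text does not say how the word scores are combined, so the arithmetic mean used here is an assumption, as are the `composite_confidence` name and the example scores.

```python
from statistics import mean


def composite_confidence(word_scores):
    """Combine per-word scores into one sentence score (assumed: plain mean)."""
    return mean(score for _word, score in word_scores)


# Hypothetical recognizer output: (word, word confidence score) pairs.
recognized = [("i", 0.9), ("want", 0.9), ("to", 0.9),
              ("fly", 0.9), ("from", 0.9), ("boston", 0.4)]

sentence_score = composite_confidence(recognized)
weakest_word = min(recognized, key=lambda pair: pair[1])
print(sentence_score, weakest_word)
```

Note that the single sentence score does not indicate which word was poorly recognized; that information is only available from the individual word scores.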
The composite confidence score described above is weighted by all the words in the entire sentence. In a long sentence, a speaker might use many words that are in essence unrelated to providing the information that the natural language interpreter needs. For example, suppose the speaker says, “Please help me to arrange a flight tomorrow to visit my friend for their birthday celebration leaving from Minneapolis and arriving in Cleveland.” Assume that the speaker talks clearly, so that almost every word has a very high confidence score. A loud background noise occurs during the speaking of the words “Minneapolis” and “Cl
