Interactive command recognition enhancement system and method

Data processing: speech signal processing – linguistics – language – Speech signal processing – Application

Reexamination Certificate

Details

C704S235000

active

06792408

ABSTRACT:

BACKGROUND
The disclosures herein relate generally to computer based speech recognition interfaces and more particularly to an enhanced interactive command recognition system.
There currently exist a wide variety of commercially available word recognition systems. Examples of such systems include speech, or spoken word, recognition systems and hand-writing recognition systems. Even spell-checking systems can be considered to be a form of word recognition systems in the sense that they find the most probable words in a dictionary that most closely match a string of characters.
The systems of interest attempt to match acoustic input with stored acoustic patterns based on parameters that include frequency, relative amplitude, and duration to identify words in a predetermined vocabulary. In the case of handwriting, input strokes are analyzed based on parameters including height, angle, and spacing.
Building on analysis of the physical input such as that described above (e.g., acoustic patterns for audio input and input strokes for handwritten input), context can be used to enhance these systems. For example, a dictation system that receives the input “the next line of computers to be introduced will include high quality microphones” would simply enter that text as a sentence. If the system received “the next line of computers to be introduced will include high quality microphones next line”, the system would enter “The next line of computers to be introduced will include high quality microphones.” and go to the next line. Note that the first “next line” is an implicit command to enter text while the second “next line” is an explicit command to go to the next line. The difference in response is based on the context in which the words are used. Context (including grammar-based) can also be used to resolve words that sound identical such as to, too, and two.
Building on the concepts of input pattern recognition and context is equivalence of meaning. This concept is useful in command recognition and search engines. The object is to determine that two words or phrases are equivalent. This can be accomplished through lists of equivalent words or phrases. In addition, probabilities that each word/phrase in a list is equivalent to the input word/phrase determined by physical pattern matching can be maintained.
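The list-based equivalence described above can be illustrated with a minimal sketch. The table contents, the probability values, and the names `EQUIVALENTS` and `best_equivalent` are hypothetical and chosen only for illustration; the source does not specify how such lists are stored or queried.

```python
# Hypothetical equivalence lists: each canonical command maps to phrases
# considered equivalent to it, with an assumed probability of equivalence.
EQUIVALENTS = {
    "next line": {"next line": 1.0, "new line": 0.9, "line break": 0.7},
    "delete word": {"delete word": 1.0, "erase word": 0.8},
}


def best_equivalent(recognized):
    """Return (canonical_command, probability) for a recognized phrase.

    Returns None when the phrase appears in no equivalence list.
    """
    candidates = [
        (probabilities[recognized], canonical)
        for canonical, probabilities in EQUIVALENTS.items()
        if recognized in probabilities
    ]
    if not candidates:
        return None
    probability, canonical = max(candidates)
    return canonical, probability
```

Under these assumed values, looking up "new line" would resolve to the canonical command "next line" together with its stored probability, while a phrase absent from every list yields no match.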
A basic distinction can be made between features and techniques in a system that distinguish inputs based on their physical characteristics and those that distinguish inputs based on their meaning. Systems that base their results on techniques that attempt to determine what the input means might be considered a form of natural language interface. Systems combining physical and meaning-based (logical) techniques may be embedded, as in the case of car phones, may be used to implement dictation functions or search engines, may be used purely for command recognition, or may exist as a form of natural language interface to a variety of computer applications. The terms logical and logical techniques will be used to refer to nonphysical, non-pattern-matching or intent/meaning based recognition results and processes.
In any of the above cases, it will be recognized that it is possible to determine a probability that the system has properly determined either the identity, in the case of physical recognition, or the meaning, in the case of logical recognition, of a word or sequence of words, and that this probability may be assigned a numerical value. Examples of such systems are shown and described in U.S. Pat. No. 4,783,803 to Baker et al., U.S. Pat. No. 5,960,394 to Gould et al., and other prior art patents.
Clearly, in the case of a system that combines both physical and logical recognition, there will be instances in which the system cannot identify with sufficient certainty a particular word or series of words entered (spoken or written) by the user, as well as instances in which the system fails to recognize the word or words as a logical entity such as a command. In either case, the feedback to the user available from currently available systems would be something along the lines of “please repeat your statement” or “I think you said . . . ”, leaving the user with no clear understanding of whether the system failed to recognize a word or words individually, or failed to recognize meaning of the word or words.
Therefore, what is needed is a command recognition system that provides more detailed feedback to the user as to why an input was not recognized by the system. For example, a dictation system might receive the input “the fisherman lost his hook end line”. If the acoustic certainty (between “and” and “end”) was low, the system could ask the user to repeat the statement more clearly. If the acoustic certainty of “end” was high, the system might instead ask the user to rephrase the command, because “end line” was not a legal command (the system may disallow it due to ambiguity; it might mean last line of page, last line of document, etc.). The example cited here is based on a dictation system, but even greater benefit would be derived from hands free, eyes free audio command systems or information access systems (e.g., search engines) driven by audio or handwritten input.
SUMMARY
One embodiment, accordingly, provides an interactive command recognition system. In a preferred embodiment, responsive to a user inputting a command, or word string, to the interactive command recognition system, a physical recognition portion of the system performs physical recognition functions on the input word string and assigns to a number of candidate matches for each of the individual words of the command, a physical score based on the probability that the word was properly recognized by the system, and then computes an average A of these scores. Similarly, a logical recognition portion of the system performs recognition functions on the output of the physical recognition portion, assigns to each of its results a score based on the probability that the word is part of a recognized command, and then computes an average B of these scores.
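The computation of the averages A and B can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the several candidate matches for a word are reduced to a single score, so the sketch takes the best candidate per word. The function names and all score values are hypothetical.

```python
from statistics import mean


def best_score_per_word(candidates_per_word):
    """Keep the highest physical score among each word's candidates.

    Assumption: the best candidate represents the word; the source
    does not state how candidate scores are combined.
    """
    return [
        max(score for _candidate, score in candidates)
        for candidates in candidates_per_word
    ]


def average_score(scores):
    """Average a non-empty list of per-word probabilities."""
    if not scores:
        raise ValueError("at least one score is required")
    return mean(scores)


# Hypothetical recognizer output for the two-word command "next line".
physical_candidates = [
    [("next", 0.9), ("text", 0.4)],
    [("line", 0.7), ("lime", 0.3)],
]
# Hypothetical per-word probabilities of belonging to a recognized command.
logical_scores = [0.6, 0.2]

A = average_score(best_score_per_word(physical_candidates))
B = average_score(logical_scores)
```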
These averages A and B can then be used in a variety of manners, depending on the particular implementation of the command recognition system. In one embodiment, if B is greater than a predetermined logical threshold, the command is executed. If B is less than the predetermined logical threshold and A is greater than a predetermined physical threshold, indicating that the words were recognized but the command was not understood by the system, the user is advised to rephrase the command. In contrast, if both A and B are less than the respective thresholds, indicating that neither the words nor the command was understood by the system, the user is advised to repeat the command more clearly.
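The threshold comparison of this embodiment might be sketched as below. The function name `decide`, the returned labels, and the threshold values in the usage note are hypothetical. The source does not say what happens when a score exactly equals its threshold; the sketch treats equality as falling below it.

```python
def decide(a, b, physical_threshold, logical_threshold):
    """Choose a response from average physical score a and logical score b."""
    if b > logical_threshold:
        # The command was understood.
        return "execute"
    if a > physical_threshold:
        # The words were recognized, but not as a command.
        return "rephrase"
    # Neither the words nor the command was understood.
    return "repeat"
```

With assumed thresholds of 0.7 (physical) and 0.5 (logical), an input with a = 0.9 and b = 0.2 would lead to a request to rephrase, whereas a = 0.3 and b = 0.2 would lead to a request to repeat the command more clearly.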
In another embodiment, the averages A and B are weighted using appropriate constants and a sum of the weighted averages is compared to a predetermined threshold. In this embodiment, if the sum of the weighted averages is greater than the predetermined threshold, the command is executed. If the sum of the weighted averages is less than the predetermined threshold, the averages A and B are reweighted using the same or different constants than those used above and a determination is made whether the reweighted average A is greater than the reweighted average B. If so, the user is advised to rephrase the command; otherwise, the user is advised to repeat the command more clearly.
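The weighted variant can be sketched in the same way. The parameter names (`w_a`, `w_b` for the first weighting, `rw_a`, `rw_b` for the reweighting) and the returned labels are hypothetical; the source gives no values for the constants or the threshold, and does not specify the behavior when the sum equals the threshold, which the sketch treats as falling below it.

```python
def decide_weighted(a, b, w_a, w_b, threshold, rw_a, rw_b):
    """Choose a response using weighted averages a (physical) and b (logical)."""
    if w_a * a + w_b * b > threshold:
        # The combined confidence is high enough to act on the command.
        return "execute"
    if rw_a * a > rw_b * b:
        # Physical recognition dominates: the words were likely heard
        # correctly, so the phrasing is the likely problem.
        return "rephrase"
    # Otherwise the words themselves were likely not recognized.
    return "repeat"
```

The reweighting constants allow an implementation to bias the choice between the two prompts, for example favoring a request to repeat when acoustic conditions are known to be poor.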
In yet another embodiment, the input word string is a search request. In this embodiment, a determination is made whether the quality for all matches is less than a Match Quality Threshold (“MQT”). Search engines will frequently provide quality ratings for each of the matches returned to the requester, such as one to five stars or a percentage to indicate the relative quality of the matches. The MQT is a value in similar units that indicates that adequate matches were found for the request. If all results are not less than the MQT, the search results are acceptable and output to the user with standard
