User-cued speech recognition

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S235000, C704S275000

active

06195635

ABSTRACT:

BACKGROUND
This invention relates to computer-implemented speech recognition.
A typical speech recognition system includes a recognizer and a stored vocabulary of words which the recognizer can recognize. The recognizer receives information about utterances, and delivers a corresponding recognized word or string of recognized words drawn from the vocabulary.
Speech recognition systems, which sometimes misrecognize speech, also provide ways for users to correct recognition errors. In a simple case, the user deletes an incorrect word and types a replacement. In some systems, selecting an incorrect word triggers the display of a list of alternative words from which the user may select a replacement. Selecting the incorrect word also may trigger a prompt to the user to speak the misrecognized utterance again, perhaps more slowly and clearly. The misrecognized word is then replaced with the result of recognizing the new utterance spoken by the user.
SUMMARY
In one aspect, recognition of speech by a speech recognizer may be improved by receiving deliberately contiguously repeated spoken utterances corresponding to a speech element, and recognizing fewer instances of the speech element than the number of repeated spoken utterances. The speech element may be, for example, a word, a phrase, or a sentence.
At least one of the repeated spoken utterances may be spoken by a user after misrecognition of another one of the repeated spoken utterances becomes apparent. An instance of the speech element may be recognized for each of the repeated spoken utterances if the speech element is in a predetermined class of speech elements. The predetermined class may include, for example, commands, or speech elements that may properly be repeated in a language recognized by the speech recognizer.
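The collapsing of deliberate repeats, together with the exemption for a predetermined class, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `collapse_repeats`, the `REPEATABLE` set, and its contents are hypothetical, and the sketch assumes each utterance has already been mapped to a text label.

```python
from typing import Iterable, List, Optional, Set

# Hypothetical predetermined class: elements that may legitimately be
# repeated (ordinary words that double in English, or commands).
REPEATABLE: Set[str] = {"very", "had", "that", "page down"}


def collapse_repeats(recognized: Iterable[str],
                     repeatable: Set[str] = REPEATABLE) -> List[str]:
    """Emit fewer instances than utterances for contiguous repeats.

    A contiguous run of the same element is reduced to one instance,
    unless the element belongs to the repeatable class, in which case
    every instance is kept.
    """
    output: List[str] = []
    previous: Optional[str] = None
    for element in recognized:
        key = element.lower()
        if key == previous and key not in repeatable:
            continue  # treat as a deliberate repeat, not a new instance
        output.append(element)
        previous = key
    return output


print(collapse_repeats(["recognize", "recognize", "speech"]))
print(collapse_repeats(["very", "very", "good", "good"]))
```

In this sketch, "recognize recognize speech" yields two elements, while "very very" is preserved because "very" is listed in the assumed repeatable class.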
Prior to receiving the deliberately contiguously repeated spoken utterances, a spoken utterance corresponding to the speech element may be received and misrecognized. The spoken utterance and the repeated spoken utterances may be used to recognize the speech element.
Recognizing the speech element may include identifying possible recognized speech elements for the repeated spoken utterances and selecting one of the possible recognized speech elements as a recognized speech element. Selecting one of the possible recognized speech elements as a recognized speech element may include developing scores for the possible recognized speech elements and selecting as the recognized speech element a possible recognized speech element with an optimal score. Possible recognized speech elements may be identified for a predetermined number of the repeated spoken utterances.
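One way to picture the scoring step is to pool per-utterance candidate scores and pick the best total. The sketch below is an assumption-laden illustration, not the method claimed in the patent: the name `select_element`, the use of summed log-likelihood-style scores (higher is better), the `floor` penalty for a candidate missing from an utterance's list, and the `max_utterances` cap are all hypothetical choices.

```python
from typing import Dict, List


def select_element(candidates_per_utterance: List[Dict[str, float]],
                   floor: float = -20.0,
                   max_utterances: int = 3) -> str:
    """Pick the candidate with the best combined score.

    Each dict maps a possible recognized element to its score for one
    repeated utterance. Only the first `max_utterances` utterances are
    considered (the "predetermined number"). A candidate absent from an
    utterance's list is charged the `floor` score for that utterance.
    """
    considered = candidates_per_utterance[:max_utterances]
    if not considered or not any(considered):
        raise ValueError("at least one scored candidate is required")
    all_candidates = set().union(*considered)
    totals = {
        candidate: sum(scores.get(candidate, floor) for scores in considered)
        for candidate in all_candidates
    }
    # Sorting first makes tie-breaking deterministic (alphabetical).
    return max(sorted(totals), key=totals.get)


utterances = [
    {"wreck": -4.0, "recognize": -5.0},   # first utterance favors "wreck"
    {"recognize": -3.0, "reckon": -6.0},  # the repeat favors "recognize"
]
print(select_element(utterances[:1]))  # first utterance alone
print(select_element(utterances))      # both utterances pooled
```

With these made-up scores, the first utterance alone selects "wreck", while pooling both utterances selects "recognize", since it is the only candidate scored well in both.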
Recognizing the speech element may include applying a recognition process directly to representations of speech utterance waveforms for at least two of the repeated spoken utterances without separately recognizing a speech element for each of the spoken utterances.
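As a rough illustration of recognizing from the combined signal rather than from separate per-utterance results, the toy sketch below averages feature vectors from the repeated utterances and then matches the pooled vector against stored templates. Everything here is assumed for illustration: the name `recognize_jointly`, fixed-length feature vectors, averaging as the pooling rule, and nearest-template matching by Euclidean distance. A practical recognizer would operate on variable-length frame sequences with statistical models.

```python
import math
from typing import Dict, List, Sequence


def recognize_jointly(utterance_features: List[Sequence[float]],
                      templates: Dict[str, Sequence[float]]) -> str:
    """Recognize one element from several repeated utterances at once.

    The feature vectors are averaged component-wise, and a single
    nearest-template decision is made on the pooled vector. No separate
    recognition result is produced for any individual utterance.
    """
    if not utterance_features or not templates:
        raise ValueError("need at least one utterance and one template")
    count = len(utterance_features)
    pooled = [sum(values) / count for values in zip(*utterance_features)]
    return min(sorted(templates),
               key=lambda name: math.dist(pooled, templates[name]))


templates = {"ship": [1.0, 0.0], "chip": [0.0, 1.0]}
noisy = [0.4, 0.6]   # a poorly articulated "ship"
clear = [0.9, 0.1]   # the deliberate repeat
print(recognize_jointly([noisy], templates))         # noisy utterance alone
print(recognize_jointly([noisy, clear], templates))  # both pooled
```

In this made-up example the noisy utterance alone lands nearer "chip", but pooling it with the clearer repeat moves the decision to "ship".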
Among the advantages of the invention are one or more of the following.
Deliberately repeating a word, phrase, or sentence decreases the likelihood that the word, phrase, or sentence will be misrecognized. Time spent by the user correcting recognition errors is decreased as fewer recognition errors occur. Because the time needed to repeat a word, phrase, or sentence is typically small compared to the time required to correct a recognition error, overall recognition time is decreased. Furthermore, because a user must typically stop speaking in order to correct a recognition error, a reduction in the number of recognition errors allows the user to speak naturally for longer periods of time.
Another advantage of the invention is that it provides the user with a degree of interactive control over the accuracy of speech recognition. The user may, for example, increase the likelihood that selected words will be recognized correctly by deliberately repeating the selected words, without interrupting the flow of speech. The likelihood that the selected words will be recognized correctly may increase in proportion to the number of repetitions.
Similarly, another advantage of the invention is that it increases the accuracy of error correction. If a word is misrecognized, the user may repeat the word more than once. The likelihood that the misrecognized word will be replaced with the correct word may increase in proportion to the number of repetitions. This decreases the likelihood that the attempt at error correction will fail, requiring the user to attempt error correction again.
The techniques may be implemented in computer hardware or software, or a combination of the two. However, the techniques are not limited to any particular hardware or software configuration; they may find applicability in any computing or processing environment that may be used for speech recognition. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and nonvolatile memory and/or storage elements), at least one input device, and one or more output devices. Program code is applied to data entered using the input device to perform the functions described and to generate output information. The output information is applied to the one or more output devices.
Each program is preferably implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on a storage medium or device (e.g., CD-ROM, hard disk or magnetic diskette) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
Other features and advantages of the invention will become apparent from the following description, including the drawings, and from the claims.
