Multi-stage large vocabulary speech recognition system and...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Application

Reexamination Certificate

Details

C704S270000, C704S251000, C379S088030

active

06751595

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates in general to speech recognition and, more particularly, to interactive speech applications for use in automated assistance systems.
BACKGROUND OF THE INVENTION
Pattern recognition generally, and recognition of patterns in continuous signals such as speech signals, has been a rapidly developing field. A limitation in many applications has been the cost of providing sufficient processing power for the complex calculations often required. This is particularly the case in speech recognition, all the more so when real-time response is required, for example to enable automated directory inquiry assistance, or for control operations based on speech input. Information can be gathered from callers and/or provided to callers, and callers can be connected to appropriate parties within a telephone system. To simulate the speed of response of a human operator, and thus avoid a perception of “unnatural” delays, which can be disconcerting, the spoken input needs to be recognized very quickly after the end of the spoken input.
The computational load varies directly with the number of words or other elements of speech (also referred to as “orthographies”) that are modeled and held in a dictionary database for comparison to the spoken input. The number of orthographies is also known as the vocabulary size of the system. The computational load also varies according to the complexity of the models in the dictionary and how the speech input is processed into a representation ready for comparison to the models. The actual algorithm used to carry out the comparison is also a factor.
Numerous attempts have been made over many years to improve the trade-off between computational load, accuracy of recognition, and speed of recognition. Depending on the size of the vocabularies used, and the size of each model, both the memory requirements and the number of calculations required for each recognition decision may limit the speed/accuracy/cost trade-off. For usable systems having a tolerable recognition accuracy, the computational demands are high. Despite continuous refinements to models, speech input representations, and recognition algorithms, and advances in processing hardware, there remains great demand to improve the above-mentioned trade-off, especially in large vocabulary systems, such as those having more than 100,000 words.
SUMMARY OF THE INVENTION
The present invention is directed to a system and method for speech recognition using multiple processing stages that use different vocabulary databases to improve processing time, efficiency, and accuracy in speech recognition. The entire vocabulary is divided into smaller vocabulary subsets, which are associated with particular keywords. A small vocabulary subset is generated or retrieved based on certain information, such as a calling party's locality. A user is prompted to provide input information, such as the locality in which a business whose phone number is requested is located, in the form of a spoken utterance to the system. If the utterance matches one of the entries in the initial small vocabulary subset, then the utterance is considered to be recognizable. If the utterance is not recognizable when compared to the initial small vocabulary subset, then the utterance is stored for later use. The user is then prompted for a keyword related to another subset of words in which his initial utterance may be found. A vocabulary subset associated with the received keyword is generated or retrieved. The initial stored utterance is then retrieved and compared to the newly loaded vocabulary subset. If the utterance matches one of the entries in the newly loaded vocabulary subset, then the utterance is recognizable. Otherwise, it is determined that the initial utterance was unrecognizable, and the user is prompted to repeat the initial utterance.
The foregoing and other aspects of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.
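The control flow described in the summary can be sketched roughly as follows. This is an illustrative outline only, not the patented implementation: the function names (`match`, `recognize_multistage`), the `ask_keyword` callback, and the use of plain text strings in place of stored audio and acoustic models are all assumptions made for the example. A real system would also have to recognize the spoken keyword itself, which is skipped here.

```python
from typing import Callable, Dict, Optional, Set


def match(utterance: str, vocabulary: Set[str]) -> Optional[str]:
    """Hypothetical stand-in for an acoustic recognizer.

    The "utterance" is plain text here; it counts as recognizable
    only if it appears in the given vocabulary subset.
    """
    normalized = utterance.strip().lower()
    return normalized if normalized in vocabulary else None


def recognize_multistage(
    utterance: str,
    initial_subset: Set[str],
    subsets_by_keyword: Dict[str, Set[str]],
    ask_keyword: Callable[[], str],
) -> Optional[str]:
    """Two-stage lookup following the summary's description.

    Returns the matched entry, or None when the utterance is
    unrecognizable and the caller should prompt the user to repeat it.
    """
    # Stage 1: compare against the initial small subset
    # (for example, one chosen from the calling party's locality).
    hit = match(utterance, initial_subset)
    if hit is not None:
        return hit

    # No match: store the utterance for later use.
    stored_utterance = utterance

    # Prompt the user for a keyword that selects another subset.
    keyword = ask_keyword().strip().lower()
    second_subset = subsets_by_keyword.get(keyword, set())

    # Stage 2: compare the stored utterance to the keyword's subset.
    return match(stored_utterance, second_subset)


# Example data (invented for illustration).
initial = {"springfield", "shelbyville"}
subsets = {"ohio": {"columbus", "dayton"}, "texas": {"austin"}}

first = recognize_multistage("Springfield", initial, subsets, lambda: "ohio")
second = recognize_multistage("Dayton", initial, subsets, lambda: "Ohio")
failed = recognize_multistage("Austin", initial, subsets, lambda: "ohio")
```

In this sketch the first call is resolved by the initial subset, the second falls through to the subset selected by the keyword, and the third returns `None`, which corresponds to prompting the user to repeat the initial utterance. Each stage searches only a small subset rather than the entire vocabulary, which is the point of the approach.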


REFERENCES:
patent: 5797116 (1998-08-01), Yamada et al.
patent: 5839106 (1998-11-01), Bellegarda
patent: 5905773 (1999-05-01), Wong
patent: 5960394 (1999-09-01), Gould et al.
patent: 5987414 (1999-11-01), Sabourin et al.
patent: 6018708 (2000-01-01), Dahan et al.
patent: 6092045 (2000-07-01), Stubley et al.
patent: 6122361 (2000-09-01), Gupta
patent: 6173266 (2001-01-01), Marx et al.
patent: 6195635 (2001-02-01), Wright
patent: 2002/0065657 (2002-05-01), Reding et al.
patent: WO 97/37481 (1997-10-01), None
