Reexamination Certificate
1999-04-20
2002-10-08
Chawan, Vijay B (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S255000, C704S270000, C704S275000, C704S252000, C704S243000
active
06463413
ABSTRACT:
BACKGROUND AND SUMMARY OF THE INVENTION
The present invention relates generally to speech recognition systems, and more particularly, the invention relates to a system for training a speech recognizer for use in a small hardware device.
The marketing of consumer electronic products is very cost sensitive. Reducing the fixed program memory size, the random access working memory size, or the processor speed requirements results in lower-cost, smaller, and more energy-efficient electronic devices. The current trend is to make these consumer products easier to use by incorporating speech technology. Many consumer electronic products, such as personal digital assistants (PDAs) and cellular telephones, offer ideal opportunities to exploit speech technology; however, they also present a challenge in that memory and processing power are often limited within the host hardware device. Considering the particular case of using speech recognition technology for voice dialing in cellular phones, the embedded speech recognizer will need to fit into a relatively small memory footprint.
To economize memory usage, the typical embedded speech recognition system will have a very limited, often static, vocabulary. In this case, condition-specific words, such as names used for dialing a cellular phone, cannot be recognized. In many instances, training the speech recognizer is more costly, in terms of memory required or computational complexity, than the speech recognition process itself. Small low-cost hardware devices that are capable of performing speech recognition may not have the resources to create and/or update the lexicon of recognized words. Moreover, where the processor needs to handle other tasks (e.g., user interaction features) within the embedded system, conventional procedures for creating and/or updating the lexicon may not be able to execute within a reasonable length of time without adversely impacting the other supported tasks.
The present invention addresses the above problems through a distributed speech recognition architecture whereby words and their associated speech models may be added to a lexicon on a fully customized basis. In this way, the present invention achieves three desirable features: (1) the user of the consumer product can add words to the lexicon, (2) the consumer product does not need the resources required for creating new speech models, and (3) the consumer product is autonomous during speech recognition (as opposed to during speech reference training), such that it does not need to be connected to a remote server device.
To do so, the speech recognition system includes a speech recognizer residing on a first computing device and a speech model server residing on a second computing device. The speech recognizer receives speech training data and processes it into an intermediate representation of the speech training data. The intermediate representation is then communicated to the speech model server. The speech model server generates a speech reference model by using the intermediate representation of the speech training data and then communicates the speech reference model back to the first computing device for storage in a lexicon associated with the speech recognizer.
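The division of labor described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the patented implementation: the names `extract_features`, `SpeechModelServer`, and `SpeechRecognizer` are hypothetical, the per-frame mean amplitude merely stands in for whatever intermediate representation a real front-end would compute, and frame-wise averaging stands in for real model training. The point is only the data flow: the device produces a compact intermediate representation, the server turns it into a reference model, and the device stores that model and later recognizes on its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Frame = List[float]  # one feature vector (hypothetical representation)


def extract_features(samples: List[float], frame_size: int = 4) -> List[Frame]:
    """Device side: reduce raw samples to a compact intermediate representation.

    Assumption: mean absolute amplitude per frame is a stand-in for the
    features a real recognizer front-end would compute.
    """
    frames: List[Frame] = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        chunk = samples[start:start + frame_size]
        frames.append([sum(abs(s) for s in chunk) / frame_size])
    return frames


class SpeechModelServer:
    """Second computing device: builds reference models from intermediate data."""

    def build_reference_model(self, utterances: List[List[Frame]]) -> List[Frame]:
        # Assumption: average the utterances frame by frame, truncated to the
        # shortest one. A real server would run a far costlier training step.
        n_frames = min(len(u) for u in utterances)
        dim = len(utterances[0][0])
        return [
            [sum(u[t][d] for u in utterances) / len(utterances) for d in range(dim)]
            for t in range(n_frames)
        ]


@dataclass
class SpeechRecognizer:
    """First computing device: holds the lexicon and recognizes autonomously."""

    lexicon: Dict[str, List[Frame]] = field(default_factory=dict)

    def train_word(self, word: str, recordings: List[List[float]],
                   server: SpeechModelServer) -> None:
        # Only the intermediate representation is sent to the server.
        intermediate = [extract_features(r) for r in recordings]
        # The returned reference model is stored in the local lexicon.
        self.lexicon[word] = server.build_reference_model(intermediate)

    def recognize(self, samples: List[float]) -> Optional[str]:
        """Runs entirely on the device; no server connection is needed."""
        if not self.lexicon:
            return None
        feats = extract_features(samples)

        def distance(model: List[Frame]) -> float:
            # Squared Euclidean distance over the overlapping frames.
            return sum((a - b) ** 2
                       for m, f in zip(model, feats)
                       for a, b in zip(m, f))

        return min(self.lexicon, key=lambda w: distance(self.lexicon[w]))


server = SpeechModelServer()
recognizer = SpeechRecognizer()
recognizer.train_word("alice", [[0.1] * 8, [0.2] * 8], server)
recognizer.train_word("bob", [[0.9] * 8, [1.0] * 8], server)
print(recognizer.recognize([0.95] * 8))
```

In this sketch the server is only involved in `train_word`; `recognize` touches nothing but the local lexicon, which mirrors the requirement that the device be autonomous during recognition.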
For a more complete understanding of the invention, its objects and advantages, refer to the following specification and to the accompanying drawings.
REFERENCES:
patent: 4751737 (1988-06-01), Gerson et al.
patent: 4754326 (1988-06-01), Kram et al.
patent: 4829577 (1989-05-01), Kuroda et al.
patent: 4903305 (1990-02-01), Gillick et al.
patent: 5477511 (1995-12-01), Englehardt
patent: 5488652 (1996-01-01), Bielby et al.
patent: 5497447 (1996-03-01), Bahl et al.
patent: 5684925 (1997-11-01), Morin et al.
patent: 5715367 (1998-02-01), Gillick et al.
patent: 5749072 (1998-05-01), Mazukiewicz et al.
patent: 5806030 (1998-09-01), Junqua
patent: 5822728 (1998-10-01), Applebaum et al.
patent: 5825977 (1998-10-01), Morin et al.
patent: 5839107 (1998-11-01), Gupta et al.
patent: 5850627 (1998-12-01), Gould et al.
patent: 5854997 (1998-12-01), Sukeda et al.
patent: 5864810 (1999-01-01), Digalakis et al.
patent: 5884262 (1999-03-01), Wise et al.
patent: 5950157 (1999-09-01), Heck et al.
patent: 6055498 (2000-04-01), Neumeyer et al.
patent: 6070140 (2000-05-01), Tran
patent: 6266642 (2001-07-01), Franz et al.
Morin, P., T.H. Applebaum, R. Bowman, Y. Zhao, and J.-C. Junqua, “Robust and Compact Multilingual Word Recognizers Using Features Extracted From a Phoneme Similarity Front-End”, 1998.
Applebaum, T.H., P. Morin, and B.A. Hanson, “A Phoneme-Similarity Based ASR Front-End”, 1996, vol. 1, pp. 33-36.
Morin, P. and T.H. Applebaum, “Word Hypothesizer Based on Reliably Detected Phoneme Similarity Regions”, 1995, pp. 897-900.
Applebaum Ted H.
Junqua Jean-Claude
Chawan Vijay B
Harness, Dickey & Pierce, P.L.C.
Matsushita Electric Industrial Co., Ltd.