Context dependent phoneme networks for encoding speech...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S254000, C704S256000, C704S275000

active

06182038

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to computer speech recognition.
BACKGROUND OF THE INVENTION
Recent advances in computer hardware and software have allowed computer speech recognition (CSR) to cross the threshold of usability. Systems are now available for high-end personal computers that can be used for large-vocabulary, continuous-speech dictation. To obtain adequate performance, such systems need to be adapted to a specific user's voice and environment of usage. In addition, these systems can only recognize words drawn from a certain vocabulary and are usually tied to a particular language model, which captures the relative probabilities of different sequences of words. Without all of these constraints, it is very difficult to get adequate performance from a CSR system.
In most CSR systems, the user- and environment-specific part, namely the acoustic models, is usually kept separate from the vocabulary and language models. However, because of the above constraints, any application that requires speech recognition needs access to both the user/environment-specific acoustic models and the application-specific vocabulary and language models.
This is a major obstacle to moving CSR systems beyond standalone dictation, to systems where many different users will need to access different applications, possibly in parallel and often over the internet or a local area network (LAN). The reason is that either: (a) each application will have to keep separate acoustic models for each user/environment; or (b) each user will need to maintain separate sets of vocabularies and language models for each application they wish to use. Since acoustic and language models are typically on the order of megabytes to tens of megabytes in size for a medium- to large-vocabulary application, it follows that in either scenario (a) or (b), the system's resources are going to be easily overwhelmed.
One possibility is to store the acoustic models on a different machine from the vocabulary and language models, and connect the machines via a LAN or the internet. However, in either (a) or (b), enormous amounts of network traffic will be generated as megabytes of data are shifted to the target recognizer.
Thus, a need exists for a CSR system that is independent of the vocabulary and language models of an application without sacrificing performance in terms of final recognition accuracy.


REFERENCES:
patent: 5293584 (1994-03-01), Brown et al.
patent: 5475792 (1995-12-01), Stanford et al.
patent: 5515475 (1996-05-01), Gupta et al.
patent: 5535120 (1996-07-01), Chong et al.
patent: 5555344 (1996-09-01), Zunkler
patent: 5615296 (1997-03-01), Stanford et al.
patent: 5621859 (1997-04-01), Schwartz et al.
patent: 5651096 (1997-07-01), Pallakoff et al.
patent: 5715367 (1998-02-01), Gillick et al.
patent: 5745649 (1998-04-01), Lubensky
patent: 5805710 (1998-09-01), Higgins et al.
patent: 5867817 (1999-02-01), Catallo et al.
patent: 5915001 (1999-06-01), Uppaluru
patent: 5960399 (1999-09-01), Barclay et al.
patent: 2230370 (1990-10-01), None
patent: 224023 (1991-07-01), None
patent: WO 98/08215 (1998-02-01), None
“Specialized Language Models for Speech Recognition”, IBM Technical Disclosure Bulletin, vol. 38, No. 2, Feb. 1995, pp. 155-157, XP000502428.
S.J. Young, M.G. Brown, J.T. Foote, G.J.F. Jones and K. Sparck Jones. Acoustic Indexing for Multimedia Retrieval and Browsing. In Proc. ICASSP 97, pp. 1-4, Munich, Germany, Apr. 1997. IEEE.
