Automatically retraining a speech recognition system

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S231000, C704S270000

active

06789062

ABSTRACT:

FIELD OF THE INVENTION
The invention relates generally to speech recognition systems, and relates more specifically to an approach for automatically retraining a speech recognition system.
BACKGROUND OF THE INVENTION
Most speech recognition systems are “trained” for specific applications or contexts. Training a speech recognition system generally involves generating a statistical model for a sample set of speech utterances that are representative of a specific application or context. The sample set of speech utterances is typically referred to as a “training set.” Generating a statistical model for a training set involves two fundamental steps. First, measurements are performed on the training set to generate a body of measurement data that specifies attributes and characteristics of the training set. Some training sets require a large amount of measurement data because of the number and character of speech utterances contained in the training set. Furthermore, a large amount of measurement data is often desirable since the accuracy of statistical models generally increases as the amount of measurement data increases. Human review and confirmation of measurement results is often employed to improve the accuracy of the measurement data, but such review can be very labor intensive and can take a long time.
Second, once the measurement data has been generated, statistical analysis is performed on the measurement data to generate statistical model data that defines a statistical model for the measurement data. The statistical model is a multi-dimensional mathematical representation derived from the training set.
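The two steps described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say which measurements are taken or what form the statistical model has, so the `measure` function (signal energy and zero-crossing count) and the per-dimension mean and variance model in `build_model` are hypothetical choices.

```python
from statistics import fmean, pvariance


def measure(utterance):
    """Step 1 (hypothetical): reduce an utterance, given as a list of
    samples, to a small vector of measurements."""
    energy = fmean(s * s for s in utterance)
    zero_crossings = sum(1 for a, b in zip(utterance, utterance[1:]) if a * b < 0)
    return (energy, float(zero_crossings))


def build_model(measurements):
    """Step 2 (hypothetical): summarise the measurement vectors as a
    per-dimension mean and variance."""
    dims = list(zip(*measurements))
    return {
        "mean": [fmean(d) for d in dims],
        "var": [pvariance(d) for d in dims],
    }


# A toy training set of three short "utterances".
training_set = [
    [0.1, -0.2, 0.3, -0.1],
    [0.2, -0.1, 0.2, -0.3],
    [0.0, 0.1, -0.1, 0.2],
]

measurement_data = [measure(u) for u in training_set]  # step 1
model = build_model(measurement_data)                  # step 2
```

Here `measurement_data` plays the role of the body of measurement data and `model` the role of the statistical model data; a real system would use far richer acoustic features.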
Once a statistical model has been generated, a received speech utterance is evaluated against the statistical model in an attempt to match the received speech utterance to a speech utterance from the training set. Sometimes separate statistical models are used for different applications and contexts to improve accuracy.
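As a rough illustration of evaluating a received utterance against statistical models, the sketch below scores a measurement vector against one model per word and picks the best match. The diagonal Gaussian scoring, the `models` table, and the `recognize` function are assumptions made for the example; the source does not specify the matching procedure.

```python
import math


def log_likelihood(x, mean, var):
    """Log-likelihood of vector x under a diagonal Gaussian (assumed model form)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


# Hypothetical per-word models: (mean vector, variance vector).
models = {
    "yes": ([1.0, 2.0], [0.5, 0.5]),
    "no": ([4.0, 0.0], [0.5, 0.5]),
}


def recognize(x):
    """Return the word whose model gives the received measurements the highest score."""
    return max(models, key=lambda word: log_likelihood(x, *models[word]))


first_match = recognize([1.2, 1.8])
second_match = recognize([3.5, 0.4])
```

Keeping a separate `models` table per application or context would correspond to the separate statistical models mentioned above.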
Statistical models periodically require retraining to account for changes in the applications or contexts for which the statistical models were originally determined. For example, a particular application may use new words or subjects that are not represented in the statistical model for the particular application. As a result, the statistical model may not provide a high level of accuracy with respect to the new words or subjects. Retraining allows the statistical model to reflect the new words or subjects.
Conventional retraining is usually performed in a manual, offline process by supplementing the training data with the new words or subjects and then rebuilding the statistical model from the supplemented training data. One problem with this approach is that manual retraining can be very labor intensive (requiring substantial human supervision) and take a long time to implement. This means that statistical models cannot be quickly updated to recognize changes in utterances. Another problem with conventional retraining techniques is that the amount of measurement data that must be maintained continues to grow over time as the number and size of training sets increase. As a result, the measurement data requires an ever-increasing amount of system resources, e.g., non-volatile storage such as disks, to store the data. For speech recognition systems requiring a large number of statistical models, e.g., for different applications, different users, or different subject matter, the amount of measurement data can be enormous.
Yet another problem with conventional retraining approaches is that new measurement data is often not adequately represented in statistical models. This occurs, for example, during retraining when a relatively small amount of new measurement data is processed with a relatively larger amount of prior measurement data to generate new statistical model data. The relatively larger amount of prior measurement data tends to dilute the effect of the relatively smaller amount of new measurement data. As a result, speech utterances associated with the new measurement data may not be adequately represented in the new statistical model data, resulting in a lower level of accuracy.
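The dilution effect can be seen with a small arithmetic example. The counts and values below are invented for illustration, and the model is reduced to a single pooled mean, which is an assumption rather than anything the source specifies.

```python
def pooled_mean(prior_count, prior_mean, new_count, new_mean):
    """Mean obtained when prior and new measurements are pooled with equal weight."""
    total = prior_count * prior_mean + new_count * new_mean
    return total / (prior_count + new_count)


# 1000 prior measurements centred on 10.0, 50 new measurements centred on 20.0.
diluted = pooled_mean(1000, 10.0, 50, 20.0)
shift_fraction = (diluted - 10.0) / (20.0 - 10.0)
```

With these numbers the pooled mean moves less than five percent of the way from the prior value toward the new one, so the new utterances barely influence the resulting model.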
Based on the foregoing, there is a need for an approach for retraining speech recognition systems that avoids the limitations in the prior approaches.
There is a particular need for a computer-implemented approach for automatically retraining a speech recognition system that requires a reduced amount of human supervision. There is also a need for an approach for retraining a speech recognition system that reduces the amount of prior measurement data that must be maintained.
There is a further need for a retraining approach that addresses the problem of new measurement data dilution.
SUMMARY OF THE INVENTION
The foregoing needs, and other needs and objects that will become apparent from the following description, are achieved by the present invention, which comprises, in one aspect, a method for automatically retraining a speech recognition system. According to the method, prior measurement data that was determined for a prior set of speech utterances is retrieved. New measurement data is determined for a new set of speech utterances. A weighting factor is applied to the new measurement data to generate weighted new measurement data. New statistical model data is generated using the prior measurement data and the weighted new measurement data.
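A minimal sketch of this aspect follows, assuming the measurement data is stored as running sums (count, sum, sum of squares) and the statistical model data is a mean and variance. The `Stats` class, its method names, and the weighting factor of 4.0 are hypothetical; the source does not say how the measurement data is represented or how the factor is chosen.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Hypothetical measurement data: running sums over scalar measurements."""
    count: float
    total: float
    total_sq: float

    def scaled(self, weight):
        # Apply a weighting factor to the measurement data.
        return Stats(self.count * weight, self.total * weight, self.total_sq * weight)

    def merged(self, other):
        return Stats(self.count + other.count,
                     self.total + other.total,
                     self.total_sq + other.total_sq)

    def model(self):
        # Statistical model data: (mean, variance).
        mean = self.total / self.count
        return mean, self.total_sq / self.count - mean ** 2


def measure(values):
    return Stats(len(values), sum(values), sum(v * v for v in values))


prior = measure([10.0] * 8)   # prior measurement data, retrieved from storage
new = measure([20.0, 20.0])   # new measurement data

unweighted = prior.merged(new).model()
weighted = prior.merged(new.scaled(4.0)).model()
```

Without weighting, the mean moves from 10.0 to 12.0; with the new data weighted by 4.0 it moves to 15.0, so the new utterances are better represented in the new statistical model data.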
According to another aspect, a method is provided for automatically retraining a speech recognition system. Prior measurement data that was determined for a prior set of speech utterances is retrieved. New measurement data is determined for a new set of speech utterances. A weighting factor is applied to the prior measurement data to generate weighted prior measurement data. New statistical model data is generated using the weighted prior measurement data and the new measurement data.
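This second aspect, in which the weighting factor is applied to the prior measurement data instead, might look like the sketch below. The `(count, total)` representation, the `retrain` function, and the factor of 0.25 are assumptions made for the example.

```python
def retrain(prior, new_values, prior_weight):
    """Down-weight prior running sums, then add the new measurements unweighted.

    `prior` is a hypothetical (count, total) pair of running sums.
    """
    count, total = prior
    count = count * prior_weight + len(new_values)
    total = total * prior_weight + sum(new_values)
    return (count, total)


stats = (8.0, 80.0)                            # prior data: 8 measurements, mean 10.0
stats = retrain(stats, [20.0, 20.0], 0.25)     # prior data weighted by 0.25
mean = stats[1] / stats[0]
```

A prior weight of 0.25 here gives the same mean as weighting the new data by 4.0, since only the ratio between the two bodies of data matters for the mean.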
According to another aspect, a method is provided for automatically retraining a speech recognition system. A first set of speech utterances is retrieved. Then, first measurement data is determined for the first set of speech utterances. First statistical model data is determined based upon the first measurement data. A statistical model is determined based upon the first statistical model data. A second set of speech utterances is retrieved. Second measurement data is determined for the second set of speech utterances. Second statistical model data is determined based upon the second measurement data. Finally, an updated statistical model is determined using the first statistical model data and the second statistical model data and without using either the first measurement data or the second measurement data.
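One way to picture updating a model from statistical model data alone is to merge two summaries without revisiting the raw measurements. The sketch assumes the statistical model data is a `(count, mean, variance)` triple for scalar measurements; that representation and the `fit` and `combine` functions are hypothetical, not taken from the source.

```python
from statistics import fmean, pvariance


def fit(values):
    """Statistical model data for one set of measurements: (count, mean, variance)."""
    return (len(values), fmean(values), pvariance(values))


def combine(a, b):
    """Merge two sets of statistical model data without the underlying measurements."""
    n1, m1, v1 = a
    n2, m2, v2 = b
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = (n1 * (v1 + (m1 - mean) ** 2) + n2 * (v2 + (m2 - mean) ** 2)) / n
    return (n, mean, var)


first = [1.0, 2.0, 3.0]    # first measurement data
second = [4.0, 6.0]        # second measurement data

first_model = fit(first)
second_model = fit(second)

# From here on only the model data is used; the measurements could be discarded.
updated = combine(first_model, second_model)

# Reference computed from the pooled measurements, for comparison only.
direct = fit(first + second)
```

Because `combine` reproduces the result of refitting on the pooled data, the measurement data does not need to be kept once each model has been fitted, which is the storage saving this aspect is aimed at.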
According to another aspect, a speech recognition system comprises a storage medium and a retraining mechanism communicatively coupled to the storage medium. The retraining mechanism is configured to retrieve prior measurement data determined for a prior set of speech utterances from the storage medium. The retraining mechanism is also configured to determine new measurement data for a new set of speech utterances. The retraining mechanism is further configured to apply a weighting factor to the new measurement data to generate weighted new measurement data. The retraining mechanism is configured to generate new statistical model data using the prior measurement data and the weighted new measurement data.


