Reexamination Certificate
2000-04-27
2004-08-24
Abebe, Daniel (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S255000
active
06782362
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to speech recognition. In particular, the present invention relates to the use of segment models to perform speech recognition.
In speech recognition systems, an input speech signal is converted into words that represent the verbal content of the speech signal. This conversion begins by converting the analog speech signal into a series of digital values. The digital values are then passed through a feature extraction unit, which computes a sequence of feature vectors based on the digital values. Each feature vector is typically multi-dimensional and represents a single frame of the speech signal.
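The front end described above can be illustrated with a minimal sketch. It assumes the signal has already been digitized into a list of samples, and it uses two toy features (log energy and zero-crossing rate) purely for illustration; the source does not say which features are computed. The function names, frame length, and hop size are hypothetical choices.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split digital samples into overlapping, fixed-length frames."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def feature_vector(frame):
    """Toy 2-dimensional feature vector for one frame (an assumption;
    the source does not specify the features used)."""
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zero_crossing_rate = crossings / (len(frame) - 1)
    return [log_energy, zero_crossing_rate]

# Hypothetical input: 0.1 s of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
samples = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]

# 200 samples = 25 ms frames, advanced by 80 samples = 10 ms.
frames = frame_signal(samples, frame_len=200, hop=80)
features = [feature_vector(f) for f in frames]
```

Each entry of `features` is one multi-dimensional feature vector representing a single frame, which is the form of input that the models discussed next consume.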
To identify a most likely sequence of words, the feature vectors are applied to one or more models that have been trained using a training text. Typically, this involves applying the feature vectors to a frame-based acoustic model in which a single frame state is associated with a single feature vector.
Recently, however, segment models have been introduced that associate multiple feature vectors with a single segment state. The segment models are thought to provide a more accurate model of large-scale transitions in human speech.
Although current segment models provide improved modeling of large-scale transitions, their training and recognition times are longer than desired. More efficient segment models are therefore needed.
SUMMARY OF THE INVENTION
A method and apparatus determine the likelihood of a sequence of words based in part on a segment model. The segment model includes trajectory expressions formed as the product of a generation matrix and a parameter matrix. The likelihood of the sequence of words is based in part on a segment probability. The segment probability is derived in part by matching the trajectory expressions to a feature vector matrix that contains a sequence of feature vectors for a segment of speech.
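As a rough illustration of a trajectory formed as the product of a generation matrix and a parameter matrix, the sketch below makes several assumptions that the text above does not state: the generation matrix holds powers of normalized time (a polynomial trajectory), and the mismatch between the trajectory and the feature vector matrix is scored with an independent Gaussian per dimension. All function and variable names are hypothetical.

```python
import math

def generation_matrix(num_frames, order):
    """One row per frame; columns are powers of normalized time t in [0, 1].
    The polynomial form is an assumption made for illustration."""
    rows = []
    for i in range(num_frames):
        t = i / (num_frames - 1) if num_frames > 1 else 0.0
        rows.append([t ** k for k in range(order + 1)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def segment_log_prob(features, params, variances):
    """Log-probability of a feature vector matrix (frames x dims) given the
    trajectory Z @ C, assuming independent Gaussian residuals per dimension."""
    z = generation_matrix(len(features), len(params) - 1)
    trajectory = matmul(z, params)
    total = 0.0
    for observed, predicted in zip(features, trajectory):
        for y, mu, var in zip(observed, predicted, variances):
            total += -0.5 * (math.log(2 * math.pi * var) + (y - mu) ** 2 / var)
    return total

# Hypothetical linear trajectory in two feature dimensions.
params = [[1.0, 0.0],    # value at t = 0
          [2.0, -1.0]]   # slope over the segment
on_trajectory = matmul(generation_matrix(3, 1), params)
off_trajectory = [[v + 0.5 for v in row] for row in on_trajectory]

score_match = segment_log_prob(on_trajectory, params, [1.0, 1.0])
score_mismatch = segment_log_prob(off_trajectory, params, [1.0, 1.0])
```

A segment whose feature vectors lie on the trajectory scores higher than one that deviates from it, which is the sense in which matching the trajectory to the feature vector matrix yields a segment probability in this sketch.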
Aspects of the method and apparatus also include training the segment model using such a segment probability.
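One simple way to picture training is to fit the parameter matrix to a single training segment. The sketch below is not the training procedure of the invention, which the text here does not detail. It assumes a polynomial generation matrix and Gaussian residuals with fixed variance, under which the parameter matrix that maximizes the segment probability is the least-squares solution of the normal equations. All names are hypothetical.

```python
def generation_matrix(num_frames, order):
    """One row per frame; columns are powers of normalized time t in [0, 1]
    (assumed polynomial form)."""
    rows = []
    for i in range(num_frames):
        t = i / (num_frames - 1) if num_frames > 1 else 0.0
        rows.append([t ** k for k in range(order + 1)])
    return rows

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.
    a is n x n, b is n x m; the inputs are copied, not modified."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few frames for this order")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [[aug[i][n + j] / aug[i][i] for j in range(len(b[0]))]
            for i in range(n)]

def fit_parameter_matrix(features, order):
    """Least-squares estimate of C from the normal equations
    (Z^T Z) C = Z^T Y, where Y is the feature vector matrix."""
    z = generation_matrix(len(features), order)
    num_frames, dims, cols = len(features), len(features[0]), order + 1
    ztz = [[sum(z[f][i] * z[f][j] for f in range(num_frames))
            for j in range(cols)] for i in range(cols)]
    zty = [[sum(z[f][i] * features[f][d] for f in range(num_frames))
            for d in range(dims)] for i in range(cols)]
    return solve(ztz, zty)

# Hypothetical training segment: four frames lying on a known linear trajectory.
true_params = [[1.0, 0.0], [2.0, -1.0]]
z = generation_matrix(4, 1)
segment = [[sum(row[k] * true_params[k][d] for k in range(2)) for d in range(2)]
           for row in z]
estimated = fit_parameter_matrix(segment, order=1)
```

With noise-free data the estimate recovers the parameters that generated the segment. A practical system would pool many segments and also estimate the variances.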
REFERENCES:
patent: 6009392 (1999-12-01), Kanevsky et al.
patent: 6301561 (2001-10-01), Saul
patent: 6401064 (2002-06-01), Saul
patent: 6542866 (2003-04-01), Jiang et al.
“Probabilistic-trajectory segmental HMMs”, Computer Speech and Language, by Wendy J. Holmes et al., Article No. csla.1998.0048, pp. 3-37 (1999).
“Parametric Trajectory Mixtures for LVCSR”, by Man-hung Siu et al., ICSLP-1998, 4 pages.
“Speech Recognition Using Hidden Markov Models with Polynomial Regression Functions as Nonstationary States”, by Li Deng et al., IEEE Transactions on Speech and Audio Processing, vol. 2, No. 4, pp. 507-520 (Oct. 1994).
“From HMM's to Segment Models: A Unified View of Stochastic Modeling for Speech Recognition”, by Mari Ostendorf et al., IEEE Transactions on Speech and Audio Processing, vol. 4, No. 5, pp. 360-379 (Sep. 1996).
Hon Hsiao-Wuen
Wang Kuansan