Segment-based apparatus and method for speech recognition by ana

Patent

Details

395 264, 395 262, 395 246, 395 248, 395 249, 395 25, 395 251, G10L 500, G10L 506

active

056257490

ABSTRACT:
Phonetic recognition is provided by capturing the dynamical behavior and statistical dependencies of the acoustic attributes used to represent a subject speech waveform. A segment-based framework is employed. Temporal behavior is modelled explicitly by creating dynamic templates, called tracks, of the acoustic attributes used to represent the speech waveform, and by estimating the acoustic spatio-temporal correlation structure. An error model represents this estimate as the temporal and spatial correlations between the input speech waveform and the track-generated speech segment. Models incorporating these two components (track and error estimation) are created both for phonetic units and for phonetic transitions. Phonetic contextual influences are accounted for by merging context-dependent tracks and pooling error statistics over the different contexts. This allows for a large number of contextual models without compromising the robustness of the statistical parameter estimates. The transition models also supply contextual information.
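
The track-plus-error idea in the abstract can be illustrated with a minimal sketch. It is not the patented method. It assumes a track is the average of time-normalized training segments, that a track "generates" a segment by being linearly stretched to the input's length, and that the error is scored with a diagonal Gaussian. The diagonal Gaussian is a deliberate simplification: it drops the temporal and spatial correlations that the abstract's error model is meant to capture. Context merging, pooled error statistics, and transition models are omitted. All function names and the toy data are hypothetical.

```python
import math

def resample(frames, n):
    """Linearly interpolate a list of feature vectors to exactly n frames."""
    if n == 1 or len(frames) == 1:
        return [list(frames[0]) for _ in range(n)]
    out = []
    for i in range(n):
        pos = i * (len(frames) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(frames) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out

def make_track(segments, n_track=5):
    """Assumed track estimate: mean of the training segments after
    each one is time-normalized to n_track frames."""
    warped = [resample(s, n_track) for s in segments]
    dim = len(warped[0][0])
    return [[sum(w[t][d] for w in warped) / len(warped) for d in range(dim)]
            for t in range(n_track)]

def residual(segment, track):
    """Error between a segment and the track stretched to its length,
    mapped back to the track length so every error vector has one size."""
    synthetic = resample(track, len(segment))
    err = [[x - y for x, y in zip(f, g)] for f, g in zip(segment, synthetic)]
    fixed = resample(err, len(track))
    return [v for frame in fixed for v in frame]

def fit_error_model(segments, track, floor=1e-2):
    """Per-dimension mean and variance of the residuals (diagonal Gaussian,
    an assumption of this sketch; variances are floored for stability)."""
    errs = [residual(s, track) for s in segments]
    n = len(errs)
    cols = list(zip(*errs))
    mean = [sum(c) / n for c in cols]
    var = [max(sum((e - m) ** 2 for e in c) / n, floor)
           for c, m in zip(cols, mean)]
    return mean, var

def log_score(segment, track, error_model):
    """Log-likelihood of the segment's residual under the error model."""
    mean, var = error_model
    err = residual(segment, track)
    return sum(-0.5 * (math.log(2 * math.pi * v) + (e - m) ** 2 / v)
               for e, m, v in zip(err, mean, var))

def classify(segment, models):
    """Pick the label whose (track, error model) pair scores highest."""
    return max(models, key=lambda label: log_score(segment, *models[label]))

# Toy one-dimensional "acoustic attribute" data, purely illustrative.
training = {
    "rise": [[[0.0], [1.0], [2.0]], [[0.0], [0.5], [1.0], [1.5], [2.2]]],
    "fall": [[[2.0], [1.0], [0.0]], [[2.1], [1.5], [1.0], [0.4], [0.0]]],
}
models = {}
for label, segs in training.items():
    track = make_track(segs)
    models[label] = (track, fit_error_model(segs, track))

print(classify([[0.0], [0.7], [1.3], [2.0]], models))
```

Because the track is stretched to the length of each input, segments of different durations are compared against the same template. A fuller model would replace the per-dimension variances with a full covariance over the flattened error vector.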

REFERENCES:
patent: 4994983 (1991-02-01), Landell et al.
patent: 5023911 (1991-06-01), Gerson
patent: 5036539 (1991-07-01), Wrench, Jr. et al.
patent: 5199077 (1993-03-01), Wilcox et al.
patent: 5333236 (1994-07-01), Bahl et al.
John R. Deller, Jr., John G. Proakis, and John H. L. Hansen, "Discrete-Time Processing of Speech Signals", Macmillan, pp. 634-638 (1993).
Thomas W. Parsons, "Voice and Speech Processing", McGraw-Hill, pp. 172-174 (1987).
Digalakis, Vassilios V., "Segment-Based Stochastic Models of Spectral Dynamics for Continuous Speech Recognition", Ph.D. Thesis, Boston University (1992).
Digalakis, Vassilios V., et al., "ML Estimation of a Stochastic Linear System with the EM Algorithm and Its Application to Speech Recognition", IEEE Transactions on Speech and Audio Processing, 1(4):431-442 (Oct. 1993).
Digalakis, Vassilios V., et al., "A Dynamical System Approach to Continuous Speech Recognition", Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp. 289-292 (May 1991).
Digalakis, Vassilios V., et al., "Improvements in the Stochastic Segment Model for Phoneme Recognition", Proceedings 2nd DARPA Workshop on Speech and Natural Language, pp. 332-338 (Oct. 1989).
Digalakis, Vassilios V., et al., "Fast Algorithms for Phone Classification and Recognition Using Segment-Based Models", IEEE Transactions on Signal Processing, pp. 1-31 (Dec. 1992).
Ostendorf, Mari, et al., "Context Modeling with the Stochastic Segment Model", IEEE Transactions on Signal Processing, 40(6):1584-1587 (Jun. 1992).
Ostendorf, Mari, et al., "Continuous Word Recognition Based on the Stochastic Segment Model", DARPA Proceedings on Continuous Speech Recognition Workshop, (Sep. 1992).
Ostendorf, Mari and Roukos, Salim, "A Stochastic Segment Model for Phoneme-Based Continuous Speech Recognition", IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(12):1857-1869 (Dec. 1989).
Roucos, Salim, et al., "Stochastic Segment Modelling Using the Estimate-Maximize Algorithm", Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp. 127-130 (Apr. 1988).
Kannan, Ashvin, et al., "Maximum Likelihood Clustering of Gaussians for Speech Recognition", IEEE Transactions on Speech and Audio Processing (to appear Jul. 1994) (Nov. 1, 1993).
Kannan, Ashvin and Ostendorf, Mari, "A Comparison of Trajectory and Mixture Modeling in Segment-Based Word Recognition", IEEE Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp. 1-4, (1993).
Kimball, Owen, et al., "Context Modeling with the Stochastic Segment Model", IEEE Transactions on Signal Processing, 40(6):1584-1587 (Jun. 1992).
Kimball, Owen and Ostendorf, Mari, "On the Use of Tied-Mixture Distributions", Proceedings ARPA Workshop on Human Language Technology, pp. 102-107 (Mar. 1993).
Davis, Steven B. and Mermelstein, Paul, "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences", IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4):357-366 (Aug. 1980).
Deng, Li, "A Generalized Hidden Markov Model with State-Conditioned Trend Functions of Time for the Speech Signal", Signal Processing, 27:65-78 (1992).
Duda, Richard O. and Hart, Peter E., "Pattern Classification and Scene Analysis", John Wiley & Sons (New York) (1973).
William D. Goldenthal and James R. Glass, "Modelling Spectral Dynamics for Vowel Classification", Proceedings of Eurospeech 93, pp. 289-292, (Berlin, Germany) (Sep. 1993).
Gong, Yifan and Haton, Jean-Paul, "Stochastic Trajectory Modeling for Speech Recognition", Proceedings ICASSP 94, Australia, pp. I-57-I-60 (Mar. 1994).
Robinson, Tony, "Several Improvements to a Recurrent Error Propagation Network Phone Recognition System", pp. 1-11, Technical Report, Cambridge University Engineering Dept, (Sep. 30, 1991).
Russell, Martin, "A Segmental HMM for Speech Pattern Modelling", Proceedings of International Conference on Acoustics, Speech and Signal Processing 93, pp. 499-502, (Minneapolis, MN) (Apr. 1993).
Marcus, Jeffrey N., "Phonetic Recognition in a Segment-Based HMM", Proceedings International Conference on Acoustics, Speech and Signal Processing, pp. 478-482 (Minneapolis, MN) (Apr. 1993).

Profile ID: LFUS-PAI-O-713299
