Constructing Markov models of words from multiple utterances

Electrical audio signal processing systems and devices – One-way audio signal program distribution – Public address system

Patent


Details

G10L 500

Patent

active

047590688

ABSTRACT:
Speech recognition is improved by splitting each feneme string at a consistent point into a left portion and a right portion. The present invention addresses the problem of constructing fenemic baseforms which take into account variations in pronunciation of words from one utterance thereof to another. Specifically, the invention relates to a method of constructing a fenemic baseform for a word in a vocabulary of word segments including the steps of:

(a) transforming multiple utterances of the word into respective strings of fenemes;
(b) defining a set of fenemic Markov model phone machines;
(c) determining the best single phone machine P1 for producing the multiple feneme strings;
(d) determining the best two phone baseform of the form P1 P2 or P2 P1 for producing the multiple feneme strings;
(e) aligning the best two phone baseform against each feneme string;
(f) splitting each feneme string into a left portion and a right portion with the left portion corresponding to the first phone machine of the two phone baseform and the right portion corresponding to the second phone machine of the two phone baseform;
(g) identifying each left portion as a left substring and each right portion as a right substring;
(h) processing the set of left substrings and the set of right substrings in the same manner as the set of feneme strings corresponding to the multiple utterances including the further step of inhibiting further splitting of a substring when the single phone baseform thereof has a higher probability of producing the substring than does the best two phone baseform; and
(k) concatenating the unsplit single phones in an order corresponding to the order of the feneme substrings to which they correspond.
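The recursive split-and-concatenate procedure in steps (c) through (k) can be illustrated with a minimal Python sketch. It is not the patented implementation: the three-phone set `PHONES`, the `PHONE_COST` constant, and all function names are hypothetical. Each phone machine is reduced here to a plain feneme emission distribution, and `PHONE_COST` is an assumed stand-in for the transition probabilities that a real fenemic Markov phone machine would carry. Fenemes are written as single characters so that an utterance is an ordinary string.

```python
import math

# Hypothetical toy phone set: each "phone machine" is only a feneme
# emission distribution (an assumption made for this sketch).
PHONES = {
    "pA": {"A": 0.8, "B": 0.1, "C": 0.1},
    "pB": {"A": 0.1, "B": 0.8, "C": 0.1},
    "pC": {"A": 0.1, "B": 0.1, "C": 0.8},
}
# Assumed fixed log-cost per phone used, standing in for transition
# probabilities; it keeps longer baseforms from being free.
PHONE_COST = math.log(0.5)


def phone_logp(phone, s):
    """Log-probability that one phone produces feneme string s."""
    return PHONE_COST + sum(math.log(PHONES[phone][f]) for f in s)


def best_single(strings):
    """Step (c): best single phone for all strings, with its score."""
    scores = {p: sum(phone_logp(p, s) for s in strings) for p in PHONES}
    best = max(scores, key=scores.get)
    return best, scores[best]


def align(pair, s):
    """Step (e): best cut k so pair[0] emits s[:k] and pair[1] emits s[k:]."""
    best_k, best_lp = None, -math.inf
    for k in range(1, len(s)):  # both portions must be non-empty
        lp = phone_logp(pair[0], s[:k]) + phone_logp(pair[1], s[k:])
        if lp > best_lp:
            best_k, best_lp = k, lp
    return best_k, best_lp


def best_pair(strings, p1):
    """Step (d): best two phone baseform of the form P1 P2 or P2 P1."""
    candidates = [(p1, p2) for p2 in PHONES] + [(p2, p1) for p2 in PHONES]
    scores = {c: sum(align(c, s)[1] for s in strings) for c in candidates}
    best = max(scores, key=scores.get)
    return best, scores[best]


def grow_baseform(strings):
    """Steps (c)-(k): recursively split, then concatenate unsplit phones."""
    p1, single_lp = best_single(strings)
    pair, pair_lp = best_pair(strings, p1)
    if single_lp >= pair_lp:  # step (h): inhibit further splitting
        return [p1]
    cuts = [align(pair, s)[0] for s in strings]
    left = [s[:k] for s, k in zip(strings, cuts)]    # steps (f), (g)
    right = [s[k:] for s, k in zip(strings, cuts)]
    return grow_baseform(left) + grow_baseform(right)  # step (k)


if __name__ == "__main__":
    # Three utterances of the same hypothetical word, as feneme strings.
    print(grow_baseform(["AABBC", "ABBBCC", "AABCC"]))
```

In this sketch a string of length one cannot be split, so its two phone score is negative infinity and the single phone is kept, which guarantees that the recursion terminates. A full system would instead score each candidate baseform with the phone machines' actual transition and output probabilities.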

