Electrical audio signal processing systems and devices – One-way audio signal program distribution – Public address system
Patent
1985-05-29
1988-07-19
Kemeny, Emanuel S.
G10L 500
active
047590688
ABSTRACT:
Speech recognition is improved by splitting each feneme string at a consistent point into a left portion and a right portion. The present invention addresses the problem of constructing fenemic baseforms which take into account variations in pronunciation of words from one utterance thereof to another. Specifically, the invention relates to a method of constructing a fenemic baseform for a word in a vocabulary of word segments including the steps of:
(a) transforming multiple utterances of the word into respective strings of fenemes;
(b) defining a set of fenemic Markov model phone machines;
(c) determining the best single phone machine P1 for producing the multiple feneme strings;
(d) determining the best two phone baseform of the form P1 P2 or P2 P1 for producing the multiple feneme strings;
(e) aligning the best two phone baseform against each feneme string;
(f) splitting each feneme string into a left portion and a right portion with the left portion corresponding to the first phone machine of the two phone baseform and the right portion corresponding to the second phone machine of the two phone baseform;
(g) identifying each left portion as a left substring and each right portion as a right substring;
(h) processing the set of left substrings and the set of right substrings in the same manner as the set of feneme strings corresponding to the multiple utterances including the further step of inhibiting further splitting of a substring when the single phone baseform thereof has a higher probability of producing the substring than does the best two phone baseform; and
(k) concatenating the unsplit single phones in an order corresponding to the order of the feneme substrings to which they correspond.
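The recursive split-and-concatenate procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify the internal structure of a fenemic phone machine, so the sketch assumes a much simpler stand-in: each phone is a single state with a fixed self-loop probability (SELF_LOOP) and a table of feneme output probabilities, and it must emit at least one feneme. The names phone_logprob, best_split, build_baseform, PHONES and UTTERANCES, and all numeric values, are hypothetical.

```python
import math

SELF_LOOP = 0.7  # assumed probability that a phone emits one more feneme


def phone_logprob(phone, string):
    """Log probability that a single phone emits the whole string (at least one feneme)."""
    if not string:
        return -math.inf
    duration = (len(string) - 1) * math.log(SELF_LOOP) + math.log(1 - SELF_LOOP)
    return duration + sum(math.log(phone[f]) for f in string)


def best_split(first, second, string):
    """Align a two-phone baseform against one string: return (log prob, split index)."""
    best = (-math.inf, None)
    for k in range(1, len(string)):
        score = phone_logprob(first, string[:k]) + phone_logprob(second, string[k:])
        if score > best[0]:
            best = (score, k)
    return best


def build_baseform(phones, strings):
    """Steps (c) to (k): pick the best single phone, try two-phone baseforms, split, recurse."""
    def single_score(name):
        return sum(phone_logprob(phones[name], s) for s in strings)

    p1 = max(phones, key=single_score)          # step (c)
    best_total, best_cuts = -math.inf, None
    for p2 in phones:                           # step (d): P1 P2 or P2 P1
        for a, b in ((p1, p2), (p2, p1)):
            aligned = [best_split(phones[a], phones[b], s) for s in strings]  # step (e)
            total = sum(score for score, _ in aligned)
            if total > best_total:
                best_total, best_cuts = total, [k for _, k in aligned]

    if single_score(p1) >= best_total:          # step (h): inhibit further splitting
        return [p1]
    lefts = [s[:k] for s, k in zip(strings, best_cuts)]    # steps (f) and (g)
    rights = [s[k:] for s, k in zip(strings, best_cuts)]
    return build_baseform(phones, lefts) + build_baseform(phones, rights)  # step (k)


# Hypothetical phone set: each phone strongly prefers one of three feneme labels.
PHONES = {
    "A": {1: 0.90, 2: 0.05, 3: 0.05},
    "B": {1: 0.05, 2: 0.90, 3: 0.05},
    "C": {1: 0.05, 2: 0.05, 3: 0.90},
}
# Three utterances of the same word, already converted to feneme strings (step (a)).
UTTERANCES = [[1, 1, 2, 2, 2, 3], [1, 2, 2, 3], [1, 1, 2, 2, 2, 2, 3]]

print(build_baseform(PHONES, UTTERANCES))  # ['A', 'B', 'C']
```

In this toy model the self-loop probability above 0.5 is what makes an extra phone costly, so a substring stops splitting once a second phone no longer improves the fit enough to pay for the additional transition.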
REFERENCES:
patent: 4038503 (1977-07-01), Moshier
patent: 4181821 (1980-01-01), Pirz et al.
patent: 4319085 (1982-03-01), Welch et al.
patent: 4348553 (1982-09-01), Baker et al.
patent: 4481593 (1984-11-01), Bahler
patent: 4513436 (1985-05-01), Nose et al.
patent: 4587670 (1986-05-01), Levinson et al.
patent: 4590605 (1986-05-01), Hataoka et al.
patent: 4593367 (1986-06-01), Slack et al.
patent: 4618983 (1986-10-01), Nishioka et al.
M. Cravero et al. "Phonetic Units for Hidden Markov Models", CSELT Technical Reports, vol. XIV No. 2 Apr. 1986, pp. 121-125.
L. R. Rabiner et al., "Recent Developments in the Application of Hidden Markov Models to Speaker-Independent Isolated Word Recognition", AT&T 1985 article, p. 1214.
H. Boulard et al., "Speaker Dependent Connected Speech Recognition Via Phonemic Markov Models", 1985 IEEE, pp. 1213-1216.
Douglas E. Paul et al., "Training of HMM Recognizers by Simulated Annealing", 1985, IEEE, pp. 13-16.
Yves Kamp et al., "State Reduction in Hidden Markov Chains Used for Speech Recognition", 1985, IEEE, pp. 1138-1145.
"Isolated Word Recognition Using Hidden Markov Models", K. Sugawara, 1985, IEEE, pp. 1-4.
R. Schwartz, "Context-Dependent Modeling for Acoustic-Phonetic Recognition of Continuous Speech", 1985, IEEE, pp. 1205-1208.
J. F. Mari et al., "Speaker Independent Connected Digit Recognition Using Hidden Markov Models", 1985, Speech Tech, pp. 127-132.
R. Schwartz et al., "Improved Hidden Markov Modeling of Phonemes for Continuous Speech Recognition", 1984, IEEE, pp. 35.6.1-35.6.4.
S. E. Levinson et al., "Speaker Independent Isolated Digit Recognition Using Hidden Markov Models", 1983, IEEE, pp. 1049-1052.
Jean-Paul Haton et al., "Problems in the Design and Use of a Connected Speech Understanding System", 1982, IEEE, pp. 1616-1620.
D. M. Choy et al., "Speech Compression by Phoneme Recognition", 1982, IBM TDB, vol. 25, No. 6, pp. 2884-2886.
Bahl, et al., "Interpolation of Estimators Derived from Sparse Data", 1981, IBM TDB, vol. 24, No. 4, pp. 2038-2041.
Bahl, et al., "Faster Acoustic Match Computation", 1980, IBM TDB, vol. 23, No. 4, pp. 1718-1719.
Das, et al., "System for Temporal Registration of Quasi-Phonemic Utterance Representations", Dec., 1980, IBM TDB, vol. 23, No. 7A, pp. 3047-3050.
Bakis et al., "Continuous Speech Recognition Via Centisecond Acoustic States", Apr. 1976, Research Report, pp. 1-8.
Bakis et al., "Spoken Word Spotting Via Centisecond Acoustic States", Mar., 1976, IBM TDB, vol. 18, No. 10, pp. 3479-3481.
Itakura, "Minimum Prediction Residual Principle Applied to Speech Recognition" Feb., 1975, IEEE, pp. 145-150.
Bahl Lalit R.
deSouza Peter V.
Mercer Robert L.
Picheny Michael A.
Block Marc A.
International Business Machines Corporation
Constructing Markov models of words from multiple utterances