Patent
1998-05-26
2000-08-08
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
Synthesis
704/268, G10L 13/08
active
061014702
ABSTRACT:
A method for automatically generating pitch contours in a text to speech (TTS) system, the system converting input text into an output acoustic signal simulating natural speech, the method comprising the steps of: storing a plurality of associated stress and pitch level pairs, each of the plurality of pairs including a lexical stress level and a pitch level; calculating lexical stress levels of the input text; comparing the stress levels of the input text to the stored stress levels of the plurality of associated stress and pitch level pairs to find the stored stress levels closest to the stress levels of the input text; and copying the pitch levels associated with the closest stored stress levels of the stress and pitch level pairs to generate the pitch contours of the input text. Features illustrative of various modes of the invention include stress and pitch level pairs that correspond with the end of vowels, use of a phonetic dictionary to expand words to phonemes and concatenate stress levels, blocking sentences and the stress contours into constant or variable lengths by segmenting from the ends toward the beginnings, and averaging at the block boundary. The method may distinguish among declarations, questions, and exclamations. Training text may be collected from more than one speaker and scaled; the speaker(s) may wear a laryngograph to provide vocal cord activity.
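The store, compare, and copy steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the inventory values, the squared-difference distance, the fixed block size, and the names `INVENTORY`, `block_from_end`, `closest_pitch`, and `pitch_contour` are all assumptions made for illustration. Each stored pair here holds one stress level and one pitch value per vowel.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[Sequence[int], Sequence[float]]

# Hypothetical training inventory of (stress contour, pitch contour in Hz).
# The numbers are invented for this sketch; they are not from the patent.
INVENTORY: List[Pair] = [
    ([1, 0, 2], [110.0, 95.0, 130.0]),
    ([0, 1, 0], [100.0, 120.0, 90.0]),
    ([2, 0, 0], [140.0, 105.0, 95.0]),
    ([1, 0], [115.0, 90.0]),
    ([0, 2], [95.0, 135.0]),
]


def block_from_end(stress: Sequence[int], size: int) -> List[List[int]]:
    """Split a stress contour into blocks, segmenting from the end backward."""
    blocks = []
    end = len(stress)
    while end > 0:
        start = max(0, end - size)
        blocks.append(list(stress[start:end]))
        end = start
    blocks.reverse()  # restore left-to-right order; a short block comes first
    return blocks


def closest_pitch(block: Sequence[int], inventory: Sequence[Pair]) -> List[float]:
    """Copy the pitch levels of the stored pair whose stress contour is closest.

    Assumption: closeness is the sum of squared stress differences over
    stored contours of the same length as the block.
    """
    candidates = [(s, p) for s, p in inventory if len(s) == len(block)]
    if not candidates:
        raise ValueError(f"no stored contour of length {len(block)}")
    _, pitch = min(
        candidates,
        key=lambda sp: sum((a - b) ** 2 for a, b in zip(sp[0], block)),
    )
    return list(pitch)


def pitch_contour(
    stress: Sequence[int], inventory: Sequence[Pair], size: int = 3
) -> List[float]:
    """Concatenate the copied pitch levels of every block of the input."""
    contour: List[float] = []
    for block in block_from_end(stress, size):
        contour.extend(closest_pitch(block, inventory))
    return contour


if __name__ == "__main__":
    print(pitch_contour([1, 0, 1, 0, 2], INVENTORY))
```

The sketch leaves out the averaging at block boundaries, the scaling across speakers, and the separate handling of declarations, questions, and exclamations that the abstract mentions.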
REFERENCES:
patent: 3704345 (1972-11-01), Coker et al.
patent: 4278838 (1981-07-01), Antonov
patent: 4908867 (1990-03-01), Silverman
patent: 5384893 (1995-01-01), Hutchins
patent: 5536171 (1996-07-01), Javkin et al.
patent: 5758320 (1998-05-01), Asano
patent: 5913193 (1999-06-01), Huang et al.
Xuedong Huang, A. Acero, J. Adcock, Hsiao-Wuen Hon, J. Goldsmith, Jingsong Liu, and M. Plumpe, "Whistler: A Trainable Text-to-Speech System," Proc. Fourth Int. Conf. Spoken Language, 1996. ICSLP 96, vol. 4, pp. 2387-2390, Oct. 3-6, 1996.
Campbell et al., Stress, Prominence, and Spectral Tilt, ESCA Workshop on Intonation: Theory, Models and Applications, Athens, Greece, Sep. 18-20, 1997, pp. 67-70.
Huang et al., Recent Improvements on Microsoft's Trainable Text-to-Speech System - Whistler, 1997 IEEE, pp. 959-962; ICASSP-97, Apr. 21-24.
Donovan et al., Improvements in an HMM-Based Synthesizer, ESCA Eurospeech '95, 4th European Conference on Speech Communication and Technology, Madrid, Sep. 1995, pp. 573-576.
G. David Forney, Jr.; The Viterbi Algorithm, Proceedings of the IEEE, vol. 61, No. 3, Mar. 1973, pp. 268-278.
Donovan Robert E.
Eide Ellen M.
Hudspeth David R.
International Business Machines Corporation
Storm Donald L.
TITLE: Methods for generating pitch and duration contours in a text to
Profile ID: LFUS-PAI-O-1159450