Method and system of runtime acoustic unit selection for speech

Data processing: speech signal processing – linguistics – language – Speech signal processing – Synthesis

Patent

Details

704256, G10L 502, G10L 900

active

059131934

ABSTRACT:
The present invention pertains to a concatenative speech synthesis system and method that produces more natural-sounding speech. The system provides multiple instances of each acoustic unit, which can be used to generate a speech waveform representing a linguistic expression. The multiple instances are formed during an analysis or training phase of the synthesis process and are limited to a robust representation of the highest-probability instances. Providing multiple instances enables the synthesizer to select the instance that most closely resembles the desired instance, eliminating the need to alter the stored instance to match it. This in essence minimizes the spectral distortion at the boundaries between adjacent instances, thereby producing more natural-sounding speech.
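
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the search is carried out or how spectral distortion is measured, so everything here is an assumption made for illustration: each stored instance is represented only by hypothetical "start" and "end" feature vectors, distortion is taken to be the Euclidean distance between the end of one instance and the start of the next, and the cheapest sequence is found by dynamic programming. The names select_instances and boundary_distance are invented for this example and do not come from the patent.

```python
import math


def boundary_distance(left, right):
    """Assumed distortion measure: Euclidean distance between the last
    frame of `left` and the first frame of `right`."""
    return math.dist(left["end"], right["start"])


def select_instances(candidates):
    """Pick one stored instance per acoustic unit so that the summed
    boundary distortion between adjacent instances is minimal.

    candidates: one list of instances per unit in the utterance; each
    instance is a dict with "id", "start" and "end" (hypothetical layout).
    Returns (total_cost, list_of_chosen_ids).
    """
    if not candidates:
        return 0.0, []
    # best[j] holds (cost, path) of the cheapest sequence that ends in
    # instance j of the unit processed so far.
    best = [(0.0, [inst["id"]]) for inst in candidates[0]]
    for prev_insts, cur_insts in zip(candidates, candidates[1:]):
        new_best = []
        for cur in cur_insts:
            cost, path = min(
                (
                    (best[i][0] + boundary_distance(prev, cur), best[i][1])
                    for i, prev in enumerate(prev_insts)
                ),
                key=lambda entry: entry[0],
            )
            new_best.append((cost, path + [cur["id"]]))
        best = new_best
    return min(best, key=lambda entry: entry[0])


# Toy inventory: two instances of unit "a", two of "b", one of "c".
candidates = [
    [
        {"id": "a1", "start": [0.0, 0.0], "end": [1.0, 0.0]},
        {"id": "a2", "start": [0.0, 0.0], "end": [5.0, 5.0]},
    ],
    [
        {"id": "b1", "start": [5.0, 5.0], "end": [2.0, 2.0]},
        {"id": "b2", "start": [1.0, 0.5], "end": [9.0, 9.0]},
    ],
    [
        {"id": "c1", "start": [2.0, 2.0], "end": [0.0, 0.0]},
    ],
]
total_cost, chosen = select_instances(candidates)
print(chosen, total_cost)
```

In this toy inventory the sequence a2, b1, c1 is chosen because its boundary frames match exactly, so no stored instance would need to be altered at the joins. A real system would work on richer spectral features and would likely weigh other costs as well; this sketch covers only the boundary-matching idea.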

REFERENCES:
patent: 4748670 (1988-05-01), Bahl et al.
patent: 4759068 (1988-07-01), Bahl et al.
patent: 4783803 (1988-11-01), Baker et al.
patent: 4817156 (1989-03-01), Bahl et al.
patent: 4829577 (1989-05-01), Kuroda et al.
patent: 4866778 (1989-09-01), Baker
patent: 5027406 (1991-06-01), Roberts et al.
patent: 5241619 (1993-08-01), Schwartz et al.
patent: 5349645 (1994-09-01), Zhao
patent: 5621859 (1997-04-01), Schwartz et al.
Nakajima et al., "Automatic Generation of Synthesis Units Based on Context Clustering," ICASSP '88: Acoustics, Speech & Signal Processing Conference, pp. 659-662.
Donovan, E., "Automatic Speech Synthesizer Parameter Estimation using HMMs," ICASSP '95: Acoustics, Speech & Signal Processing Conference, pp. 640-643.
Iwahashi, N. et al., "Concatenative Speech Synthesis by Minimum Distortion Criteria," ICASSP '92: Acoustics, Speech & Signal Processing Conference, pp. II-65-II-68.
Bahl, et al., "A Maximum Likelihood Approach to Continuous Speech Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence; 1983; pp. 308-319.
Lee, Kai-Fu, "Context-Dependent Phonetic Hidden Markov Models for Speaker-Independent Continuous Speech Recognition," IEEE Transactions on Acoustics, Speech and Signal Processing; Apr., 1990; pp. 347-362.
Huang, Xuedong et al., "An Overview of the SPHINX-II Speech Recognition System," Proceedings of ARPA Human Language Technology Workshop; 1993; pp. 1-6.
Huang, X.D., and M. A. Jack, "Semi-continuous hidden Markov models for speech signals," Computer Speech and Language, vol. 3, 1989; pp. 239-251.
Baker, James K., "Stochastic Modeling for Automatic Speech Understanding," Speech Recognition, Editor P.R. Reddy; pp. 297-307.
"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing." ICASSP--93--Speech Processing Volume II of V, Minneapolis Convention Center; Apr. 27-30, 1993; pp. 311-314.
Gelsema et al. (Ed.), "Pattern Recognition in Practice," Proceedings of an International Workshop held in Amsterdam; May 21-23, 1980; pp. 381-402.
Rabiner, Lawrence, and Biing-Hwang Juang, "Fundamentals of Speech Recognition," Prentice Hall Publishers; 1993; Chapter 6; pp. 372-373.
Lee, Kai-Fu et al., "Automatic Speech Recognition--The Development of the SPHINX System," Kluwer Academic Publishers; 1989; pp. 51-62, and 118-126.
Huang, X.D. et al, "Hidden Markov Models for Speech Recognition," Edinburgh University Press; 1990; pp. 210-212.
"Developing NeXTSTEP.TM. Applications," SAMS Publishing; 1995; pp. 118-144.
Itoh et al., "Sub-Phonemic Optimal Path Search for Concatenative Speech Synthesis," ESCA Eurospeech '95, 4th European Conference on Speech Communication and Technology, Madrid; Sep., 1995; pp. 577-580.
Rabiner et al., "High Performance Connected Digit Recognition Using Hidden Markov Models," Proceedings of ICASSP-88, 1988; pp. 320-330.
Moulines, Eric, and Francis Charpentier, "Pitch-Synchronous Waveform Processing Techniques for Text-To-Speech Synthesis Using Diphones," Speech Communications 9; 1990; pp. 453-467.
Breckenridge Pierrehumbert, Janet, "Phonology and Phonetics of English Intonation," Massachusetts Institute of Technology, Sep. 1980, pp. 1-401.
"Development of a Text-To-Speech System for Japanese Based on Waveform Splicing", by Hisashi Sawai et al., 1994 IEEE, pp. I-569-I-572.
"Speech Segment Selection for Concatenative Synthesis Based on Spectral Distortion Minimization", by Naoto Iwahashi et al., IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 76 (a) 1993, Nov., No. 11, Tokyo, JP, pp. 1942-1948.
