Speech synthesis using concatenation of speech waveforms

Data processing: speech signal processing – linguistics – language – Speech signal processing – Synthesis

Reexamination Certificate

Details

C704S260000

Reexamination Certificate

active

10724659

ABSTRACT:
A high quality speech synthesizer in various embodiments concatenates speech waveforms referenced by a large speech database. Speech quality is further improved by speech unit selection and concatenation smoothing.
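
As a rough illustration of the two ideas the abstract names (speech unit selection and concatenation smoothing), the sketch below picks one candidate unit per target position with a dynamic-programming search and then joins two waveforms with a linear crossfade. It is not taken from the patent: the function names `select_units` and `crossfade_concat`, the `(label, start_pitch, end_pitch)` unit layout, and both cost functions are hypothetical choices made only for this example.

```python
def select_units(targets, candidates, target_cost, join_cost):
    """Pick one candidate per target so that the summed target cost
    and join cost along the sequence is minimal (Viterbi-style search)."""
    # best[i][j] = (cheapest total cost ending in candidate j at position i,
    #               index of the predecessor candidate at position i - 1)
    best = [[(target_cost(targets[0], c), None) for c in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for c in candidates[i]:
            costs = [best[i - 1][k][0] + join_cost(p, c)
                     for k, p in enumerate(candidates[i - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row.append((costs[k_min] + target_cost(targets[i], c), k_min))
        best.append(row)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]


def crossfade_concat(a, b, overlap):
    """Join two sample sequences, linearly crossfading `overlap` samples
    at the end of `a` with the same number at the start of `b`."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return list(a) + list(b)
    head = list(a[:len(a) - overlap])
    tail = list(b[overlap:])
    mixed = []
    for n in range(overlap):
        w = (n + 1) / (overlap + 1)  # weight of b rises across the overlap
        mixed.append((1 - w) * a[len(a) - overlap + n] + w * b[n])
    return head + mixed + tail


# Hypothetical units: (label, pitch at start in Hz, pitch at end in Hz).
targets = [100, 110]  # desired mean pitch per position (assumed feature)
candidates = [
    [("a1", 100, 100), ("a2", 102, 120)],
    [("b1", 110, 110), ("b2", 121, 105)],
]

def pitch_target_cost(target, unit):
    # Distance between the desired pitch and the unit's mean pitch.
    return abs(target - (unit[1] + unit[2]) / 2)

def pitch_join_cost(prev_unit, unit):
    # Pitch discontinuity at the boundary between two units.
    return abs(prev_unit[2] - unit[1])

chosen = select_units(targets, candidates, pitch_target_cost, pitch_join_cost)
print([u[0] for u in chosen])
print(crossfade_concat([0.0, 0.5, 1.0, 1.0], [1.0, 1.0, 0.5, 0.0], 2))
```

A real system would compare many more features than pitch and would typically align the join pitch-synchronously; the linear crossfade here is only the simplest form of smoothing.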

REFERENCES:
patent: 5153913 (1992-10-01), Kandefer et al.
patent: 5384893 (1995-01-01), Hutchins
patent: 5479564 (1995-12-01), Vogten et al.
patent: 5490234 (1996-02-01), Narayan
patent: 5611002 (1997-03-01), Vogten et al.
patent: 5630013 (1997-05-01), Suzuki et al.
patent: 5749064 (1998-05-01), Pawate et al.
patent: 5774854 (1998-06-01), Sharman
patent: 5913193 (1999-06-01), Huang et al.
patent: 5920840 (1999-07-01), Satyamurti et al.
patent: 5978764 (1999-11-01), Lowry et al.
Black, Alan W., et al, “CHATR: a generic speech synthesis system”, in Proceedings of Coling 94, Kyoto, Japan.
Campbell, Nick, “Processing a Speech Corpus for Synthesis with Chatr”, ICSP '97 (International Conference on Speech Processing), Seoul, Korea Aug. 26, 1997.
Banga, Eduardo R., et al, “Shape-Invariant Pitch-Synchronous Text-to-Speech Conversion”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, 1995, pp. 656-659.
Black, Alan W., et al. “Automatically Clustering Similar Units for Unit Selection in Speech Synthesis”, Proceedings of Eurospeech 97, Sep. 1997, pp. 601-604, Rhodes, Greece.
Black, Alan W., et al “Optimising Selection of Units from Speech Databases for Concatenative Synthesis”, European Conference on Speech Communication and Technology, Madrid, Sep. 1995, pp. 581-584.
Campbell, Nick, et al “Chatr: A Natural Speech Re-Sequencing Synthesis System”.
Charpentier, F. J., et al “Diphone Synthesis Using an Overlap-Add Technique for Speech Waveforms Concatenation”, IEEE, 1986, pp. 2015-2018.
Conkie, Alistair D. “Optimal Coupling of Diphones”, in J.P.H. van Santen, et al, editors, Progress in Speech Synthesis, Springer Verlag, 1997, pp. 293-304.
Ding, Wen, et al “Optimising Unit Selection with Voice Source and Formants in the Chatr Speech Synthesis System”, Proceedings of Eurospeech 97, Sep. 1997, pp. 537-540, Rhodes, Greece.
Dutoit, T., “High Quality Text-to-Speech Synthesis: A Comparison of Four Candidate Algorithms”, IEEE, 1994, pp. I-565-I-568.
Edgington, M., “Investigating the Limitations of Concatenative Synthesis”, Eurospeech, 1997, pp. 1-4.
Edgington, M., et al, “Overview of Current Text-to-Speech Techniques: Part II—Prosody and Speech Generation”, BT Technology Journal, vol. 14, No. 1, Jan. 1996, pp. 84-99.
Hamdy, Khaled N., et al “Time-Scale Modification of Audio Signals with Combined Harmonic and Wavelet Representations”, Proceedings of ICASSP 97, pp. 439-442, Munich, Germany.
Hauptmann, Alexander G. “Speakez: A First Experiment in Concatenation Synthesis from a Large Corpus”, Proceedings of Eurospeech 93, Sep. 1993, pp. 1701-1705, Berlin, Germany.
Hess, Wolfgang J. “Speech Synthesis—A Solved Problem?”, Signal Processing, Elsevier Science Publishers B.V., 1992.
Hirokawa, Tomohisa, et al, “High Quality Speech Synthesis System Based on Waveform Concatenation of Phoneme Segment”, IEICE Trans. Fundamentals, vol. E76-A, No. 11, Nov. 1993, pp. 1964-1970.
Huang, X., et al “Recent Improvements on Microsoft's Trainable Text-to-Speech System—Whistler”, Proceedings of ICASSP '97, Apr. 1997, pp. 959-962, Munich, Germany.
Hunt, Andrew J., et al “Unit Selection in a Concatenative Speech Synthesis System Using a Large Speech Database”, IEEE International Conference on Acoustics, Speech and Signal Processing Conference Proceedings, May 1996, vol. 1, pp. 373-376.
Iwahashi, Naoto, et al “Concatenative Speech Synthesis by Minimum Distortion Criteria”, IEEE, 1992, pp. II-65-II-68.
Iwahashi, Naoto, et al “Speech Segment Network Approach for Optimization of Synthesis Unit Set”, Computer Speech and Language, 1995, pp. 335-352.
King, Simon, et al “Speech Synthesis Using Non-Uniform Units in the Verbmobil Project”, Proceedings of Eurospeech '97, Europress, 97, Sep. 1997, pp. 569-572, Rhodes, Greece.
Klatt, Dennis H., “Review of Text-to-Speech Conversion for English”, Journal of the Acoustical Society of America, 82 (3), Sep. 1987, pp. 737-793.
Lee, Sungjoo, et al “Variable Time-Scale Modification of Speech Using Transient Information”, Proceedings of ICASSP '97, Apr. 1997, pp. 1319-1322, Munich, Germany.
Lin, Gang-Janp, et al “High Quality and Low Complexity Pitch Modification of Acoustic Signals”, IEEE, 1995, pp. 2987-2990.
Kraft, Volker, “Does the Resulting Speech Quality Improvement Make a Sophisticated Concatenation of Time-Domain Synthesis Units Worthwhile?”, Proc. 2nd ESCA/IEEE Workshop on Speech Synthesis, 1994, pp. 65-68.
Laroche, Jean, et al, “HNS: Speech Modification Based on a Harmonic+Noise Model”, IEEE, 1993, pp. II-550-II-553.
Moulines, E., et al, “A Real-Time French Text-to-Speech System Generating High-Quality Synthetic Speech”, International Conference on Acoustics, Speech & Signal Processing, ICASSP, IEEE, 1990, vol. 15, pp. 309-312.
Nakajima, Shin'ya, “Automatic Synthesis Unit Generation for English Speech Synthesis Based on Multi-Layered Context Oriented Clustering”, Speech Communication, vol. 14, 1994, pp. 313-324.
Portele, Thomas, et al, “A Mixed Inventory Structure for German Concatenative Synthesis”, Progress in Speech Synthesis, J.P.H. van Santen, et al, editors, Springer Verlag, 1997, pp. 263-277.
Quatieri, T.F., et al “Time-Scale Modification of Complex Acoustic Signals”, IEEE, 1993, pp. I-213-216.
Rudnicky, Alexander I., et al, “Survey of Current Speech Technology”, Communications of the ACM, vol. 37, No. 3, Mar. 1994, pp. 52-57.
Sagisaka, Yoshinori, “Speech Synthesis by Rule Using an Optimal Selection of Non-Uniform Synthesis Units”, IEEE, 1988, pp. 679-682.
Saito, Takashi, et al, “High-Quality Speech Synthesis Using Context-Dependent Syllabic Units”, Proceedings of ICASSP '96, May 1996, pp. 381-384, Atlanta, Georgia.
Verhelst, Werner, et al, “An Overlap-Add Technique Based on Waveform Similarity (WSOLA) for High Quality Time-Scale Modification of Speech”, IEEE, 1993, pp. II-554-II-557.
Yim, S., et al, “Computationally Efficient Algorithm for Time Scale Modification GLS-TSM”, Proceedings of ICASSP '96, May 1996, pp. 1009-1012, Atlanta, Georgia.
