Patent
1997-03-31
2000-06-06
Dorvil, Richemond
Data processing: speech signal processing, linguistics, language
Speech signal processing
Synthesis
704/207, G10L 9/00
active
060731001
ABSTRACT:
A method of synthesizing audio signals provides outputs of high subjective quality which retain the semblance of natural origin. Unlike frequency scaling methods, the pitch of a signal can be modified independently of the spectrum envelope. A set of candidate input sections is defined based on input transform-domain signal representations. A match-output transform-domain section is formed using the result of a matching process which compares candidate input sections to a reference section. The reference section for this matching process is defined based on one or more previously formed match-output sections. Main-output transform-domain signal representations are formed based on one or more match-output sections, whereby such main-output transform-domain signal representations can be inverse-transformed and combined with the output time-domain signal. This method is referred to as "Transform-Domain Match-Output Extension" (TDMOX). One embodiment of the invention implements block-transform processing using an FFT algorithm. Matching processes search over ranges of frequency shifts, ranges of time shifts, and ranges of resampling factors. Selections are based on maximum cross-correlation, maximum sum of dot products, and minimum sum of squared differences, respectively. Applications include text-to-speech synthesis, audio editing, musical effects processing, real-time low-delay voice transformation, internet telephony, voice mail, Karaoke, hearing aids, and film animation.
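The matching process described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes real-valued magnitude values as the transform-domain representation, integer bin shifts as the only search dimension, and hypothetical names (`best_candidate`, `cross_correlation`, `neg_sum_squared_diff`) that do not appear in the source. Each candidate input section is compared to a reference section, and the best-scoring candidate is selected.

```python
from typing import Callable, List, Sequence, Tuple

Score = Callable[[Sequence[float], Sequence[float]], float]


def cross_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Unnormalized correlation (dot product) of two equal-length sections."""
    return sum(x * y for x, y in zip(a, b))


def neg_sum_squared_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Negated sum of squared differences, so that a larger value is better."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def best_candidate(
    spectrum: Sequence[float],
    reference: Sequence[float],
    max_shift: int,
    score: Score = cross_correlation,
) -> Tuple[int, List[float]]:
    """Search shifts 0..max_shift and return (shift, section) with the top score.

    Assumption: candidate sections are contiguous slices of `spectrum`
    with the same length as `reference`.
    """
    n = len(reference)
    if n == 0 or len(spectrum) < n:
        raise ValueError("reference must be non-empty and fit inside spectrum")
    last_shift = min(max_shift, len(spectrum) - n)
    best_shift = 0
    best_score = score(spectrum[0:n], reference)
    for shift in range(1, last_shift + 1):
        s = score(spectrum[shift:shift + n], reference)
        if s > best_score:
            best_shift, best_score = shift, s
    return best_shift, list(spectrum[best_shift:best_shift + n])


if __name__ == "__main__":
    spectrum = [0.0, 0.1, 0.0, 1.0, 2.0, 1.0, 0.0, 0.1]
    reference = [1.0, 2.0, 1.0]
    print(best_candidate(spectrum, reference, max_shift=5))
    print(best_candidate(spectrum, reference, 5, neg_sum_squared_diff))
```

Passing the scoring function as a parameter lets the same search loop cover more than one of the selection criteria named in the abstract. The unnormalized correlation used here favors high-energy candidates, so a practical system would likely normalize it; that choice is outside what the abstract specifies.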
REFERENCES:
patent: 4464784 (1984-08-01), Agnello
patent: 4885790 (1989-12-01), McAulay et al.
patent: 4991213 (1991-02-01), Wilson
patent: 5012517 (1991-04-01), Wilson et al.
patent: 5175769 (1992-12-01), Hejna, Jr. et al.
patent: 5504833 (1996-04-01), George et al.
D. W. Griffin and J. S. Lim, "Signal Estimation from Modified Short-Time Fourier Transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, Apr. 1984, vol. ASSP-32, No. 2, pp 236-243.
J. L. Flanagan and R. M. Golden, "Phase Vocoder," Bell System Technical Journal, Nov. 1966, vol. 45, pp 1493-1509.
D. W. Griffin and J. S. Lim, "A New Model-Based Speech Analysis/Synthesis System," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Mar. 1985, vol. 2, pp 513-516.
R. E. Crochiere, "A Weighted Overlap-Add Method of Short-Time Fourier Analysis/Synthesis," IEEE Transactions on Acoustics, Speech, and Signal Processing, Feb. 1980, vol. ASSP-28, No. 1, pp 99-102.
J. Makhoul, "Linear Prediction: A Tutorial Review," Proceedings of the IEEE, Apr. 1975, vol. 63, pp 561-580.
S. Seneff, "System to Independently Modify Excitation and/or Spectrum of Speech Waveform Without Explicit Pitch Extraction," IEEE Transactions on Acoustics, Speech, and Signal Processing, Aug. 1982, vol. ASSP-30, No. 4, pp 566-578.
M. Abe, S. Tamura, and H. Kuwabara, "A New Speech Modification Method By Signal Reconstruction," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr. 1989, pp 592-595.
T. E. Quatieri and R. J. McAulay, "Speech Transformations Based on a Sinusoidal Representation," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Mar. 1985, vol. 2, pp 489-492.
T. E. Quatieri and R. J. McAulay, "Speech Transformations Based on a Sinusoidal Representation," IEEE Transactions on Acoustics, Speech, and Signal Processing, Dec. 1986, vol. ASSP-34, No. 6, pp 1449-1461.
W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, "Numerical Recipes in C, Second Edition," Cambridge University Press, 1992.
L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals," Prentice-Hall, 1978. Chapter 6.
M. Vetterli and J. Kovacevic, "Wavelets and Subband Coding," Prentice-Hall, 1995. Chapter 3.
W. B. Kleijn and K. K. Paliwal (Editors), "Speech Coding and Synthesis," Elsevier, 1995. Chapter 15: E. Moulines, W. Verhelst, "Time-Domain and Frequency-Domain Techniques for Prosodic Modification of Speech."