Patent
1993-02-18
1998-10-20
MacDonald, Allen R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
Synthesis
704258, 704265, G10L 900
active
058262326
DESCRIPTION:
BRIEF SUMMARY
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method for voice synthesis.
2. Discussion of the Background
Among the numerous fields of application of voice synthesis, some, such as interactive control appliances (control of vehicles, of industrial processes, etc.), require only the synthesis of simple messages (isolated words or predetermined phrases). In such applications, the aim is to minimise the cost of the voice synthesis device. The cost can be reduced essentially by using mass-produced circuits and by reducing the memory capacity needed to store the messages.
In order to reduce this memory capacity, the prior art calls on various types of coding. Among the most widely used is time coding, which associates a binary code with the amplitude of the signal at discrete instants; more precisely, it is the difference between the signal and its predictable component that is stored in memory (differential coding). Recourse is also had to coding the speech by analysis and synthesis, in which only a very few significant parameters are stored (devices known as "channel vocoders" or "linear prediction vocoders"). Finally, a method is known which results from combining the two above-mentioned methods: the "adaptive predictive vocoder" or "voice excitation vocoder", in particular coding in sub-bands.
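The differential coding mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible predictor (the previously reconstructed sample) and a uniform quantizer with a fixed step; the function names `dpcm_encode` and `dpcm_decode` and the step value are illustrative choices, not taken from the patent.

```python
import numpy as np

def dpcm_encode(x, step):
    """Store the quantized difference between each sample and its prediction.

    Assumption: the prediction is simply the previously reconstructed sample.
    """
    codes = np.empty(len(x), dtype=np.int64)
    prediction = 0.0
    for i, sample in enumerate(x):
        # Quantize the prediction error, not the sample itself.
        codes[i] = int(np.round((sample - prediction) / step))
        # Track what the decoder will reconstruct, so errors do not accumulate.
        prediction += codes[i] * step
    return codes

def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the stored differences."""
    return np.cumsum(codes) * step

# Hypothetical test signal: a slowly varying sine wave.
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 50.0)
step = 0.05
codes = dpcm_encode(signal, step)
decoded = dpcm_decode(codes, step)
```

Because consecutive samples of a slowly varying signal are close to one another, the stored codes stay small and can be represented with fewer bits than the raw amplitudes.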
In the case of sub-band coding, which is coding in the frequency domain, the spectrum of the signal to be coded is broken up into a certain number of sub-bands of width $B_k$ (equal to each other or otherwise). Each sub-band (of index k) is next resampled at the Shannon frequency, i.e. $2B_k$. The signals leaving each sub-band filter are quantized differently on the basis of frequency, namely fine quantization for the fundamental and the formants, and coarse quantization in the regions where the energy is low. The reverse operation is carried out to reconstruct the signal.
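As a rough illustration of this arithmetic, the sketch below computes the total bit rate when each sub-band is resampled at twice its width and quantized with its own number of bits. The band widths and bit allocations are hypothetical values chosen for the example, and `subband_bit_rate` is an illustrative name.

```python
def subband_bit_rate(band_widths_hz, bits_per_sample):
    """Total bit rate when band k is sampled at 2 * B_k with its own word length."""
    if len(band_widths_hz) != len(bits_per_sample):
        raise ValueError("one bit allocation is needed per sub-band")
    total = 0.0
    for b_k, bits in zip(band_widths_hz, bits_per_sample):
        sampling_rate = 2.0 * b_k        # Shannon frequency of band k
        total += sampling_rate * bits    # contribution of band k in bit/s
    return total

# Hypothetical split: four 1 kHz bands, finer quantization at low frequencies.
rate = subband_bit_rate([1000.0, 1000.0, 1000.0, 1000.0], [5, 4, 2, 1])
```

With this allocation the total is lower than coding the full 4 kHz band uniformly at 5 bits per sample, which is the point of quantizing low-energy regions coarsely.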
Before storage and transmission, the signals are coded, for example, according to a PCM (pulse code modulation) coding law, normalized to 64 kbits/s (signal sampled at 8 kHz over 8 bits in the 300-3600 Hz band and compressed according to a logarithmic law). ADPCM coding (adaptive differential PCM), at a rate of 32 kbits/s (8 kHz over 4 bits), is becoming widespread.
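The bit rates quoted above follow directly from sampling rate times word length, and the logarithmic compression can be sketched in a few lines. The text does not say which logarithmic law is meant; the example assumes the mu-law curve with mu = 255 purely for illustration, and the function names are hypothetical.

```python
import numpy as np

def bit_rate(sampling_rate_hz, bits_per_sample):
    """Bit rate in bit/s of a coder emitting one fixed-size word per sample."""
    return sampling_rate_hz * bits_per_sample

def mu_law_compress(x, mu=255.0):
    """Logarithmic compression of samples in [-1, 1] (assumed mu-law curve)."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

pcm_rate = bit_rate(8000, 8)    # 8 kHz sampling, 8 bits per sample
adpcm_rate = bit_rate(8000, 4)  # 8 kHz sampling, 4 bits per sample

samples = np.linspace(-1.0, 1.0, 11)
restored = mu_law_expand(mu_law_compress(samples))
```

The compression curve spends more of the available code range on small amplitudes, which is why 8 bits suffice for telephone-quality speech.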
FIG. 1 shows the theoretical diagram of a coding device 1 with two sub-bands. The speech signal x is filtered by two filters F1, F2 (with impulse responses h1, h2). Each of the two output sub-bands of F1, F2 is decimated by 2 (one sample in every two is discarded) by the circuits 2, 3 respectively, then coded (4), for example in ADPCM, and stored (or transmitted). On reading (or reception), the speech signal is reconstituted by decoding (5, 6), then filtering in interpolators (7, 8) which are identical to the filters of the corresponding analysis band, and summing (9) the two decoded sub-bands. The filters F1 and F2 are linear-phase FIR (finite impulse response) filters, and satisfy the following conditions: $\ldots\,(e^{j\theta})\rvert^{2} = 1$
The template of these filters has been shown in FIG. 2.
The principle of sub-band coding consists in filtering the speech signal via a bank of filters, then in sub-sampling the output signals from these filters. On reception, reconstitution is done by addition of each decoded sub-band, interpolated by a filter identical to that of the corresponding analysis band. This type of coding was first introduced on the basis of separate and contiguous finite impulse response filters. It was then extended by virtue of the use of quadrature mirror filters, allowing near-perfect reconstitution of the initial signal in the absence of error in quantization.
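The analysis and reconstitution chain described above can be sketched for two sub-bands as follows. This is a minimal illustration, not the patent's filter design: it assumes the two-tap Haar pair as the quadrature mirror filters, omits the coding and decoding stages (no quantization), and subtracts the high band at synthesis, which is the usual sign convention for cancelling aliasing. The names `H1`, `H2`, `analyze` and `synthesize` are hypothetical.

```python
import numpy as np

# Assumed filters: the two-tap Haar pair, with h2(n) = (-1)^n * h1(n).
H1 = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass analysis filter
H2 = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass (mirror) analysis filter

def analyze(x):
    """Filter into two sub-bands, then keep one sample in two."""
    low = np.convolve(x, H1)[::2]
    high = np.convolve(x, H2)[::2]
    return low, high

def upsample(v):
    """Insert a zero after every sample (interpolation by 2 before filtering)."""
    u = np.zeros(2 * len(v))
    u[::2] = v
    return u

def synthesize(low, high):
    """Interpolate each band with its analysis filter and combine the bands.

    The high band is subtracted so that the aliasing introduced by
    decimation cancels (assumed QMF sign convention).
    """
    return np.convolve(upsample(low), H1) - np.convolve(upsample(high), H2)

rng = np.random.default_rng(0)
x = rng.standard_normal(33)
low, high = analyze(x)
y = synthesize(low, high)
# Without quantization, y reproduces x delayed by one sample.
```

With longer filters the reconstruction is generally only approximate, and in a real coder each sub-band would be quantized between `analyze` and `synthesize`.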
Two large families of methods exist for synthesising the filters which break down the speech signal: the algorithm is renewed for each band; in this case, the basic filter response is h(n) and the band width $\pi/2M$ (M being the number of sub-bands). By displacement, the following is obtained:
$\pi$ being the
REFERENCES:
patent: 4384169 (1983-05-01), Mozer et al.
patent: 4398059 (1983-08-01), Lin et al.
patent: 4520499 (1985-05-01), Montlick et al.
patent: 4599567 (1986-07-01), Goupillaud et al.
patent: 4817161 (1989-03-01), Kaneko
patent: 4974187 (1990-11-01), Lawton
patent: 5086475 (1992-02-01), Kutaragi et al.
Computer Music Journal, vol. 12, No. 4, Jan. 1, 1988, Cambridge, Massachusetts; R. Kronland-Martinet: "The Wavelet Transform For Analysis, Synthesis, And Processing Of Speech And Music Sounds", pp. 11-20.
Communications On Pure and Applied Mathematics, vol. XLI, 1988, I. Daubechies: "Orthonormal Bases Of Compactly Supported Wavelets", pp. 909-996.
International Journal on Pattern Recognition and Artificial Intelligence, vol. 1, No. 2, 1987, R. Kronland-Martinet et al: "Analysis Of Sound Patterns Through Wavelet Transforms", pp. 273-302.
Traitement du Signal, vol. 7, No. 2, 1990, P. Mathieu: "Compression d'Image Par Transformee En Ondelette Et Quantification Vectorielle", pp. 101-115.
International Conference on Acoustics Speech and Signal Processing, vol. 3, Apr. 3, 1990, Albuquerque, New Mexico, USA, M. Vetterli et al: "Wavelets And Filter Banks: Relationships And New Results", pp. 1723-1726.
International Conference on Acoustics Speech and Signal Processing, vol. 2, Apr. 6, 1987, Dallas, Texas, USA, J.S. Lienard: "Speech Analysis And Reconstruction Using Short-Time, Elementary Waveforms", pp. 948-951.
Chawan Vijay B.
MacDonald Allen R.
Sextant Avionique
Method for voice analysis and synthesis using wavelets