Patent
1997-10-14
1999-10-26
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
704/201, G10L 3/02
active
059743778
DESCRIPTION:
BRIEF SUMMARY
BACKGROUND OF THE INVENTION
The present invention relates to analysis-by-synthesis speech coding.
The applicant company has particularly described such speech coders, which it has developed, in its European patent applications 0 195 487, 0 347 307 and 0 469 997.
In an analysis-by-synthesis speech coder, linear prediction of the speech signal is performed in order to obtain the coefficients of a short-term synthesis filter modelling the transfer function of the vocal tract. These coefficients are passed to the decoder, as well as parameters characterising an excitation to be applied to the short-term synthesis filter. In the majority of present-day coders, the longer-term correlations of the speech signal are also sought in order to characterise a long-term synthesis filter taking account of the pitch of the speech. When the signal is voiced, the excitation in fact includes a predictable component which can be represented by the past excitation, delayed by TP samples of the speech signal and subjected to a gain $g_p$. The long-term synthesis filter, also reconstituted at the decoder, then has a transfer function of the form $1/B(z)$ with $B(z) = 1 - g_p z^{-TP}$. The remaining, unpredictable part of the excitation is called stochastic excitation. In the coders known as CELP ("Code Excited Linear Prediction") coders, the stochastic excitation consists of a vector looked up in a predetermined dictionary. In the coders known as MPLPC ("Multi-Pulse Linear Prediction Coding") coders, the stochastic excitation includes a certain number of pulses the positions of which are sought by the coder. In general, CELP coders are preferred for low data transmission rates, but they are more complex to implement than MPLPC coders.
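By way of illustration only, a long-term synthesis filter with transfer function 1/B(z), B(z) = 1 - g_p z^(-TP), corresponds to the recursion y[n] = x[n] + g_p y[n - TP]. The following minimal Python sketch applies that recursion to an excitation sequence. The function name and the assumption of zero filter memory before the first sample are illustrative choices and are not taken from the specification.

```python
def long_term_synthesis(excitation, gain, delay):
    """Filter `excitation` through 1/B(z), with B(z) = 1 - gain * z^-delay.

    Illustrative sketch: the filter memory is assumed to be zero
    before the first sample, which a real coder would not do.
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 sample")
    output = []
    for n, x in enumerate(excitation):
        # Past output delayed by `delay` samples (zero before the start).
        past = output[n - delay] if n >= delay else 0.0
        output.append(x + gain * past)
    return output


# A single pulse is repeated every `delay` samples, scaled by the gain each time.
pulse = [1.0] + [0.0] * 9
print(long_term_synthesis(pulse, 0.5, 4))
```

With a gain of 0.5 and a delay of 4 samples, the unit pulse reappears at samples 4 and 8 with amplitudes 0.5 and 0.25, which is the pitch-periodic structure the long-term filter is meant to model.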
In order to determine the long-term prediction delay, a closed-loop analysis, an open-loop analysis or a combination of the two is used. The open-loop analysis is not demanding in terms of amount of calculation, but its accuracy is limited. Conversely, the closed-loop analysis requires much calculation, but it is more reliable as it contributes directly to minimising the perceptually weighted difference between the speech signal and the synthetic signal. In certain cases, an open-loop analysis is carried out first of all in order to limit the interval within which the closed-loop analyser will search for the prediction delay. This search interval must nevertheless remain relatively wide, since account has to be taken of the fact that the delay may vary rapidly.
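As an illustration of why an open-loop analysis is inexpensive, one common approach (assumed here, not stated in the specification) is to pick the delay that maximises the normalised correlation between the speech signal and a delayed copy of itself. The Python sketch below follows that assumption. The function name, the delay bounds and the test signal are hypothetical.

```python
import math


def open_loop_delay(signal, min_delay, max_delay):
    """Return (delay, score) maximising the normalised correlation
    between `signal` and its copy delayed by `delay` samples.

    Illustrative sketch; `min_delay` is expected to be at least 1.
    """
    best_delay, best_score = min_delay, -1.0
    for delay in range(min_delay, max_delay + 1):
        current = signal[delay:]
        delayed = signal[:-delay]
        cross = sum(a * b for a, b in zip(current, delayed))
        energy = math.sqrt(sum(a * a for a in current) *
                           sum(b * b for b in delayed))
        score = cross / energy if energy > 0.0 else 0.0
        if score > best_score:
            best_delay, best_score = delay, score
    return best_delay, best_score


# Hypothetical test signal: a sinusoid with a period of 40 samples.
signal = [math.sin(2.0 * math.pi * n / 40.0) for n in range(320)]
delay, score = open_loop_delay(signal, 20, 60)
print(delay, round(score, 3))
```

The score returned alongside the delay could serve as a rough indication of how strongly periodic the frame is, whereas a closed-loop search would have to synthesise and compare a weighted signal for every candidate delay.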
The invention aims particularly to find a good compromise between the quality of the modelling of the long-term part of the excitation and the complexity of the search for the corresponding delay in a speech coder.
SUMMARY OF THE INVENTION
The invention thus proposes an analysis-by-synthesis speech coding method for coding a speech signal digitised into successive frames which are divided into sub-frames, comprising the following steps: linear prediction analysis of the speech signal in order to determine parameters of a short-term synthesis filter; open-loop analysis of the speech signal in order to detect the voiced frames of the signal and in order, for each voiced frame, to determine a degree of voicing of the signal and an interval for searching for a long-term prediction delay; closed-loop predictive analysis of the speech signal in order, for at least some of the sub-frames of the voiced frames, to select a long-term prediction delay contained in the search interval and constituting a parameter of a long-term synthesis filter; and determination of a stochastic excitation for each sub-frame, so as to minimise a perceptually weighted difference between the speech signal and the stochastic excitation filtered by the long-term and short-term synthesis filters. In the open-loop analysis step, the search interval relating to each voiced frame is determined so that it contains a number of delays which is dependent on the degree of voicing of said frame.
Hence, the number of delays which are to be tested in closed-loop mode
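The dependence of the search interval on the degree of voicing can be sketched as follows. This is a minimal Python illustration under stated assumptions: the interval is centred on an open-loop delay estimate, and the table mapping each degree of voicing to a number of delays is supplied by the caller. The values shown, and the choice of giving more delays to more strongly voiced frames, are hypothetical and are not taken from the text above, which states only that the number depends on the degree of voicing.

```python
def search_interval(center_delay, voicing_degree, delays_per_degree,
                    min_delay, max_delay):
    """Return the candidate delays for the closed-loop search.

    The interval holds delays_per_degree[voicing_degree] consecutive
    delays, placed around `center_delay` and clamped to the range
    [min_delay, max_delay]. Assumes the count fits inside that range.
    """
    count = delays_per_degree[voicing_degree]
    start = center_delay - count // 2
    # Shift the interval so that it stays inside the admissible range.
    start = max(min_delay, min(start, max_delay - count + 1))
    return list(range(start, start + count))


# Hypothetical table: degree of voicing -> number of delays to test.
ILLUSTRATIVE_COUNTS = {1: 4, 2: 8, 3: 16}

print(search_interval(40, 1, ILLUSTRATIVE_COUNTS, 20, 143))
print(len(search_interval(40, 3, ILLUSTRATIVE_COUNTS, 20, 143)))
```

Passing the table as a parameter keeps the sketch neutral about how the number of delays should vary with voicing. The closed-loop analysis would then evaluate only the delays in the returned list.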
REFERENCES:
patent: 4802171 (1989-01-01), Rasky
patent: 4831624 (1989-05-01), McLaughlin et al.
patent: 4868867 (1989-09-01), Davidson et al.
patent: 4964169 (1990-10-01), Ono
patent: 5060269 (1991-10-01), Zinser
patent: 5097507 (1992-03-01), Zinser et al.
patent: 5253269 (1993-10-01), Gerson et al.
patent: 5265219 (1993-11-01), Gerson et al.
patent: 5307441 (1994-04-01), Tzeng
patent: 5359696 (1994-10-01), Gerson et al.
patent: 5414796 (1995-05-01), Jacobs et al.
patent: 5596677 (1997-01-01), Jarvinen et al.
patent: 5657420 (1997-08-01), Jacobs et al.
patent: 5664055 (1997-09-01), Kroon
patent: 5699485 (1997-12-01), Shoham
patent: 5704002 (1997-12-01), Massaloux
patent: 5708757 (1998-01-01), Massaloux
patent: 5710863 (1998-01-01), Chen
patent: 5717824 (1998-02-01), Chhatwal
patent: 5717825 (1998-02-01), Lamblin
patent: 5727123 (1998-03-01), McDonough et al.
patent: 5729694 (1998-03-01), Holzrichter et al.
patent: 5732389 (1998-03-01), Kroon et al.
patent: 5751903 (1998-05-01), Swaminathan et al.
patent: 5778338 (1998-07-01), Jacobs et al.
patent: 5784532 (1998-07-01), McDonough et al.
patent: 5787390 (1998-07-01), Quinquis et al.
patent: 5790759 (1998-08-01), Chen
patent: 5828811 (1998-10-01), Taniguchi et al.
patent: 5828996 (1998-10-01), Iijima et al.
patent: 5845244 (1998-12-01), Proust
patent: 5848387 (1998-12-01), Nishiguchi et al.
Database INSPEC, Institute of Elect. Engineers, Stevenage, GB, Inspec No. 4917063 A. Kataoka et al, "Implementation and performance of an 8-kbit/s conjugate structure speech coder", Abstract.
IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 37, No. 3, Mar. 1989, pp. 317-327, S. Singhal et al. "Amplitude Optimization and Pitch Prediction in Multipulse Coders".
Mauc Michel
Navarro William
Hudspeth David R.
Matra Communication
Opsasnick Michael N.