Patent
1997-10-14
1999-05-04
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
704225, 704223, G10L 914
active
058999680
DESCRIPTION:
BRIEF SUMMARY
BACKGROUND OF THE INVENTION
The present invention relates to analysis-by-synthesis speech coding.
The applicant has described speech coders of this kind, which it developed, in particular in its European patent applications 0 195 487, 0 347 307 and 0 469 997.
In an analysis-by-synthesis speech coder, linear prediction of the speech signal is performed in order to obtain the coefficients of a short-term synthesis filter modelling the transfer function of the vocal tract. These coefficients are passed to the decoder, as well as parameters characterising an excitation to be applied to the short-term synthesis filter. In the majority of present-day coders, the longer-term correlations of the speech signal are also sought in order to characterise a long-term synthesis filter taking account of the pitch of the speech. When the signal is voiced, the excitation in fact includes a predictable component which can be represented by the past excitation, delayed by TP samples of the speech signal and subjected to a gain $g_p$. The long-term synthesis filter, also reconstituted at the decoder, then has a transfer function of the form $1/B(z)$ with $B(z) = 1 - g_p \cdot z^{-TP}$. The remaining, unpredictable part of the excitation is called stochastic excitation. In the coders known as CELP ("Code Excited Linear Prediction") coders, the stochastic excitation consists of a vector looked up in a predetermined dictionary. In the coders known as MPLPC ("Multi-Pulse Linear Prediction Coding") coders, the stochastic excitation includes a certain number of pulses, the positions of which are sought by the coder. In general, CELP coders are preferred for low data transmission rates, but they are more complex to implement than MPLPC coders.
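The long-term synthesis filter 1/B(z) described above corresponds to the time-domain recursion y[n] = x[n] + g_p * y[n - TP]. The following is a minimal Python sketch of that recursion. The function name `long_term_synthesis`, the zero initial filter state, and the sample values are assumptions made for illustration and do not come from the patent.

```python
def long_term_synthesis(excitation, gain, delay):
    """Apply 1/B(z), with B(z) = 1 - gain * z**(-delay), to a list of samples.

    Hypothetical helper: the filter memory before the first sample is
    assumed to be zero.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    output = []
    for n, sample in enumerate(excitation):
        # Past output, delayed by `delay` samples and scaled by the pitch gain.
        past = output[n - delay] if n >= delay else 0.0
        output.append(sample + gain * past)
    return output


# A single pulse reappears every `delay` samples with decaying amplitude.
pulse = [1.0] + [0.0] * 6
response = long_term_synthesis(pulse, gain=0.5, delay=3)
```

With a gain below 1, each repetition of the pulse is weaker than the last, which is how the filter models the quasi-periodic pitch structure of voiced speech.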
One purpose of the present invention is to propose a method of speech coding in which the search for the stochastic excitation is simplified.
SUMMARY OF THE INVENTION
The invention thus proposes an analysis-by-synthesis speech coding method for coding a speech signal digitised into successive frames which are divided into sub-frames of lst samples, in which a linear prediction analysis is performed for each frame in order to determine the coefficients of a short-term synthesis filter, and an excitation sequence is determined, for each sub-frame, with nc contributions each associated with a respective gain, in such a way that the excitation sequence submitted to the short-term synthesis filter produces a synthetic signal representative of the speech signal. The nc contributions of the excitation sequence and the associated gains are determined by an iterative process in which the iteration n ($0 \le n < nc$) comprises:
determining the contribution n which maximises the quantity $(F_p \cdot e_{n-1}^T)^2 / (F_p \cdot F_p^T)$, where $F_p$ designates a row vector with lst components equal to the products of convolution between one possible value of the contribution n and the impulse response of a composite filter consisting of the short-term synthesis filter and of a perceptual weighting filter, and $e_{n-1}$ designates a target vector determined during the iteration n-1 if $n \ge 1$, and $e_{-1} = x$ is an initial target vector; and
calculating n+1 gains forming a row vector $g_n = (g_n(0), \ldots, g_n(n))$ by solving the linear system $g_n \cdot B_n = b_n$, where $B_n$ is a symmetric matrix with n+1 rows and n+1 columns in which the component $B_n(i,j)$ ($0 \le i, j \le n$) is equal to the scalar product $F_{p(i)} \cdot F_{p(j)}^T$, where $F_{p(i)}$ and $F_{p(j)}$ respectively designate the row vectors equal to the products of convolution between the previously determined contributions i and j and the impulse response of the composite filter, and $b_n$ is a row vector with n+1 components $b_n(i)$ ($0 \le i \le n$) respectively equal to the scalar products between the vectors $F_{p(i)}$ and the initial target vector x, the nc gains associated with the nc contributions of the excitation sequence being those calculated during iteration nc-1.
At each iteration n ($0 \le n < nc$), the rows n of three matrices L, R and K with nc rows and nc column
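The iterative search summarised above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the helper names are hypothetical, the linear system is solved by plain Gaussian elimination rather than with the matrices L, R and K mentioned in the text, the target vector is assumed to be updated as the residual e_n = x - sum_i g_n(i) * F_p(i), and already selected candidates are excluded from later iterations.

```python
def convolve(c, h, lst):
    """First lst samples of the convolution of contribution c with response h."""
    return [sum(c[k] * h[n - k] for k in range(n + 1) if n - k < len(h))
            for n in range(lst)]


def dot(a, b):
    """Scalar product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))


def solve_symmetric(B, b):
    """Solve g . B = b for g. B is symmetric, so this equals B . g^T = b^T.

    Plain Gaussian elimination with partial pivoting (an assumption; the
    patent text refers to matrices L, R and K instead).
    """
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(B)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    g = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * g[j] for j in range(i + 1, n))
        g[i] = (A[i][n] - tail) / A[i][i]
    return g


def search_excitation(candidates, h, x, nc):
    """Greedy selection of nc contributions with joint gain re-optimisation."""
    lst = len(x)
    filtered = [convolve(c, h, lst) for c in candidates]  # the vectors F_p
    chosen, gains, target = [], [], list(x)               # e_{-1} = x
    for _ in range(nc):
        # Step 1: maximise (F_p . e_{n-1}^T)^2 / (F_p . F_p^T).
        usable = [p for p in range(len(candidates))
                  if p not in chosen and dot(filtered[p], filtered[p]) > 0]
        best = max(usable, key=lambda p: dot(filtered[p], target) ** 2
                   / dot(filtered[p], filtered[p]))
        chosen.append(best)
        # Step 2: solve g_n . B_n = b_n for all gains found so far.
        F = [filtered[p] for p in chosen]
        B = [[dot(fi, fj) for fj in F] for fi in F]
        b = [dot(fi, x) for fi in F]
        gains = solve_symmetric(B, b)
        # Assumed target update: residual of x after the weighted contributions.
        target = [x[k] - sum(g * f[k] for g, f in zip(gains, F))
                  for k in range(lst)]
    return chosen, gains


# Illustrative data: unit pulses as candidates, a two-tap composite response.
h = [1.0, 0.5]
pulses = [[1.0 if k == p else 0.0 for k in range(4)] for p in range(4)]
x = [0.0, 2.0, 1.0, -1.0]
positions, pulse_gains = search_excitation(pulses, h, x, nc=2)
```

Because the gains are recomputed jointly at every iteration, the gains returned after the last iteration are the ones associated with all nc contributions, as the text states for iteration nc-1.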
REFERENCES:
patent: 4802171 (1989-01-01), Rasky
patent: 4831624 (1989-05-01), McLaughlin et al.
patent: 4964169 (1990-10-01), Ono
patent: 5060269 (1991-10-01), Zinser
patent: 5097507 (1992-03-01), Zinser et al.
patent: 5253269 (1993-10-01), Gerson et al.
patent: 5265219 (1993-11-01), Gerson et al.
Database INSPEC, Institute of Elect. Engineers, Stevenage, GB, Inspec No. 4917063 A. Kataoka et al, "Implementation and performance of an 8-kbit/s conjugate structure speech coder", Abstract.
IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 37, No. 3, Mar. 1989, pp. 317-327, S. Singhal et al, "Amplitude Optimization and Pitch Prediction in Multipulse Coders".
Mauc Michel
Navarro William
Hudspeth David R.
Matra Corporation
Zintel Harold
Speech coding method using synthesis analysis using iterative ca