Speech coder for high quality at low bit rates

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S230000, C704S222000

active

06751585

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to speech coders and, more particularly, to speech coders for high quality coding of speech signals at low bit rates.
A speech coder is used together with a speech decoder: speech is coded by the coder and decoded by the decoder. A well-known method of high-efficiency speech coding is CELP (Code Excited Linear Prediction coding) as disclosed in, for instance, M. Schroeder, B. Atal et al, “Code-Excited Linear Prediction: High Quality Speech at very low bit rates”, IEEE Proc. ICASSP-85, 1985, pp. 937-940 (Reference 1) and Kleijn et al, “Improved Speech Quality and Efficient Vector Quantization in SELP”, IEEE Proc. ICASSP-88, 1988, pp. 155-158 (Reference 2). In this method, on the transmission side, a spectral parameter representing the spectral energy distribution of a speech signal is extracted from the speech signal for each frame (of 20 ms, for instance) by using linear prediction (LPC) analysis. The frame is further divided into a plurality of sub-frames (of 5 ms, for instance), and adaptive codebook parameters (i.e., a delay parameter corresponding to the pitch period and a gain parameter) are extracted for each sub-frame on the basis of past excitation signals. Pitch prediction of the pertinent sub-frame speech signal is then executed by using the adaptive codebook. For the error signal obtained as a result of the pitch prediction, an optimal excitation codevector is selected from an excitation codebook (or vector quantization codebook) constituted by predetermined kinds of noise signals, and an optimal gain is calculated, whereby the excitation signal is quantized. The optimal excitation codevector is selected so as to minimize the error power between a signal synthesized from the selected noise signal and the error signal noted above. An index representing the kind of the selected codevector and the gain are transmitted, together with the spectral parameter and the adaptive codebook parameters, to a multiplexer. Description of the receiving side is omitted.
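The excitation codebook selection described above can be illustrated with a minimal sketch. It is not taken from References 1 or 2: the function names, the toy codebook and the impulse response are hypothetical, and perceptual weighting and the adaptive codebook stage are left out. Each codevector is passed through the synthesis filter, the least-squares gain is computed, and the codevector with the smallest error power is kept.

```python
import random


def synthesize(codevector, impulse_response):
    """Zero-state filtering: y[n] = sum_k h[k] * c[n - k], truncated to the sub-frame."""
    out = []
    for n in range(len(codevector)):
        acc = 0.0
        for k in range(min(len(impulse_response), n + 1)):
            acc += impulse_response[k] * codevector[n - k]
        out.append(acc)
    return out


def search_excitation_codebook(target, codebook, impulse_response):
    """Return (index, gain, error_power) of the codevector with the least error power."""
    target_energy = sum(t * t for t in target)
    best = None
    for index, codevector in enumerate(codebook):
        synth = synthesize(codevector, impulse_response)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue  # an all-zero codevector cannot reduce the error
        corr = sum(t * s for t, s in zip(target, synth))
        gain = corr / energy  # least-squares optimal gain for this codevector
        error_power = target_energy - corr * corr / energy
        if best is None or error_power < best[2]:
            best = (index, gain, error_power)
    return best


# Toy data (hypothetical): 16 Gaussian noise codevectors of dimension 8.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
h = [1.0, 0.6, 0.3]  # assumed impulse response of the synthesis filter
target = [0.5 * s for s in synthesize(codebook[3], h)]
index, gain, error_power = search_excitation_codebook(target, codebook, h)
```

Because the toy target is built from codevector 3 scaled by 0.5, the search should recover that index with a gain close to 0.5. The inner filtering loop runs once per codevector, which is the source of the computational effort discussed next.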
In the above prior art speech coder, enormous computational effort is required for the selection of the optimal excitation codevector from the excitation codebook. This is so because in the method according to References 1 and 2 described above, the excitation codevector selection is executed by repeatedly performing, for each codevector, filtering or convolution a number of times corresponding to the number of the codevectors stored in the codebook. For example, where the bit number of the codebook is B and the dimension number is N, denoting the filter or impulse response length in the filtering or convolution by K, a computational effort of N×K×2^B×8,000/N per second is required. By way of example, assuming B=10, N=40 and K=10, it is necessary to execute the computation 81,920,000 times per second. The computational effort is thus enormous and economically unfeasible.
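The figure quoted above can be checked with a few lines of arithmetic. The sketch below takes the number of codevectors to be 2 to the power B and reads 8,000 as the number of samples per second (an assumption; the text does not name the quantity). The function and parameter names are illustrative only.

```python
def codebook_search_ops_per_second(bits, dim, filter_len, sample_rate=8000):
    """Operation count N * K * 2^B * (sample_rate / N) for an exhaustive search."""
    codevectors = 2 ** bits  # a B-bit codebook holds 2^B codevectors
    # N * K operations per codevector, repeated for each of sample_rate / N sub-frames
    return dim * filter_len * codevectors * sample_rate // dim


ops = codebook_search_ops_per_second(bits=10, dim=40, filter_len=10)
print(ops)  # 81920000
```

The dimension N cancels out, so the effort depends only on the filter length, the codebook size and the sample rate.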
Heretofore, various methods of reducing the computational effort necessary for the excitation codebook retrieval have been proposed. For example, an ACELP (Algebraic Code-Excited Linear Prediction) system has been proposed. The system is specifically treated in C. Laflamme et al, “16 kbps Wideband Speech Coding Technique based on Algebraic CELP”, IEEE Proc. ICASSP-91, 1991, pp. 13-16 (Reference 3). According to Reference 3, the excitation signal is expressed with a plurality of pulses, and transmitted with the position of each pulse represented with a predetermined number of bits. The amplitude of each pulse is limited to +1.0 or −1.0, and it is thus possible to greatly reduce the computational effort of the pulse retrieval.
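A small sketch of such a pulse excitation follows. It is an illustration under stated assumptions, not the procedure of Reference 3: the sub-frame length, pulse positions and impulse response are made up. It shows why unit amplitudes are cheap: the filtered excitation is just a signed sum of shifted copies of the impulse response, with no amplitude multiplications.

```python
def build_pulse_excitation(length, positions, signs):
    """Place pulses of amplitude +1.0 or -1.0 at the given sample positions."""
    excitation = [0.0] * length
    for position, sign in zip(positions, signs):
        if sign not in (1, -1):
            raise ValueError("pulse amplitude is limited to +1 or -1")
        if not 0 <= position < length:
            raise ValueError("pulse position outside the sub-frame")
        excitation[position] += float(sign)
    return excitation


def filter_pulses(length, positions, signs, impulse_response):
    """Filtered excitation as a signed sum of shifted impulse responses."""
    out = [0.0] * length
    for position, sign in zip(positions, signs):
        for k, hk in enumerate(impulse_response):
            if position + k < length:
                out[position + k] += hk if sign == 1 else -hk
    return out


# Hypothetical example: two pulses in an 8-sample sub-frame.
excitation = build_pulse_excitation(8, [1, 5], [1, -1])
filtered = filter_pulses(8, [1, 5], [1, -1], [1.0, 0.5])
```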
The method according to Reference 3, however, has a problem that the speech quality is insufficient, although great reduction of computational effort is attainable. The problem stems from the fact that each pulse can take only either positive or negative polarity and that its absolute amplitude is always 1.0 irrespective of its position. This results in very coarse amplitude quantization, thus deteriorating the speech quality.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a speech coder capable of preventing speech quality deterioration with a relatively small computational effort even where the bit rate is low.
According to the present invention, there is provided a speech coder comprising a spectral parameter calculation unit for obtaining a spectral parameter (i.e., spectral energy distribution) from an input speech signal and quantizing the obtained spectral parameter, and an excitation quantization unit for quantizing an excitation signal of the speech signal by using the spectral parameter and outputting the quantized excitation signal, the excitation signal being constituted by a plurality of non-zero pulses. The speech coder further comprises a codebook for simultaneously quantizing one of the two parameters of the non-zero pulses, i.e., amplitude or position, the excitation quantization unit having a function of quantizing the non-zero pulses by obtaining the other parameter through retrieval of the codebook.
The excitation quantization unit has at least one specific pulse position for taking a pulse thereat.
The excitation quantization unit preliminarily selects a plurality of codevectors from the codebook and executes the quantization by obtaining the other parameter by retrieval of the preliminarily selected codevectors.
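The idea of quantizing the pulse amplitudes together with a codebook while the positions are found by search can be sketched as follows. This is only one reading of the summary: the selection criterion (squared correlation divided by energy, which is what remains to be maximized once an optimal gain is applied), the candidate position sets, the amplitude codebook and all names are assumptions, since the text available here does not give the equations.

```python
def filter_pulses(length, positions, amplitudes, impulse_response):
    """Sum of amplitude-scaled, shifted copies of the impulse response."""
    out = [0.0] * length
    for position, amplitude in zip(positions, amplitudes):
        for k, hk in enumerate(impulse_response):
            if position + k < length:
                out[position + k] += amplitude * hk
    return out


def search_amplitudes_and_positions(target, position_sets, amplitude_codebook,
                                    impulse_response):
    """Return (set_index, code_index, score) maximizing corr^2 / energy."""
    best = None
    for set_index, positions in enumerate(position_sets):
        for code_index, amplitudes in enumerate(amplitude_codebook):
            synth = filter_pulses(len(target), positions, amplitudes,
                                  impulse_response)
            energy = sum(s * s for s in synth)
            if energy == 0.0:
                continue
            corr = sum(t * s for t, s in zip(target, synth))
            score = corr * corr / energy  # assumed criterion, see lead-in
            if best is None or score > best[2]:
                best = (set_index, code_index, score)
    return best


# Hypothetical toy data: three candidate position sets, three amplitude codevectors.
position_sets = [(0, 4), (1, 5), (2, 6)]
amplitude_codebook = [(1.0, -0.5), (0.5, 0.5), (-1.0, 0.3)]
h = [1.0, 0.5, 0.25]
target = filter_pulses(12, (1, 5), (1.0, -0.5), h)
best_set, best_code, _ = search_amplitudes_and_positions(
    target, position_sets, amplitude_codebook, h)
```

Unlike the fixed ±1.0 amplitudes of Reference 3, each amplitude codevector here assigns a different magnitude to each pulse. A preliminary selection step, as mentioned above, would restrict the inner loop to a few promising codevectors rather than the whole codebook.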
According to another embodiment of the present invention, there is provided a speech coder comprising a spectral parameter calculation unit for obtaining a spectral parameter from an input speech signal for every frame and quantizing the obtained spectral parameter, and an excitation quantization unit for quantizing an excitation signal of the speech signal by using the spectral parameter and outputting the quantized excitation signal. The excitation signal is constituted by a plurality of non-zero pulses. The speech coder further comprises a codebook for simultaneously quantizing the amplitude of the non-zero pulses and a mode judgment circuit for executing mode judgment by extracting a feature quantity from the speech signal. When a predetermined mode is determined as a result of the mode judgment in the mode judgment circuit, the excitation quantization unit provides functions of calculating positions of non-zero pulses for a plurality of sets, executing retrieval of the codebook with respect to the pulse positions in the plurality of sets, and executing excitation signal quantization by selecting the combination of codevector and pulse position at which a predetermined equation has a maximum or a minimum value.
According to another embodiment of the present invention, there is provided a speech coder comprising a spectral parameter calculation unit for obtaining a spectral parameter from an input speech signal for every frame and quantizing the obtained spectral parameter, and an excitation quantization unit for quantizing an excitation signal of the speech signal by using the spectral parameter and outputting the quantized excitation signal. The excitation signal is constituted by a plurality of non-zero pulses. The speech coder further comprises a codebook for simultaneously quantizing the amplitude of the non-zero pulses and a mode judgment circuit for making a mode judgment by extracting a feature quantity from the speech signal. When a predetermined mode is recognized, the excitation quantization unit functions to calculate positions of non-zero pulses for at least one set, to execute retrieval of the codebook with respect to the pulse positions of a set having a pulse position at which a predetermined equation has a maximum or a minimum value, and to effect excitation signal quantization by selecting the optimal combination of pulse position set and codevector. When a different predetermined mode is recognized, the excitation quantization unit functions to represent the excitation in the form of linear coupling of a plur
