Speech coding/decoding method and apparatus

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission


: 2003-05-02
: 2004-07-27
: Dorvil, Richemond (Department: 2654)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: For storage or transmission

: C704S223000
: Reexamination Certificate
: active
: 06768978
: ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to a low rate speech coding/decoding method used for digital telephones, voice memories, and the like.
Recently, as a coding technology used for portable telephones, the internet, and the like to compress speech information and audio information to small information amounts and transmit or store them, the CELP (Code Excited Linear Prediction) scheme (M. R. Schroeder and B. S. Atal, “Code Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates,” Proc. ICASSP, pp. 937-940, 1985 (reference 1)) has often been used.
The CELP scheme is a coding scheme based on linear predictive analysis, in which an input speech signal is separated by linear predictive analysis into linear predictive coefficients representing phoneme information and a prediction residual signal representing characteristics such as the pitch period of the speech. A digital filter called a synthesis filter is formed on the basis of the linear predictive coefficients. The original input speech signal can be reconstructed by inputting the prediction residual signal as an excitation signal to the synthesis filter. For low bit rate speech coding, these linear predictive coefficients and the prediction residual signal must be coded with a small number of bits.
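The separation and reconstruction described above can be illustrated with a minimal sketch. It assumes the autocorrelation method with a Levinson-Durbin recursion, a prediction order of 10, and a synthetic 160-sample test signal; none of these choices come from the text, and the function names are hypothetical.

```python
import math
import random

def autocorrelation(x, order):
    """r[k] = sum over n of x[n] * x[n-k], for k = 0..order."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def residual(x, a):
    """Analysis filter: e[n] = x[n] - sum_k a[k] * x[n-k]."""
    e = []
    for n in range(len(x)):
        pred = sum(a[k - 1] * x[n - k]
                   for k in range(1, len(a) + 1) if n - k >= 0)
        e.append(x[n] - pred)
    return e

def synthesis_filter(e, a):
    """Synthesis filter: y[n] = e[n] + sum_k a[k] * y[n-k]."""
    y = []
    for n in range(len(e)):
        pred = sum(a[k - 1] * y[n - k]
                   for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(e[n] + pred)
    return y

# Synthetic test frame (an assumption): two sinusoids plus a little noise.
rng = random.Random(0)
signal = [math.sin(0.3 * n) + 0.5 * math.sin(0.71 * n)
          + 0.05 * rng.uniform(-1.0, 1.0) for n in range(160)]

lpc = levinson_durbin(autocorrelation(signal, 10), 10)
res = residual(signal, lpc)
rebuilt = synthesis_filter(res, lpc)
```

Feeding the uncoded residual back through the synthesis filter returns the input up to rounding error. A coder cannot afford to transmit that residual exactly, which is why the excitation is instead approximated from codebooks.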
In the CELP scheme, the coded prediction residual signal serves as the excitation signal, which is generated by multiplying each of two types of vectors, i.e., a pitch vector and a stochastic vector, by a gain and adding the products.
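As a small illustration of this sum of gain-scaled vectors, the sketch below builds an excitation from a pitch vector and a stochastic vector. The function name, the vector contents, and the gain values are all hypothetical.

```python
def build_excitation(pitch_vec, stoch_vec, pitch_gain, stoch_gain):
    """Excitation sample n = pitch_gain * pitch[n] + stoch_gain * stoch[n]."""
    if len(pitch_vec) != len(stoch_vec):
        raise ValueError("vectors must have the same length")
    return [pitch_gain * p + stoch_gain * c
            for p, c in zip(pitch_vec, stoch_vec)]

# Hypothetical 4-sample vectors and gains, chosen only for illustration.
excitation = build_excitation([1, 0, 0, 1], [0, 1, -1, 0], 0.5, 2.0)
```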
A stochastic vector is generally generated by searching a codebook, in which many candidates are stored, for an optimal candidate. In this search, synthesized speech signals are generated by filtering all the stochastic vectors, together with pitch vectors, through the synthesis filter, and the stochastic vector that minimizes the error between the synthesized speech signal and the input speech signal is selected. It is therefore important for the CELP scheme to store stochastic vectors efficiently in the codebook.
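The search described above can be sketched as an exhaustive analysis-by-synthesis loop. This is a simplified sketch under stated assumptions: the pitch vector contribution is taken to have been subtracted from the target already, no perceptual weighting is applied, the gain is chosen per candidate by least squares, and the one-tap filter and tiny codebook are hypothetical.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis filter: y[n] = e[n] + sum_k lpc[k-1] * y[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out

def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) of the candidate minimizing squared error."""
    best = (None, 0.0, float("inf"))
    for index, vec in enumerate(codebook):
        y = synthesize(vec, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue                     # an all-zero candidate cannot match
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if error < best[2]:
            best = (index, gain, error)
    return best

# Hypothetical one-tap synthesis filter and four-entry codebook.
lpc = [0.9]
codebook = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, -1, 0, 0],
    [1, 0, 0, 0, -1, 0, 0, 0],
]
# Target built from entry 2 with gain 1.5, so the search should recover it.
target = [1.5 * v for v in synthesize(codebook[2], lpc)]
best_index, best_gain, best_error = search_codebook(target, codebook, lpc)
```

Every candidate is filtered once per search, so the cost grows with the codebook size. This is one reason structured codebooks with few non-zero samples are attractive.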
As a scheme for satisfying such a requirement, pulse excitation expressing a stochastic vector by a train of several pulses is known. An example of this scheme is the multi-pulse scheme disclosed in reference 2 (K. Ozawa and T. Araseki, “Low Bit Rate Multi-pulse Speech Coder with Natural Speech Quality,” IEEE Proc. ICASSP'86, pp. 457-460, 1986).
An algebraic codebook (J-P. Adoul et al, “Fast CELP coding based on algebraic codes”, Proc. ICASSP'87, pp. 1957-1960 (reference 3)) is another example and has a simple structure in which a stochastic vector is expressed by only the presence/absence of a pulse and its polarity (+, −). In spite of the limitation that the amplitude of a pulse is 1, unlike a multi-pulse, this technique is widely used for low rate coding because speech quality does not deteriorate much and a fast search method has been proposed. As a scheme using an algebraic codebook, an improved scheme of allowing a pulse to have an amplitude has been proposed as disclosed in reference 4 (Chang Deyuan, “An 8 kb/s low complexity CELP speech codec,” 1996 3rd International Conference on Signal Processing, pp. 671-4, 1996).
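A stochastic vector of this kind can be sketched as follows. The helper names, the 8-sample vector length, and the bit accounting (position bits plus one sign bit per pulse) are illustrative assumptions, not details taken from reference 3.

```python
def algebraic_vector(length, pulses):
    """Build a vector from (position, sign) pairs; every pulse has amplitude 1."""
    vec = [0] * length
    for position, sign in pulses:
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        if not 0 <= position < length:
            raise ValueError("position outside the vector")
        vec[position] += sign
    return vec

def bits_per_pulse(num_positions):
    """Bits to select one of num_positions candidates, plus one polarity bit."""
    return (num_positions - 1).bit_length() + 1

# Two pulses in a hypothetical 8-sample vector: +1 at sample 1, -1 at sample 5.
example = algebraic_vector(8, [(1, +1), (5, -1)])
```

Because only positions and signs are stored, the codebook needs no table of waveforms, which keeps both storage and search cost low.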
In each type of pulse excitation described above, pulse position candidates at which pulses are set are limited to integer sampling positions, i.e., sampling points of a stochastic vector. For this reason, even if an attempt is made to improve the performance of a stochastic vector by increasing the number of bits assigned to pulse position candidates, bits cannot be assigned beyond the number of bits required to express the number of samples contained in a frame.
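The limit described above can be made concrete with a little arithmetic. In the sketch below, the 80-sample frame length and the function names are assumptions made for illustration.

```python
def position_bits(num_positions):
    """Smallest number of bits that can index num_positions candidates."""
    return (num_positions - 1).bit_length()

def distinct_positions(bits, frame_len):
    """Distinct integer sample positions reachable with the given bit count."""
    return min(1 << bits, frame_len)

FRAME_LEN = 80                            # hypothetical frame length in samples
needed = position_bits(FRAME_LEN)         # bits that cover every sample
with_extra = distinct_positions(needed + 2, FRAME_LEN)
```

With 80 integer positions, seven bits already address every sample, so any further bits spent on a pulse position select nothing new.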
Even when the adaptation of pulse position candidates provided by U.S. patent application Ser. No. 09/220,062 is performed, if the number of bits expressing position information is large, pulse position candidates are set for most samples even in a section where pulse position candidates should be dispersed. As a consequence, this section is difficult to distinguish from a section in which pulse position candidates are concentrated, and the adaptation has little effect.
BRIEF SUMMARY OF THE INVENTION
It is an object of the present invention to provide a speech coding/decoding method which can assign an arbitrary number of bits to pulse position information regardless of the number of samples in a frame which is a length of an excitation signal generated based on the pulse position, and can improve sound quality.
It is another object of the present invention to provide a speech coding/decoding method which can resolve a saturation phenomenon that occurs when pulse positions are fixed at integer positions in the method of adapting pulse position candidates provided by U.S. patent application Ser. No. 09/220,062, the content of which is incorporated herein by reference, and which can improve speech quality by making the adaptation of the pulse position candidates function effectively.
According to the invention, there is provided a speech coding method which comprises: analyzing an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal which is an input signal of a synthesis filter generated based on the parameter, to output a first index specifying the parameter representing the frequency characteristic as a coded result, the excitation signal being formed of a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set at first positions located on sampling points of the excitation signal and the second pulses being set at second positions located between sampling points of the excitation signal; generating a synthesized speech signal based on the coded result and the excitation signal; generating a second index indicating a parameter with which an error between the input speech signal and the synthesized speech signal is minimized; selecting a pulse position candidate from a pulse position codebook in accordance with the second index; and outputting the first and second indexes.
According to the invention, there is provided a speech decoding method which comprises: extracting, from a coded stream, a first index indicating a frequency characteristic of a speech, a second index indicating a pitch vector, and a third index indicating a pulse train of an excitation signal; reconstructing a synthesis filter by decoding the first index; reconstructing the pitch vector on the basis of the second index; reconstructing on the basis of the third index the excitation signal formed by using a pulse train including a pulse selected from first pulses and second pulses, the first pulses being set on sampling points of the excitation signal and the second pulses being set at positions located between sampling points of the excitation signal; and generating a decoded speech signal by exciting a synthesis filter by means of the reconstructed excitation signal and pitch vector.
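The extraction step of the decoding method can be sketched as unpacking three fixed-width fields from one frame word. The field names, the bit widths, and the packing order below are hypothetical, since the text does not specify a stream layout.

```python
# Hypothetical field layout: (name, width in bits), most significant first.
FIELD_BITS = (("lpc_index", 10), ("pitch_index", 7), ("pulse_index", 12))

def pack_frame(fields):
    """Pack the three indices into one integer frame word."""
    word = 0
    for name, bits in FIELD_BITS:
        value = fields[name]
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
        word = (word << bits) | value
    return word

def unpack_frame(word):
    """Recover the indices, reading fields from the least significant end."""
    fields = {}
    for name, bits in reversed(FIELD_BITS):
        fields[name] = word & ((1 << bits) - 1)
        word >>= bits
    return fields

sent = {"lpc_index": 513, "pitch_index": 42, "pulse_index": 3000}
received = unpack_frame(pack_frame(sent))
```

After unpacking, each index would drive one reconstruction step: the first the synthesis filter, the second the pitch vector, and the third the pulse train.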
In other words, the present invention provides a speech coding/decoding method in which an excitation signal is formed by using a pulse train, and the pulse train contains a pulse selected from first pulses set on sampling points of the excitation signal and second pulses set at positions located between sampling points of the excitation signal.
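One way to picture a pulse train that mixes on-sample and between-sample pulses is sketched below. The half-sample resolution, the index-to-position mapping, and the use of linear interpolation to render a between-sample pulse onto the two neighbouring samples are all assumptions made for illustration; the text above does not state how such a pulse is rendered, and a windowed-sinc kernel would be another option.

```python
import math

def decode_position(index, resolution):
    """Map a position index to a (possibly fractional) sample position."""
    return index / resolution

def place_pulse(vec, position, sign):
    """Add a unit pulse; a fractional position is split over two samples."""
    base = math.floor(position)
    frac = position - base
    vec[base] += sign * (1.0 - frac)
    if frac > 0.0 and base + 1 < len(vec):
        vec[base + 1] += sign * frac

def excitation_from_pulses(frame_len, pulses, resolution):
    """pulses: (position_index, sign) pairs, index in [0, frame_len * resolution)."""
    vec = [0.0] * frame_len
    for index, sign in pulses:
        if not 0 <= index < frame_len * resolution:
            raise ValueError("position index out of range")
        place_pulse(vec, decode_position(index, resolution), sign)
    return vec

FRAME_LEN = 8      # hypothetical frame length in samples
RESOLUTION = 2     # hypothetical: candidates every half sample

# Index 4 is sample 2.0 (on a sampling point); index 11 is 5.5 (between points).
frame = excitation_from_pulses(FRAME_LEN, [(4, +1), (11, -1)], RESOLUTION)
candidate_count = FRAME_LEN * RESOLUTION
```

With this mapping, an 8-sample frame offers 16 position candidates instead of 8, so one more bit per pulse carries usable position information.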
According to the invention, there is provided a speech coding method which comprises: analyzing an input speech signal to divide the input speech signal into a parameter representing a frequency characteristic of a speech and an excitation signal formed based on the parameter and input to a digital filter, to output a first index specifying the parameter representing the frequency characteristic as a coded result, the excitation signal being generated by using a pitch vector and a stochastic vector for exciting a synthesis filter; generating the stochastic vector by using a pulse train including a pulse selected from first pulses and second pulses, the first pulses being s

Affiliated with

Amada Tadashi

Inventor


Tsuchiya Katsumi

Inventor


Also associated with

Kabushiki Kaisha Toshiba

Corporate Assignee
