Reexamination Certificate
1998-12-21
2002-09-24
Korzuch, William (Department: 2741)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S221000, C704S222000, C704S223000, C704S219000
active
06456964
ABSTRACT:
BACKGROUND OF THE INVENTION
I. Field of the Invention
The present invention relates to the coding of speech signals. Specifically, the present invention relates to coding quasi-periodic speech signals by quantizing only a prototypical portion of the signal.
II. Description of the Related Art
Many communication systems today transmit voice as a digital signal, particularly in long-distance and digital radio telephone applications. The performance of these systems depends, in part, on accurately representing the voice signal with a minimum number of bits. Transmitting speech simply by sampling and digitizing requires a data rate on the order of 64 kilobits per second (kbps) to achieve the speech quality of a conventional analog telephone. However, coding techniques are available that significantly reduce the data rate required for satisfactory speech reproduction.
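The 64 kbps figure can be reproduced with a short calculation. The sketch below assumes the conventional telephone-band values of 8,000 samples per second and 8 bits per sample; these two numbers are not stated in the text and are supplied here only for illustration.

```python
SAMPLE_RATE_HZ = 8000   # assumed telephone-band sampling rate (not from the text)
BITS_PER_SAMPLE = 8     # assumed word size per sample (not from the text)


def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample):
    """Bit rate of plainly sampled and digitized speech, in kilobits per second."""
    return sample_rate_hz * bits_per_sample / 1000


print(pcm_bit_rate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE))  # 64.0
```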
The term “vocoder” typically refers to devices that compress voiced speech by extracting parameters based on a model of human speech generation. Vocoders include an encoder and a decoder. The encoder analyzes the incoming speech and extracts the relevant parameters. The decoder synthesizes the speech using the parameters that it receives from the encoder via a transmission channel. The speech signal is often divided into frames of data and block processed by the vocoder.
Vocoders built around linear-prediction-based time domain coding schemes far exceed in number all other types of coders. These techniques extract correlated elements from the speech signal and encode only the uncorrelated elements. The basic linear predictive filter predicts the current sample as a linear combination of past samples. An example of a coding algorithm of this particular class is described in the paper “A 4.8 kbps Code Excited Linear Predictive Coder,” by Thomas E. Tremain et al., Proceedings of the Mobile Satellite Conference, 1988.
These coding schemes compress the digitized speech signal into a low bit rate signal by removing all of the natural redundancies (i.e., correlated elements) inherent in speech. Speech typically exhibits short-term redundancies resulting from the mechanical action of the lips and tongue, and long-term redundancies resulting from the vibration of the vocal cords. Linear predictive schemes model these operations as filters, remove the redundancies, and then model the resulting residual signal as white Gaussian noise. Linear predictive coders therefore achieve a reduced bit rate by transmitting filter coefficients and quantized noise rather than a full bandwidth speech signal.
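The short-term prediction and the resulting residual can be illustrated with a minimal sketch. The function names and the fixed coefficient list are hypothetical; a practical coder would estimate the coefficients from each frame, which is not shown here.

```python
def lp_predict(history, coeffs):
    """Predict the next sample as a weighted sum of the most recent samples.

    coeffs[0] weights the most recent sample, coeffs[1] the one before it, etc.
    """
    if len(history) < len(coeffs):
        raise ValueError("need at least len(coeffs) past samples")
    recent = history[::-1][:len(coeffs)]  # most recent sample first
    return sum(a * s for a, s in zip(coeffs, recent))


def lp_residual(signal, coeffs):
    """Prediction error for every sample; samples before the start count as zero."""
    order = len(coeffs)
    padded = [0.0] * order + list(signal)
    residual = []
    for n in range(len(signal)):
        past = padded[n:n + order]  # original samples n-order .. n-1
        residual.append(signal[n] - lp_predict(past, coeffs))
    return residual


# A decaying signal that a one-tap predictor with coefficient 0.5 models exactly.
signal = [1.0, 0.5, 0.25, 0.125]
print(lp_residual(signal, [0.5]))  # [1.0, 0.0, 0.0, 0.0]
```

In this toy case only the first sample is unpredictable, so the residual carries far less information than the signal itself, which is the effect the coder relies on.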
However, even these reduced bit rates often exceed the available bandwidth where the speech signal must either propagate a long distance (e.g., ground to satellite) or coexist with many other signals in a crowded channel. A need therefore exists for an improved coding scheme which achieves a lower bit rate than linear predictive schemes.
SUMMARY OF THE INVENTION
The present invention is a novel and improved method and apparatus for coding a quasi-periodic speech signal. The speech signal is represented by a residual signal generated by filtering the speech signal with a Linear Predictive Coding (LPC) analysis filter. The residual signal is encoded by extracting a prototype period from a current frame of the residual signal. A first set of parameters is calculated which describes how to modify a previous prototype period to approximate the current prototype period. One or more codevectors are selected which, when summed, approximate the difference between the current prototype period and the modified previous prototype period. A second set of parameters describes these selected codevectors. The decoder synthesizes an output speech signal by reconstructing a current prototype period based on the first and second set of parameters. The residual signal is then interpolated over the region between the current reconstructed prototype period and a previous reconstructed prototype period. The decoder synthesizes output speech based on the interpolated residual signal.
A feature of the present invention is that prototype periods are used to represent and reconstruct the speech signal. Coding the prototype period rather than the entire speech signal reduces the required bit rate, which translates into higher capacity, greater range, and lower power requirements.
Another feature of the present invention is that a past prototype period is used as a predictor of the current prototype period. The difference between the current prototype period and an optimally rotated and scaled previous prototype period is encoded and transmitted, further reducing the required bit rate.
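One way to picture the rotation and scaling step is an exhaustive search over circular shifts, with a least-squares gain computed for each shift. This is only a sketch under the assumption that both prototypes already have the same length; the function names and the brute-force search are illustrative and are not taken from the text.

```python
def rotate(seq, shift):
    """Circularly rotate seq to the right by shift samples."""
    shift %= len(seq)
    return seq[-shift:] + seq[:-shift] if shift else list(seq)


def best_rotation_and_gain(previous, current):
    """Find the shift and gain that make gain * rotate(previous) closest to current."""
    if len(previous) != len(current) or not previous:
        raise ValueError("prototypes must be non-empty and of equal length")
    best = None
    for shift in range(len(previous)):
        candidate = rotate(previous, shift)
        energy = sum(c * c for c in candidate)
        correlation = sum(a * b for a, b in zip(current, candidate))
        gain = correlation / energy if energy > 0 else 0.0  # least-squares gain
        error = sum((a - gain * b) ** 2 for a, b in zip(current, candidate))
        if best is None or error < best[0]:
            best = (error, shift, gain)
    _, shift, gain = best
    return shift, gain


previous = [1.0, 0.0, 0.0, 0.0]
current = [0.0, 0.0, 2.0, 0.0]
print(best_rotation_and_gain(previous, current))  # (2, 2.0)
```

Only the shift, the gain, and the remaining difference would then need to be described to the decoder, rather than the full current prototype.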
Still another feature of the present invention is that the residual signal is reconstructed at the decoder by interpolating between successive reconstructed prototype periods, based on a weighted average of the successive prototype periods and an average lag.
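The weighted-average idea can be sketched as a linear cross-fade between two prototypes that are repeated cyclically across the region to be filled. This simplified illustration assumes both prototypes have the same length and therefore leaves out the average-lag handling mentioned above; the function name and the linear weighting are assumptions.

```python
def interpolate_residual(prev_proto, cur_proto, num_samples):
    """Cross-fade from prev_proto to cur_proto over num_samples output samples.

    Both prototypes are read cyclically; the weight on cur_proto grows
    linearly from 0 toward 1 across the region.
    """
    if len(prev_proto) != len(cur_proto) or not prev_proto:
        raise ValueError("prototypes must be non-empty and of equal length")
    period = len(prev_proto)
    out = []
    for n in range(num_samples):
        weight = n / num_samples   # 0 at the start of the region
        phase = n % period         # position within one prototype period
        out.append((1.0 - weight) * prev_proto[phase] + weight * cur_proto[phase])
    return out


print(interpolate_residual([0.0, 0.0], [2.0, 2.0], 4))  # [0.0, 0.5, 1.0, 1.5]
```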
Another feature of the present invention is that a multi-stage codebook is used to encode the transmitted error vector. This codebook provides for the efficient storage and searching of code data. Additional stages may be added to achieve a desired level of accuracy.
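A multi-stage search can be sketched as a greedy procedure: each stage picks the codevector closest to whatever error the earlier stages left behind, and the decoder sums the selected codevectors. The tiny codebooks and function names below are hypothetical and chosen only to make the example easy to follow.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multistage_search(target, stages):
    """Greedily pick one codevector index per stage; return indices and leftover error."""
    remaining = list(target)
    indices = []
    for codebook in stages:
        best = min(range(len(codebook)),
                   key=lambda i: squared_error(remaining, codebook[i]))
        indices.append(best)
        remaining = [r - c for r, c in zip(remaining, codebook[best])]
    return indices, remaining


def multistage_decode(indices, stages):
    """Sum the selected codevectors to rebuild the approximation."""
    out = [0.0] * len(stages[0][0])
    for index, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[index])]
    return out


# Illustrative two-stage codebook (values are made up for the example).
stages = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0]],
]
indices, leftover = multistage_search([1.0, 0.5], stages)
print(indices, leftover)                   # [0, 1] [0.0, 0.0]
print(multistage_decode(indices, stages))  # [1.0, 0.5]
```

Adding a further stage to `stages` refines the approximation without enlarging the earlier codebooks, which is the storage and search advantage described above.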
Another feature of the present invention is that a warping filter is used to efficiently change the length of a first signal to match that of a second signal, where the coding operations require that the two signals be of the same length.
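As a rough illustration of length matching, the sketch below resamples a signal to a new length using linear interpolation. Linear interpolation is a stand-in chosen for brevity; the text does not specify the warping filter's design, and the function name is hypothetical.

```python
def warp_length(signal, new_length):
    """Stretch or compress signal to new_length samples by linear interpolation."""
    if new_length < 1 or not signal:
        raise ValueError("need a non-empty signal and a positive target length")
    if len(signal) == 1 or new_length == 1:
        return [signal[0]] * new_length
    step = (len(signal) - 1) / (new_length - 1)  # input samples per output sample
    out = []
    for k in range(new_length):
        pos = k * step
        i = min(int(pos), len(signal) - 2)  # left neighbour, kept in range
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out


print(warp_length([0.0, 2.0, 4.0], 5))  # [0.0, 1.0, 2.0, 3.0, 4.0]
```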
Yet another feature of the present invention is that prototype periods are extracted subject to a “cut-free” region, thereby avoiding discontinuities in the output due to splitting high energy regions along frame boundaries.
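The intent of keeping cuts away from high-energy regions can be illustrated with a hypothetical selection rule: among all candidate positions for a prototype of a given length, choose the one whose two boundaries fall where the local energy is lowest. The cost function, the `guard` parameter, and the function names are assumptions made for this sketch and do not describe the actual extraction procedure.

```python
def boundary_energy(frame, boundary, guard):
    """Energy of the samples within guard samples on either side of a cut point."""
    lo = max(0, boundary - guard)
    hi = min(len(frame), boundary + guard)
    return sum(x * x for x in frame[lo:hi])


def choose_prototype_start(frame, period, guard):
    """Pick the start index whose two cut points lie in the quietest regions."""
    if period < 1 or period > len(frame):
        raise ValueError("period must be between 1 and the frame length")
    starts = range(len(frame) - period + 1)
    return min(starts,
               key=lambda s: boundary_energy(frame, s, guard)
               + boundary_energy(frame, s + period, guard))


# Pulses at indices 0, 4 and 8; the chosen prototype keeps a pulse in its interior.
frame = [9.0, 0.0, 0.0, 0.0, 9.0, 0.0, 0.0, 0.0, 9.0, 0.0]
start = choose_prototype_start(frame, period=4, guard=1)
print(start, frame[start:start + 4])  # 2 [0.0, 0.0, 9.0, 0.0]
```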
The features, objects, and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit of a reference number identifies the drawing in which the reference number first appears.
REFERENCES:
patent: 5517595 (1996-05-01), Kleijn
patent: 5734789 (1998-03-01), Swaminathan et al.
patent: 5809459 (1998-09-01), Bergstrom et al.
patent: 5884253 (1999-03-01), Kleijn
patent: 5903866 (1999-05-01), Shoham
patent: 6092039 (2000-07-01), Zingher
patent: 6233550 (2001-05-01), Gersho et al.
patent: 6260017 (2001-07-01), Das et al.
patent: 6324505 (2001-11-01), Choy et al.
patent: 6330532 (2001-12-01), Manjunath et al.
patent: 0666557 (1995-08-01), None
patent: 0865028 (1998-09-01), None
1978 Digital Processing of Speech Signals, “Linear Predictive Coding of Speech”, L.R. Rabiner et al., pp. 411-413.
1988 Proceedings of the Mobile Satellite Conference, “A 4.8 KBPS Code Excited Linear Predictive Coder”, T. Tremain et al., pp. 491-496.
1991 Digital Signal Processing, “Methods for Waveform Interpolation in Speech Coding”, W. Bastiaan Kleijn, et al., pp. 215-230.
Burnett, et al. “A Mixed Prototype Waveform/CELP Coder for Sub 3KB/S” Proceedings of the Int'l Conf. On Acoustics, Speech and Signal Processing 2: 175-178 (Apr. 1993).
Marston, et al. “PWI Speech Coder in the Speech Domain” IEEE Workshop on Speech Coding for Coding: pp. 31-32 (1997). Abstract only.
Yang, et al. “Voiced Speech Coding At Very Low Bit Rates Based on Forward-Backward Waveform Prediction (FBWP)” Proceedings of the Int'l Conf. On Acoustics, Speech and Signal Processing 2: 179-182 (1993).
Gardner William
Manjunath Sharath
Baker Kent D.
Korzuch William
Macek Kyong H.
McFadden Susan
Qualcomm Incorporated