Reissue Patent
1999-10-21
2003-10-07
Chawan, Vijay (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S214000, C704S213000, C704S208000, C704S219000, C704S223000
active
RE038269
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to enhanced speech coding techniques for low-rate speech coders, and particularly, to improved speech frame analysis and vector quantization methods.
BACKGROUND OF THE INVENTION
A low-bit-rate speech coder is disclosed in U.S. Pat. No. 4,975,956, issued to Y. J. Liu and J. H. Rothweiler, entitled “Low-Bit-Rate Speech Coder Using LPC Data Reduction Processing”, which is incorporated herein by reference. This speech coder employs linear predictive coding (LPC) analysis to generate reflection coefficients for the input speech frames and pitch and gain parameters. To obtain a low bit rate of 400 bps, these parameters are further compressed. The reflection coefficients are first converted to line spectrum frequencies (LSFs) and formants. For even frames, these spectral parameters are vector-quantized into clean codeword indices. Odd frames are omitted, and are regenerated by interpolation at the decoder end. The vector quantization module compares the spectral parameters for an input word against a vocabulary of codewords for which vector indices have been generated and stored during a training sequence, and the optimally matching codeword is selected for transmission. Pitch and gain bits are quantized using trellis coding. Output speech is reconstructed from the regenerated vector-quantization indices using a matching codebook at the decoder end.
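The even-frame quantization and odd-frame regeneration described above can be sketched as follows. This is a minimal illustration, not the coder of U.S. Pat. No. 4,975,956: the squared-Euclidean distance, the midpoint interpolation rule, and all function names are assumptions made for the example, and the spectral vectors are treated as plain lists of numbers.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to vec."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vec, codebook[i]))


def encode_even_frames(frames: Sequence[Vector],
                       codebook: Sequence[Vector]) -> List[int]:
    """Quantize frames 0, 2, 4, ... and drop the odd frames."""
    return [nearest_codeword(f, codebook) for f in frames[::2]]


def decode_with_interpolation(indices: Sequence[int],
                              codebook: Sequence[Vector]) -> List[List[float]]:
    """Look up each transmitted codeword and regenerate the omitted frame
    between two neighbours as their midpoint (assumed interpolation rule)."""
    decoded: List[List[float]] = []
    for k, idx in enumerate(indices):
        current = list(codebook[idx])
        decoded.append(current)
        if k + 1 < len(indices):
            following = codebook[indices[k + 1]]
            decoded.append([(a + b) / 2 for a, b in zip(current, following)])
    return decoded


# Hypothetical two-dimensional codebook and five input frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
frames = [[0.1, -0.1], [9.0, 9.0], [3.6, 4.2], [9.0, 9.0], [0.9, 1.2]]
indices = encode_even_frames(frames, codebook)
restored = decode_with_interpolation(indices, codebook)
```

Only the indices are transmitted; the decoder holds a matching copy of the codebook. In this sketch a trailing odd frame with no following even frame is simply not regenerated.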
In a quiet background, this 400-bps speech coder has high intelligibility for a low-bit-rate transmission. However, in a background of high noise, such as in a helicopter or jet, the encoded speech becomes unintelligible. A detailed study has shown that conversion of voicing and spectral parameters in the high-noise environment is the key to the loss of intelligibility. The LPC conversion causes a majority of voiced frames to become unvoiced. The result is whispered-sounding LPC speech and an almost inaudible low-rate voice. Even if the voicing is correct, spectral distortion causes the low-rate voice to be significantly muffled and buzzy. Although the pitch has no audible errors, the gain has a predominantly annoying effect.
SUMMARY OF THE INVENTION
It is therefore a principal object of the invention to provide an improved low-bit-rate speech coder capable of high quality speech coding in a high-noise environment. In accordance with the invention, a two-step approach to conversion of voicing and spectral parameters is taken. In the first step, robust speech frame features whose distributions are not strongly affected by noise levels are generated. In the second step, linear programming is used to determine an optimum combination of these features. A technique of adaptive vector quantization is also used in which a clean codebook is updated based upon an estimate of the background noise levels, and the “noisy” codebook is then searched for the best match with an input speech vector. The corresponding clean codeword is then selected for transmission and for synthesis at the receiver end. The results are better spectral reproduction and significant intelligibility enhancement over the previous coding approach.
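The adaptive vector quantization idea (search a noise-adjusted codebook, transmit the matching clean codeword) can be sketched as below. Everything specific here is an assumption for illustration: vectors are taken to be per-band power values, noise is assumed additive in power so that each "noisy" codeword is the clean codeword plus the noise estimate, and a log-spectral distance is used for matching. The source does not specify these details in this section, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def make_noisy_codebook(clean_codebook: Sequence[Vector],
                        noise_power: Vector) -> List[List[float]]:
    """Shift every clean codeword by the estimated noise power per band
    (assumed additive-noise update rule)."""
    return [[c + n for c, n in zip(codeword, noise_power)]
            for codeword in clean_codebook]


def log_spectral_distance(a: Vector, b: Vector) -> float:
    """Squared distance between log band powers (all values must be > 0)."""
    return sum((math.log(x) - math.log(y)) ** 2 for x, y in zip(a, b))


def best_match(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codeword with the smallest log-spectral distance."""
    return min(range(len(codebook)),
               key=lambda i: log_spectral_distance(vec, codebook[i]))


def encode_adaptive(noisy_input: Vector,
                    clean_codebook: Sequence[Vector],
                    noise_power: Vector) -> Tuple[int, List[float]]:
    """Search the noisy codebook, then return the index together with the
    corresponding clean codeword used for transmission and synthesis."""
    noisy_codebook = make_noisy_codebook(clean_codebook, noise_power)
    index = best_match(noisy_input, noisy_codebook)
    return index, list(clean_codebook[index])


# Hypothetical two-band example: the talker produced codeword 0, but
# strong noise in band 0 has been added to the observed vector.
clean_codebook = [[1.0, 8.0], [8.0, 1.0], [4.0, 4.0]]
noise_power = [6.0, 0.5]
observed = [1.0 + 6.0, 8.0 + 0.5]

naive_index = best_match(observed, clean_codebook)
adaptive_index, clean_codeword = encode_adaptive(
    observed, clean_codebook, noise_power)
```

In this toy case the unadapted search picks the wrong codeword because the noise flattens the observed spectrum, while the search against the noise-adjusted codebook recovers the intended index. The decoder never sees the noisy codebook; it only needs the clean one.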
In a preferred implementation of the system for a helicopter environment, the following features are found to be well distributed to allow good discrimination between voiced and unvoiced speech: (1) low-band energy; (2) zero-crossing counts adapted for noise level; (3) AMDF ratio (speech periodicity) measure; (4) low-pass filtered, backward correlation; (5) low-pass filtered, forward correlation; (6) inverse-filtered backward correlation; and (7) inverse-filtered pitch prediction gain measure. By linear programming analysis, five of these robust features are determined to significantly improve voicing decisions in the speech coder system. Adaptive vector quantization, using estimates of the average noise amplitude and average noise reflection coefficients to update codebook vectors, significantly improves input vector matching.
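Three of the listed features, and a linear voicing decision over them, can be sketched as follows. This is an illustration under stated assumptions rather than the patented method: the one-pole low-pass filter, the dead-zone form of noise adaptation for zero crossings, the min/max definition of the AMDF ratio, and especially the weights and bias are hypothetical. The source obtains its feature combination by linear programming, which is not reproduced here.

```python
import math
import random
from typing import List, Sequence


def low_band_energy(frame: Sequence[float], alpha: float = 0.9) -> float:
    """Mean energy after a one-pole low-pass filter (assumed filter)."""
    y, total = 0.0, 0.0
    for x in frame:
        y = alpha * y + (1.0 - alpha) * x
        total += y * y
    return total / len(frame)


def zero_crossings(frame: Sequence[float], noise_level: float = 0.0) -> int:
    """Count sign changes, skipping samples whose magnitude does not
    exceed noise_level (assumed form of noise adaptation)."""
    count, prev_sign = 0, 0
    for s in frame:
        if abs(s) <= noise_level:
            continue
        sign = 1 if s > 0 else -1
        if prev_sign and sign != prev_sign:
            count += 1
        prev_sign = sign
    return count


def amdf(frame: Sequence[float], lag: int) -> float:
    """Average magnitude difference function at one lag."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def amdf_ratio(frame: Sequence[float], min_lag: int, max_lag: int) -> float:
    """Minimum over maximum AMDF in the lag range (assumed definition);
    values near 0 indicate strong periodicity."""
    values = [amdf(frame, lag) for lag in range(min_lag, max_lag + 1)]
    highest = max(values)
    return min(values) / highest if highest > 0 else 1.0


def is_voiced(features: Sequence[float], weights: Sequence[float],
              bias: float) -> bool:
    """Linear decision rule: voiced when the weighted sum is positive."""
    return sum(w * f for w, f in zip(weights, features)) + bias > 0


def extract(frame: Sequence[float]) -> List[float]:
    return [low_band_energy(frame),
            float(zero_crossings(frame, noise_level=0.05)),
            amdf_ratio(frame, 20, 60)]


# Hypothetical weights and bias, chosen by hand for this example only.
WEIGHTS, BIAS = (2.0, -0.05, -4.0), 1.0

# Synthetic test frames: a periodic tone and seeded white noise.
voiced_frame = [math.sin(2 * math.pi * i / 40) for i in range(160)]
rng = random.Random(0)
unvoiced_frame = [rng.uniform(-1.0, 1.0) for _ in range(160)]

voiced_decision = is_voiced(extract(voiced_frame), WEIGHTS, BIAS)
unvoiced_decision = is_voiced(extract(unvoiced_frame), WEIGHTS, BIAS)
```

A periodic frame yields few zero crossings and an AMDF ratio near zero, while a noise-like frame yields many crossings and a ratio closer to one, so even hand-picked weights separate the two synthetic cases. Real speech in helicopter noise would need weights fitted to labeled data.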
REFERENCES:
patent: 4074069 (1978-02-01), Tokura et al.
patent: 4091237 (1978-05-01), Wolnowsky et al.
patent: 4296279 (1981-10-01), Stork
patent: 4589131 (1986-05-01), Horvath et al.
patent: 4630304 (1986-12-01), Borth et al.
patent: 4696038 (1987-09-01), Doddington et al.
patent: 4720802 (1988-01-01), Damoulakis et al.
patent: 4933973 (1990-06-01), Porter
patent: 4975956 (1990-12-01), Liu et al.
patent: 5073940 (1991-12-01), Zinser et al.
patent: 5127053 (1992-06-01), Koch
patent: 5459814 (1995-10-01), Gupta et al.
patent: 5806024 (1998-09-01), Ozawa
patent: 6018707 (2000-01-01), Nishiguchi et al.
patent: 6081776 (2000-06-01), Grabb et al.
Rabiner et al., (“Digital Processing of Speech Signals,” Prentice Hall, Upper Saddle River, NJ, pp. 130-133, 451-452, Dec. 1978).*
Deller et al., (“Discrete-Time Processing of Speech Signals,” Prentice Hall, Upper Saddle River, NJ, pp. 244-251, 471-473, Dec. 1987).*
Wolfgang Hess, (“Pitch Determination of Speech Signals”, pp. 373-383, Springer-Verlag, NY, 1983).*
Siegel, (“A Procedure for Using Pattern Classification Techniques to Obtain a Voiced/Unvoiced Classifier”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-27, No. 1, Feb. 1979).*
Chen et al., “Robust Vector Quantization Based on Spectral Mapping with a Noise Estimate”, IEEE Pacific RIM Conference on Communications, Conference date: Jun. 1, 1989.
Kou et al., “Vector-Adaptive Vector Quantization with Application to Speech Coding”, IEEE Transactions on Communications, vol. 39, No. 6, Jun. 1991.
Chawan Vijay
Hale and Dorr LLP
ITT Manufacturing Enterprises Inc.