Efficient quantization of speech spectral amplitudes based...

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C704S207000, C704S230000

Reexamination Certificate

active

06377914

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention is directed to low bit rate (4.8 kb/s and below) speech coding, and particularly to a robust and efficient quantization scheme for use in such coding.
The number of harmonic magnitudes that must be quantized and transmitted for a given speech frame is a function of the estimated pitch period. This figure can vary from 8 harmonics in the case of high pitched speaker to as much as 80 for an extremely low pitched speaker. For the ITU 4 kb/s toll quality speech coding algorithm, there are only 80 bits available to quantize the whole speech model parameters (LSF coefficients, Pitch, Voicing information, and Spectral Amplitudes or Harmonic Magnitudes). For this purpose, only 21 bits are available to quantize 2 sets of spectral amplitudes (2 frames). Straightforward quantization schemes do not provide enough degree of transmission efficiency with the desired performance. Efficient quantization of the variable dimension spectral vectors is a crucial issue in low bit rate harmonic speech coders.
Recently, several techniques have been developed for the quantization of variable dimension spectral vectors. In R. J. McAulay and T. F. Quatieri “Sinusoidal Coding”, in Speech Coding and Synthesis (W. B. Kleijn and K. K. Paliwal, edts.), Amsterdam, Elsevier Science Publishers, 1995, and S. Yeldener, A. M. Kondoz, B. G. Evans “Multi-Band Linear Predictive Speech Coding at Very Low Bit Rates” IEEE Proc. Vis. Image and Signal Processing, October 1994, Vol. 141, No. 5, pp. 289-295, an all-pole (LP) model is used to approximate the spectral envelope using a fixed number of parameters. These parameters can be quantized using fixed dimension Vector Quantization (VQ). In Band Limited Interpolation (BLI), e.g., described by M. Nishignchi, J. Matsumoto, R. Walcatsuld and S. Ono “Vector Quantized MBE with simplified V/LV decision at 3 Kb/s”, Proc. of ICASSP-93, pp. II-151-154, the variable dimension vectors are converted into fixed dimension vectors by sampling rate conversion which preserves the shape of the spectral envelope. The concept of spectral bins for the dimension conversion is employed in variable dimension vector quantization (VDVQ), described by A. Das, A. V. Rao, A. Gersho “Variable Dimension Vector Quantization of Speech Spectra for Low Rate Vocoders” Proc. of Data Compression Conf. Pp. 421-429, 1994. In VDVQ, the spectral axis is divided into segments, or bins and each spectral sample is mapped onto the closest spectral bin to form a fixed dimension vector for quantization. A truncation method (P. Hedelin “A tone oriented voice excited vocoder” Proc. of ICASSP-81, pp. 205-208, and a zero padding method (E. Shlomot, V. Cuperman and A. Gersho “Combined Harmonic and Waveform Coding of Speech at Low Bit Rates” Proc. ICASSP-98, pp. 585-588) convert the variable dimension vector to a fixed dimension vector by simply truncating or zero padding, respectively. Another method for the quantization of the spectral amplitudes is the linear dimension conversion which is called non-square transform VQ (NSTVQ), described by P. Lupini, V. Cuperman “Vector Quantization of harmonic magnitudes for low rate speech coders” Proc. IEEE Globecorn, 1994.
All of these schemes mentioned above are not very efficient methods to quantize the spectral amplitudes with a minimal distortion using only a few bits.
SUMMARY OF THE INVENTION
It is an object of the invention to provide an improved method of quantizing spectral amplitudes, to provide a higher degree of transmission efficiency and performance.
In accordance with this invention, two consecutive frames are grouped and quantized together. The spectral amplitude gain for the second sub-frame is quantized using a 5-bit non-uniform scalar quantizer. Next, the shape of the spectral harmonic amplitudes are split into odd and even harmonic amplitude vectors. The odd vector is converted to LOG and then DCT domain, and then quantized using 8 bits. The even vector is converted to LOG and then used to generate a difference vector relative to the quantized odd LOG vector and the difference vector, and this difference vector is then quantized using 5 bits. Since the vector quantizations for spectral amplitudes can be done in the DCT domain, a weighting can be used that gives more emphasis to the low order DCT coefficients than the higher order ones. In the end, a total of 18 bits are used for spectral amplitudes of the second frame.
The spectral amplitudes for the first frame are quantized based on optimal linear interpolation techniques using the spectral amplitudes of the previous and next frames. Since the spectral amplitudes have variable dimension from one frame to the next, an interpolation algorithm is used to convert variable dimension spectral amplitudes into a fixed dimension. Further interpolation between the spectral amplitude values of the previous and next frames yields multiple sets of interpolated values, and comparison of these to the original interpolated (i.e., fixed dimension) spectral amplitude values for the current frame yields an error signal. The best interpolated spectral amplitudes are then chosen in accordance with a mean squared error (MSE) approach, and the chosen amplitude values (or an index representing the same) are quantized using three bits.


REFERENCES:
patent: 5495555 (1996-02-01), Swaminathan
patent: 5504833 (1996-04-01), George et al.
patent: 5577159 (1996-11-01), Shoham
patent: 5583888 (1996-12-01), Ono
patent: 5623575 (1997-04-01), Fette et al.
patent: 5630011 (1997-05-01), Lim et al.
patent: 5809455 (1998-09-01), Nishuguchi et al.
patent: 5832437 (1998-11-01), Nishiguchi et al.
patent: 6018707 (2000-01-01), Nishiguchi et al.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Efficient quantization of speech spectral amplitudes based... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Efficient quantization of speech spectral amplitudes based..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Efficient quantization of speech spectral amplitudes based... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2881598

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.