Reexamination Certificate
1997-09-11
2003-08-26
Banks-Harold, Marsha D. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S222000
active
06611800
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a vector quantization method in which an input vector is compared to code vectors stored in a codebook for outputting an index of an optimum one of the code vectors. The present invention also relates to a speech encoding method and apparatus in which an input speech signal is divided in terms of a pre-set encoding unit, such as a block or a frame, and encoding processing including vector quantization is carried out on the encoding unit basis.
2. Description of the Related Art
There has hitherto been known vector quantization in which, for digitizing and compression-encoding audio or video signals, a plurality of input data are grouped together into a vector and represented by a single code (index).
In such vector quantization, representative patterns of a variety of input vectors are determined in advance by, for example, learning, and are assigned codes or indices, which are then stored in a codebook. The input vector is then compared to the respective patterns (code vectors) by way of pattern matching, and the code of the pattern bearing the strongest similarity or correlation is output. This similarity or correlation is found by calculating a distortion measure or an error energy between the input vector and the respective code vectors, and becomes higher as the distortion or error becomes smaller.
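The pattern matching described above can be illustrated by a minimal sketch of an exhaustive (full) codebook search. The squared error is assumed here as the distortion measure, and the function names and the three-entry codebook are hypothetical, chosen only for illustration.

```python
def squared_error(x, c):
    """Error energy between input vector x and code vector c."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def full_search(x, codebook):
    """Return the index of the code vector with the smallest error energy."""
    return min(range(len(codebook)), key=lambda i: squared_error(x, codebook[i]))


# Hypothetical two-dimensional codebook with three code vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
index = full_search([0.9, 1.2], codebook)
print(index)  # prints 1: [1.0, 1.0] is closest to [0.9, 1.2]
```

Only the index is output, so the decoder needs the same codebook to recover the code vector.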
There have hitherto been known a variety of encoding methods which exploit statistical properties in the time domain or frequency domain and psychoacoustic properties of human hearing for signal compression. These encoding methods are roughly classified into encoding in the time domain, encoding in the frequency domain and analysis-by-synthesis encoding.
Examples of high-efficiency encoding of a speech signal include sinusoidal analytic encoding, such as harmonic encoding, as well as sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT) and fast Fourier transform (FFT).
In high-efficiency encoding of the speech signals, the above-mentioned vector quantization is used for parameters such as spectral components of the harmonics.
Meanwhile, if the number of patterns stored in the codebook, that is, the number of code vectors, is large, or if the vector quantizer is of a multi-stage configuration made up of plural codebooks combined together, the number of code vector search operations for pattern matching increases, and the processing volume increases with it. In particular, if plural codebooks are combined together, the similarity must be found a number of times equal to the product of the numbers of code vectors in the respective codebooks, which increases the codebook search processing volume significantly.
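The growth in search cost can be made concrete with a small calculation. The codebook sizes below are hypothetical and are not taken from the invention; the sketch only counts how many candidate combinations an exhaustive search would have to evaluate.

```python
from math import prod


def exhaustive_evaluations(codebook_sizes):
    """Number of code vector combinations an exhaustive search must evaluate."""
    return prod(codebook_sizes)


single = exhaustive_evaluations([32])        # one codebook of 32 code vectors
combined = exhaustive_evaluations([32, 32])  # two such codebooks combined
print(single, combined)  # prints: 32 1024
```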
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a vector quantization method, a speech encoding method and a speech encoding apparatus capable of suppressing the codebook search processing volume.
For accomplishing the above object, the present invention provides a vector quantization method including a step of finding the degree of similarity between an input vector to be vector quantized and all code vectors stored in a codebook by approximation for pre-selecting plural code vectors bearing a high degree of similarity and a step of ultimately selecting one of the plural pre-selected code vectors that minimizes an error with respect to the input vector.
Because the ultimate selection is executed after the pre-selection, the pre-selection, which involves only simplified processing, narrows the codebook down to a smaller number of candidate code vectors, and only these candidates are subjected to the high-precision ultimate selection. This reduces the processing volume for codebook searching.
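The two-step search can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that the approximate similarity is the inner product divided by the code vector norm, that the ultimate selection uses the squared error, and that all code vectors are non-zero. The function names, the codebook and the number of candidates are hypothetical.

```python
import math


def similarity(x, c):
    """Approximate similarity: inner product of x and c divided by the norm of c."""
    norm = math.sqrt(sum(ci * ci for ci in c))  # assumes c is not the zero vector
    return sum(xi * ci for xi, ci in zip(x, c)) / norm


def squared_error(x, c):
    """Error between input vector x and code vector c, used for ultimate selection."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def two_step_search(x, codebook, num_candidates):
    # Step 1 (pre-selection): rank all code vectors by the cheap similarity
    # and keep only the num_candidates most similar ones.
    ranked = sorted(range(len(codebook)),
                    key=lambda i: similarity(x, codebook[i]), reverse=True)
    candidates = ranked[:num_candidates]
    # Step 2 (ultimate selection): evaluate the error only for the candidates.
    return min(candidates, key=lambda i: squared_error(x, codebook[i]))


# Hypothetical codebook of five two-dimensional code vectors.
codebook = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0], [3.0, 4.0]]
print(two_step_search([0.9, 0.5], codebook, 2))  # prints 1
```

In this example the pre-selection keeps code vectors 1 and 4. Code vector 4 points in a similar direction to the input but is far from it, which is why the error is still evaluated on the candidates before an index is output.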
The codebook is constituted by plural codebooks from each of which can be selected plural code vectors representing an optimum combination. The degree of similarity may be an inner product of the input vector and the code vector, optionally divided by a norm or a weighted norm of each code vector.
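For a codebook made up of plural codebooks, the same idea can be sketched under additional assumptions that this section does not state: two codebooks are used, the combination is the element-wise sum of one code vector from each codebook, and the pre-selection is applied to each codebook independently. All names and values below are hypothetical.

```python
from itertools import product


def inner(a, b):
    """Inner product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def preselect(x, codebook, m):
    """Indices of the m code vectors with the highest inner product / norm."""
    def score(i):
        c = codebook[i]
        return inner(x, c) / inner(c, c) ** 0.5  # assumes non-zero code vectors
    return sorted(range(len(codebook)), key=score, reverse=True)[:m]


def search_two_codebooks(x, cb0, cb1, m):
    """Pre-select m candidates per codebook, then pick the best pair by error."""
    cand0 = preselect(x, cb0, m)
    cand1 = preselect(x, cb1, m)

    def error(pair):
        i, j = pair
        return sum((xi - (a + b)) ** 2 for xi, a, b in zip(x, cb0[i], cb1[j]))

    # Only m * m combinations are evaluated instead of len(cb0) * len(cb1).
    return min(product(cand0, cand1), key=error)


cb0 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
cb1 = [[0.5, 0.5], [0.5, -0.5], [-0.5, 0.5]]
print(search_two_codebooks([1.4, 0.6], cb0, cb1, 2))  # prints (0, 0)
```

With two candidates per codebook, the error is evaluated for 4 combinations rather than for all 9.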
The present invention also provides a speech encoding method in which an input speech signal or short-term prediction residuals thereof are analyzed by sinusoidal analysis to find spectral components of the harmonics and in which parameters derived from the encoding-unit-based spectral components of the harmonics, as the input vector, are vector quantized for encoding. In the vector quantization, the degree of similarity between the input vector and all code vectors stored in a codebook is found by approximation for pre-selecting a smaller plural number of the code vectors having a high degree of similarity, and one of these pre-selected code vectors which minimizes an error with respect to the input vector is selected ultimately.
The degree of similarity may be an optionally weighted inner product between the input vector and the code vector, optionally divided by a norm or a weighted norm of each code vector. For weighting the norm, a weight whose energy is concentrated in the low frequency range and decreases towards the high frequency range may be used. Thus, the degree of similarity can be found by dividing the weighted inner product of the input vector and the code vector by the weighted norm of the code vector.
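A weighted degree of similarity of this kind might be computed as in the sketch below. The exact form of the weighting is an assumption: a per-component weight is applied to both vectors before the inner product is taken, and the result is divided by the norm of the weighted code vector. The weight values, which decrease towards the higher components, are hypothetical.

```python
import math


def weighted_similarity(x, c, w):
    """Weighted inner product of x and c divided by the weighted norm of c."""
    wx = [wi * xi for wi, xi in zip(w, x)]
    wc = [wi * ci for wi, ci in zip(w, c)]
    norm = math.sqrt(sum(v * v for v in wc))  # assumes the weighted c is non-zero
    return sum(a * b for a, b in zip(wx, wc)) / norm


# Hypothetical weight: largest for the lowest component, decreasing upwards.
w = [1.0, 0.5, 0.25]
x = [1.0, 1.0, 1.0]
low = [1.0, 0.0, 0.0]   # code vector with energy in the lowest component
high = [0.0, 0.0, 1.0]  # code vector with energy in the highest component

print(weighted_similarity(x, low, w))   # prints 1.0
print(weighted_similarity(x, high, w))  # prints 0.25
```

With such a weight, a match in the low range contributes more to the degree of similarity than an equally good match in the high range.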
The present invention is also directed to a speech encoding device for carrying out the speech encoding method.
REFERENCES:
patent: 5307441 (1994-04-01), Tzeng
patent: 5451951 (1995-09-01), Elliott et al.
patent: 5677986 (1997-10-01), Amada et al.
patent: 5774838 (1998-06-01), Miseki et al.
patent: 5778335 (1998-07-01), Ubale et al.
patent: 5819213 (1998-10-01), Oshikiri et al.
patent: 5890110 (1999-03-01), Gersho et al.
patent: 5926788 (1999-07-01), Nishiguchi
patent: 5950155 (1999-09-01), Nishiguchi
patent: 5960386 (1999-09-01), Janiszewski et al.
patent: 6003001 (1999-12-01), Maeda
patent: 6018707 (2000-01-01), Nishiguchi et al.
patent: 0770989 (1996-10-01), None
Trancoso et al., “High Quality Mid-Rate Speech Coding,” Electrotechnical Conference, 1989. Proceedings. ‘Integrated Research, Industry and Education in Energy and Communication Engineering,’ MELECON '89., Mediterranean, pp. 217-220, Apr. 1989.*
Nagaratnam et al., “Spectral Magnitude Modelling for Sinusoidal Coding,” 1995 IEEE Workshop on Speech Coding for Telecommunications, pp. 81-82, Sep. 1995.*
Das et al., “Variable-dimension vector quantization of speech spectra for low-rate vocoders,” DCC '94 Proceedings, Data Compression Conference, Mar. 1994, pp. 420 to 429.*
Akitoshi Kataoka, et al., “An 8-kbit/s Speech Coder Based On Conjugate Structure CELP,” IEEE, Apr. 27, 1993.
Masayuki Nishiguchi, et al., “Harmonic and Noise Coding of LPC Residuals With Classified Vector Quantization,” IEEE, May 9, 1995.
M. Elshafei, et al., “Fast Methods for Code Search in CELP,” IEEE, Jul. 1993.
Iijima Kazuyuki
Matsumoto Jun
Nishiguchi Masayuki
Banks-Harold Marsha D.
Lerner Martin
Maioli Jay H.
Sony Corporation