Reexamination Certificate
2002-09-05
2003-10-07
Dorvil, Richemond (Department: 2697)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S219000
active
06631347
ABSTRACT:
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based upon and claims priority from Korean Patent Application No. 2002-25401 filed May 8, 2002, the contents of which are incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to coding technology for speech signals, and more particularly, to a vector quantization and decoding apparatus providing high encoding efficiency for speech signals and method thereof.
2. Description of the Related Art
To obtain low-bit-rate coding capable of preventing degradation of the quality of sound, vector quantization is preferred over scalar quantization because the former has memory, space-filling and shape advantages.
Conventional vector quantization techniques for speech signals include direct vector quantization (hereinafter referred to as DVQ) and code-excited linear prediction (hereinafter referred to as CELP) coding.
If the signal statistics are given, DVQ provides the highest coding efficiency. However, the time-varying signal statistics of a speech signal require a very large number of codebooks. This makes the storage requirements of DVQ unmanageable.
CELP uses a single codebook. Thus, unlike DVQ, CELP does not require large storage. The CELP algorithm consists of extracting linear prediction (hereinafter referred to as LP) coefficients from an input speech signal, constructing trial speech signals from the code vectors stored in the codebook using a synthesis filter whose filtering characteristic is determined by the extracted LP coefficients, and searching for the code vector whose trial speech signal is most similar to the input speech signal.
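The CELP search described above can be sketched as follows. This is a minimal illustration, not the coder of any particular standard: it assumes the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p for the LP polynomial, uses plain squared error as the similarity measure, and omits the gain term and perceptual weighting that practical CELP coders add. The function names `synthesize` and `celp_search` are hypothetical.

```python
def synthesize(excitation, lp_coeffs):
    """Pass an excitation through the all-pole synthesis filter 1/A(z).

    Assumed convention: A(z) = 1 + a1*z^-1 + ... + ap*z^-p,
    so s[n] = e[n] - sum_k a_k * s[n-k].
    """
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                s -= a * out[n - k]
        out.append(s)
    return out


def celp_search(target, codebook, lp_coeffs):
    """Return (index, squared_error) of the code vector whose synthesized
    trial signal is closest to the target frame (no gain, no weighting)."""
    best_index, best_error = -1, float("inf")
    for i, code in enumerate(codebook):
        trial = synthesize(code, lp_coeffs)
        error = sum((t - y) ** 2 for t, y in zip(target, trial))
        if error < best_error:
            best_index, best_error = i, error
    return best_index, best_error
```

Note that the search compares the target with the filtered code vectors, not with the code vectors themselves, which is why the shape of the stored codebook does not carry over to the trial signals.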
For CELP, the Voronoi-region shape of the code vectors stored in the codebooks may be nearly spherical, as shown in FIG. 1A for the two-dimensional case, while the trial speech signals constructed by a synthesis filter do not have a spherical Voronoi-region shape, as shown in FIG. 1B. Therefore, CELP does not sufficiently utilize the space-filling and shape advantages of vector quantization.
SUMMARY OF THE INVENTION
To solve the above-described problems, it is an objective of the present invention to provide a vector quantization and decoding apparatus and method that can sufficiently utilize the advantages of vector quantization when coding speech signals.
Another objective of the present invention is to provide a vector quantization and decoding apparatus and method in which an input speech signal is quantized with modest calculation and storage requirements, by vector-quantizing the speech signal using code vectors obtained by the Karhunen-Loève Transform (KLT).
Still another objective of the present invention is to provide a KLT-based classified vector quantization and decoding apparatus by which the Voronoi-region shape for a speech signal is kept nearly spherical, and a method thereof.
In order to achieve the above objectives, the present invention provides a vector quantization apparatus including a codebook group, a KLT unit, first and second selection units, and a transmission unit. The codebook group has a plurality of codebooks that store the code vectors for a speech signal obtained by KLT, and the codebooks are classified according to KLT-domain statistics of the speech signal. The KLT unit transforms an input speech signal to a KLT domain. The first selection unit selects an optimal codebook from the codebooks on the basis of the eigenvalue set for the covariance matrix of the input speech signal obtained by the KLT. The second selection unit selects an optimal code vector on the basis of the distortion between each of the code vectors stored in the selected codebook and the speech signal transformed to a KLT domain by the KLT unit. The transmission unit transmits the index of the optimal code vector to the decoding side so that the optimal code vector is used as the vector quantization data for the input speech signal.
Each codebook is associated with a signal class on the basis of the eigenvalues of the covariance matrix of the speech signal. The KLT unit performs the following operations. First, the KLT unit calculates the linear prediction (LP) coefficients of the input speech signal, obtains a covariance matrix using the LP coefficients, and calculates a set of eigenvalues for the covariance matrix and the eigenvectors corresponding to the eigenvalues. Then, the KLT unit obtains an eigenvalue matrix based on the eigenvalue set and a unitary matrix based on the eigenvectors. Thereafter, the KLT unit obtains a KLT-domain representation of the input speech signal using the unitary matrix.
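The operations of the KLT unit can be sketched as follows. This summary does not state how the covariance matrix is built from the LP coefficients, so the sketch makes a labeled assumption: the covariance is modeled as R = H H^T, where H is the lower-triangular Toeplitz matrix of the impulse response of the synthesis filter 1/A(z), with A(z) = 1 + a1 z^-1 + ... + ap z^-p. The eigen-decomposition uses a plain cyclic Jacobi iteration so that the example needs only the standard library. All function names are hypothetical, and the eigenvalue matrix is represented simply as a sorted list of its diagonal entries.

```python
import math


def impulse_response(lp_coeffs, length):
    """First `length` samples of the impulse response of 1/A(z)."""
    h = []
    for n in range(length):
        s = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                s -= a * h[n - k]
        h.append(s)
    return h


def covariance_from_lp(lp_coeffs, size):
    """Assumed model: R = H H^T with H[i][k] = h[i-k] for k <= i."""
    h = impulse_response(lp_coeffs, size)
    return [[sum(h[i - k] * h[j - k] for k in range(min(i, j) + 1))
             for j in range(size)] for i in range(size)]


def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues (descending) and eigenvector matrix of a symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0
                         else 0.5 * math.atan(2 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):  # right-multiply by the rotation J
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # left-multiply a by J^T
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    eigenvalues = [a[i][i] for i in order]
    unitary = [[v[r][i] for i in order] for r in range(n)]  # columns = eigenvectors
    return eigenvalues, unitary


def klt_from_lp(lp_coeffs, frame):
    """Return (eigenvalues, U, y) where y = U^T x is the KLT-domain frame."""
    size = len(frame)
    eigenvalues, u = jacobi_eigen(covariance_from_lp(lp_coeffs, size))
    y = [sum(u[i][j] * frame[i] for i in range(size)) for j in range(size)]
    return eigenvalues, u, y
```

Because U is unitary, the transform preserves the energy of the frame, and the original frame can be recovered as x = U y.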
Preferably, the first selection unit selects a codebook with an eigenvalue set similar to the eigenvalue set calculated by the KLT unit. Preferably, the second selection unit selects the code vector having the minimum distortion value as the optimal code vector.
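The two selection steps can be sketched as follows. The text does not define how similarity between eigenvalue sets or distortion is measured, so this sketch assumes squared Euclidean distance for both. The function names, and the idea of storing one representative eigenvalue set per codebook in `class_eigenvalue_sets`, are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_codebook(eigenvalues, class_eigenvalue_sets):
    """First selection: index of the class whose representative
    eigenvalue set is closest to the eigenvalues of the input frame."""
    return min(range(len(class_eigenvalue_sets)),
               key=lambda c: squared_distance(eigenvalues,
                                              class_eigenvalue_sets[c]))


def select_code_vector(klt_vector, codebook):
    """Second selection: (index, distortion) of the code vector with
    minimum distortion relative to the KLT-domain frame."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(klt_vector, codebook[i]))
    return index, squared_distance(klt_vector, codebook[index])


def quantize(klt_vector, eigenvalues, codebooks, class_eigenvalue_sets):
    """Run both selections and return (class_index, vector_index, distortion)."""
    class_index = select_codebook(eigenvalues, class_eigenvalue_sets)
    vector_index, distortion = select_code_vector(klt_vector,
                                                  codebooks[class_index])
    return class_index, vector_index, distortion
```

Only the selected codebook is searched, so the search cost depends on the size of one codebook rather than on the combined size of all of them.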
In order to achieve the above objectives, the present invention also provides a vector quantization method for speech signals in a system including a plurality of codebooks that store the code vectors for a speech signal. According to this method, an input speech signal is transformed to a KLT domain. A codebook corresponding to the input speech signal is selected from the codebooks on the basis of the eigenvalue set of the covariance matrix of the input speech signal detected according to the KLT of the input speech signal. An optimal code vector is selected on the basis of the distortion value between each of the code vectors stored in the selected codebook and the KL-transformed speech signal. The selected code vector is transmitted so that it is used as a vector quantization value for the input speech signal.
The KLT-based transformation of an input speech signal is performed by the following steps. First, the LP coefficients of the input speech signal are estimated. Then, the covariance matrix for the input speech signal is obtained, and the eigenvalues for the covariance matrix and the eigenvectors for the eigenvalues are calculated. The unitary matrix for the speech signal is also obtained using the eigenvector set. The input speech signal is transformed to a KLT domain using the unitary matrix.
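The first step above, estimating the LP coefficients, is commonly done with the autocorrelation method and the Levinson-Durbin recursion. The text does not say which estimator is used, so the sketch below is an assumption rather than the method of the invention. It uses the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing applied)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the LP coefficients a_1..a_order.

    Returns (coeffs, prediction_error), where coeffs follow the assumed
    convention A(z) = 1 + a_1*z^-1 + ... + a_order*z^-order.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or degenerate frame: keep remaining coefficients at 0
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
    return a[1:], error
```

A typical call would be `levinson_durbin(autocorrelation(frame, 10), 10)` for a tenth-order predictor; the order of 10 is only an example value.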
Preferably, the selected codebook is a codebook that corresponds to an eigenvalue set similar to the estimated eigenvalue set. Preferably, a code vector having a minimum distortion is selected as the optimal code vector.
Kim Moo Young
Kleijn Willem Bastiaan
Burns Doane Swecker & Mathis L.L.P.
Dorvil Richemond
Patel Kinari
Samsung Electronics Co., Ltd.