Patent
1993-12-06
1998-06-09
MacDonald, Allen R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
704214, 704221, 704230, G10L 300
active
057651272
DESCRIPTION:
BRIEF SUMMARY
TECHNICAL FIELD
This invention relates to a high efficiency encoding method in which input audio signals, such as voice signals or acoustic signals, are divided on a block-by-block basis and transformed into signals on the frequency axis, and the resulting data on the frequency axis are encoded.
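The block-by-block division and transformation onto the frequency axis described above can be illustrated with a minimal sketch. The function names, the 8000 Hz sampling rate, the 64-sample block size, the absence of overlap or windowing, and the use of a naive DFT are all assumptions made for illustration; the text does not specify them.

```python
import cmath
import math

def split_into_blocks(samples, block_size):
    """Split samples into consecutive non-overlapping blocks, dropping any short tail."""
    last_start = len(samples) - block_size
    return [samples[i:i + block_size] for i in range(0, last_start + 1, block_size)]

def dft_magnitudes(block):
    """Return the magnitudes of DFT bins 0..N/2 of one block (naive O(N^2) transform)."""
    n = len(block)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block)))
        for k in range(n // 2 + 1)
    ]

SAMPLE_RATE = 8000  # assumed sampling rate in Hz, not taken from the text
BLOCK_SIZE = 64     # assumed block length in samples, not taken from the text

# A 1000 Hz test tone stands in for the input audio signal.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]
blocks = split_into_blocks(tone, BLOCK_SIZE)
spectrum = dft_magnitudes(blocks[0])
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(len(blocks), peak_bin, peak_bin * SAMPLE_RATE / BLOCK_SIZE)
```

Each block yields one set of frequency-axis data, and it is these per-block data that the encoding method then quantizes.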
BACKGROUND ART
A variety of encoding methods are known in which signal compression is carried out by utilizing statistical characteristics of audio signals, including voice signals and acoustic signals, in the time domain and in the frequency domain, together with characteristics of the human auditory sense. These encoding methods are roughly divided into encoding in the time domain, encoding in the frequency domain and analysis-synthesis encoding.
As an example of high efficiency encoding of voice signals, it has been customary to carry out scalar quantization when quantizing various information data, such as spectral amplitudes or parameters thereof (for example LSP parameters, α parameters or k parameters), in partial auto-correlation (PARCOR) analysis-synthesis encoding, multi-band excitation encoding (MBE), single-band excitation encoding (SBE), harmonic encoding, sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT) or fast Fourier transform (FFT).
Meanwhile, in a voice analysis-synthesis system such as the PARCOR method, the excitation source is changed over on a block-by-block (frame-by-frame) basis on the time axis, so that voiced and unvoiced sounds cannot exist jointly within the same frame. As a result, it has been impossible to produce high-quality voices.
In MBE encoding, by contrast, the voice band within one block (frame) is divided into plural bands, and a voiced/unvoiced decision is made for each of the bands, which improves sound quality. However, MBE encoding is disadvantageous in terms of bit rate, since the voiced/unvoiced decision data obtained for each band must be transmitted separately.
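The bit-rate cost of band-wise voiced/unvoiced data can be made concrete with a rough sketch. The figures used here (12 bands, 50 frames per second, a total budget of 3000 bit/s) and the function names are hypothetical values chosen for illustration, not parameters given in the text; the sketch also assumes one uncompressed flag bit per band.

```python
def vuv_side_info_bps(flags_per_frame, frames_per_second):
    """Bits per second needed to send the given number of 1-bit V/UV flags in every frame."""
    return flags_per_frame * frames_per_second

def pack_flags(flags):
    """Pack per-band flags (True = voiced) into an integer, band 0 in the lowest bit."""
    word = 0
    for band, voiced in enumerate(flags):
        if voiced:
            word |= 1 << band
    return word

NUM_BANDS = 12          # assumed number of bands per frame
FRAMES_PER_SECOND = 50  # assumed frame rate (20 ms frames)
TOTAL_BPS = 3000        # assumed total bit budget

per_frame_decision = vuv_side_info_bps(1, FRAMES_PER_SECOND)         # one flag per frame
per_band_decision = vuv_side_info_bps(NUM_BANDS, FRAMES_PER_SECOND)  # one flag per band
print(per_frame_decision, per_band_decision, per_band_decision / TOTAL_BPS)

# One frame in which the five lowest bands are voiced and the rest unvoiced.
frame_flags = [True] * 5 + [False] * (NUM_BANDS - 5)
print(format(pack_flags(frame_flags), "012b"))
```

Under these assumed figures the per-band flags alone take a noticeable share of the total budget, which is the disadvantage the paragraph above points to.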
Also, scalar quantization becomes difficult to use when an attempt is made to lower the bit rate to, e.g., about 3 to 4 kbps in order to further increase the quantization efficiency, because the quantization noise increases.
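The growth of quantization noise as bits are removed from a scalar quantizer can be sketched as follows. This is a generic uniform quantizer applied to uniformly distributed test samples, which is an assumption made for illustration; it is not the quantizer of any of the encoding methods named above.

```python
import math
import random

def uniform_quantize(x, bits, lo=-1.0, hi=1.0):
    """Quantize x to the midpoint of one of 2**bits equal-width cells on [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    cell = min(levels - 1, max(0, int((x - lo) / step)))
    return lo + (cell + 0.5) * step

def snr_db(samples, bits):
    """Signal-to-quantization-noise ratio in dB for the given number of bits per sample."""
    signal = sum(s * s for s in samples)
    noise = sum((s - uniform_quantize(s, bits)) ** 2 for s in samples)
    return 10.0 * math.log10(signal / noise)

rng = random.Random(0)
samples = [rng.uniform(-1.0, 1.0) for _ in range(10000)]  # synthetic test data
for bits in (2, 4, 6, 8):
    print(bits, round(snr_db(samples, bits), 1))
```

For this kind of input each bit removed per quantized value lowers the signal-to-noise ratio by roughly 6 dB, so cutting the bit rate with scalar quantization alone quickly raises the noise.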
Vector quantization may be contemplated instead. However, with b denoting the number of bits of the output (index) of the vector quantization, the size of the codebook of the vector quantizer increases in proportion to 2^b, and the operation volume for codebook search also increases in proportion to 2^b. On the other hand, since an extremely small number b of output bits increases the quantization noise, it is desirable to reduce the size of the codebook or the operation volume for codebook search while keeping the bit number b reasonably large. Besides, the coding efficiency cannot be increased sufficiently if the data transformed onto the frequency axis are directly processed by vector quantization. Thus, a technique for further increasing the compression ratio is needed.
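The dependence of codebook size and search cost on the number of index bits b can be sketched with a plain full-search vector quantizer. The random codebook, the vector dimension of 8 and the function names are assumptions for illustration only; this is not the codebook structure proposed by the invention.

```python
import random

def build_codebook(bits, dim, seed=0):
    """Return a random codebook holding 2**bits code vectors of length dim."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(2 ** bits)]

def full_search(codebook, vector):
    """Return (index of nearest code vector, number of distance evaluations)."""
    best_index, best_dist = 0, float("inf")
    evaluations = 0
    for index, code in enumerate(codebook):
        dist = sum((c - v) ** 2 for c, v in zip(code, vector))  # squared Euclidean distance
        evaluations += 1
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, evaluations

DIM = 8  # assumed vector dimension
for bits in (4, 8, 10):
    codebook = build_codebook(bits, DIM)
    index, evaluations = full_search(codebook, [0.0] * DIM)
    print(bits, len(codebook), evaluations)
```

Every additional index bit doubles both the number of stored code vectors and the number of distance evaluations per input vector, which is why a smaller codebook or cheaper search at the same b is desirable.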
In view of the above-described status of the art, it is an object of the present invention to provide a high efficiency encoding method whereby the voiced/unvoiced decision data produced for each band may be transmitted with a reduced number of bits without deteriorating the sound quality.
It is another object of the present invention to provide a high efficiency encoding method whereby the size of the codebook for the vector quantizer or the operation volume for codebook search can be diminished without lowering the number of output bits of vector quantization, and whereby the compression ratio at the time of vector quantization can be increased further.
DISCLOSURE OF THE INVENTION
According to the present invention there is provided a high-efficiency encoding method comprising the steps of: dividing input voice signals into blocks with each block as a unit and transforming the voice signals into signals on a frequency axis to find corresponding data on the frequency axis; dividing the data on t
REFERENCES:
patent: 4710812 (1987-12-01), Murakami et al.
patent: 5010574 (1991-04-01), Wang
patent: 5272529 (1993-12-01), Frederiksen
patent: 5274741 (1993-12-01), Taniguchi et al.
patent: 5361323 (1994-11-01), Murata et al.
patent: 5440345 (1995-08-01), Shimoda
patent: 5473727 (1995-12-01), Nishiguchi et al.
Gersho et al., "Variable Rate Vector Quantization", Vector Quantization and Signal Compression, Kluwer Academic Publishers, pp. 127, 204-206, 461-470, 602-605, 631-640, Nov. 1991.
Gersho et al., "Vector Quantization Techniques in Speech Coding", and Pitch and Voicing Determination, Advances in Speech Signal Processing, Editors Furui and Sondhi, Dekker, pp. 3-84, Jan. 1991.
Matsumoto Jun
Nishiguchi Masayuki
Ono Shinobu
Chawan Vijay B.
MacDonald Allen R.