Image analysis – Image compression or coding
Reexamination Certificate
1999-11-19
2003-12-09
Couso, Jose L. (Department: 2621)
active
06661923
ABSTRACT:
TECHNICAL FIELD
This invention relates to a coding device and method for generating a code string by changing the compression rate of a code string generated by code string generation processing in accordance with limitation of the capacity of a transmission line or the like. The invention also relates to a decoding device and method for decoding a code string having the compression rate changed in accordance with the coding device and method. The invention also relates to a program recording medium for recording the coding method and the decoding method as software programs. The invention further relates to a data recording medium in which a code string having the compression rate changed in accordance with the coding method is recorded.
BACKGROUND ART
There are various techniques for high-efficiency coding of audio signals (including speech signals). One known technique is subband coding (SBC), a non-blocked frequency subband coding system that splits audio signals on the time base into a plurality of frequency bands and codes each band without blocking the signals. Another is the blocked frequency subband coding system, the so-called transform coding system, which converts signals on the time base to signals on the frequency base (spectrum conversion), splits them into a plurality of frequency bands, and codes the signals of each band. A high-efficiency coding technique that combines subband coding and transform coding has also been considered. In this case, after band splitting is carried out by subband coding, the signals of each band are spectrum-converted to signals on the frequency base, and the spectrum-converted signals of each band are coded.
As a filter for the above-described band splitting, a QMF (quadrature mirror filter) is employed. This QMF filter is described in R. E. Crochiere, Digital coding of speech in subbands, Bell Syst. Tech. J. Vol. 55, No. 8, 1976. Also, a bandwidth filter splitting technique is described in Joseph H. Rothweiler, Polyphase Quadrature filters—A new subband coding technique, ICASSP 83, BOSTON.
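As a rough illustration of two-band QMF splitting, the following sketch uses the 2-tap Haar pair, which is the shortest filter pair satisfying the quadrature mirror relation. This is an assumption made for brevity; the cited references and practical coders use longer filters. The function names are hypothetical.

```python
import math

def qmf_analysis(x):
    """Split an even-length signal into low and high bands, each decimated by 2.

    Uses the Haar pair h0 = [s, s], h1 = [s, -s] with s = 1/sqrt(2),
    chosen here only because it is the simplest valid QMF pair.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(len(x) // 2)
    low = [s * (x[2 * i] + x[2 * i + 1]) for i in pairs]
    high = [s * (x[2 * i] - x[2 * i + 1]) for i in pairs]
    return low, high

def qmf_synthesis(low, high):
    """Recombine the two decimated bands into the full-rate signal."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for lo, hi in zip(low, high):
        out.append(s * (lo + hi))
        out.append(s * (lo - hi))
    return out
```

Each band carries half as many samples as the input, so the total number of samples is unchanged by the split, and synthesis restores the input up to floating-point rounding.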
As the above-described spectrum conversion, there is known spectrum conversion in which input audio signals are blocked on the basis of a predetermined unit time (frame) and converted from the time base to the frequency base by carrying out discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified discrete cosine transform (MDCT) for each block. MDCT is described in J. P. Princen, A. B. Bradley, Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, Univ. of Surrey, Royal Melbourne Inst. of Tech., ICASSP 1987.
When the signals split into bands by filtering or spectrum conversion are quantized in this way, the band in which quantization noise is generated can be controlled, and auditorily more efficient coding can be carried out by exploiting characteristics such as the masking effect. If, before quantization, each band is normalized by the maximum absolute value of the signal components in that band, still more efficient coding can be carried out.
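The per-band normalization described above can be sketched as follows. The uniform mid-tread quantizer, the function names, and the convention that a band with fewer than 2 bits is coded as all zeros are assumptions for illustration only, not details taken from this specification.

```python
def normalize_and_quantize(band, bits):
    """Normalize a band by its peak magnitude, then quantize uniformly.

    Returns (scale, codes). With bits >= 2 the codes are integers in
    [-(2**(bits-1) - 1), 2**(bits-1) - 1]. The scale factor would be
    transmitted alongside the codes (an assumption of this sketch).
    """
    scale = max(abs(x) for x in band)
    if scale == 0.0 or bits < 2:
        return scale, [0] * len(band)
    levels = 2 ** (bits - 1) - 1
    codes = [round(x / scale * levels) for x in band]
    return scale, codes

def dequantize(scale, codes, bits):
    """Invert normalize_and_quantize up to the quantization error."""
    if scale == 0.0 or bits < 2:
        return [0.0] * len(codes)
    levels = 2 ** (bits - 1) - 1
    return [c / levels * scale for c in codes]
```

Because every component is divided by the band's peak, the full range of the quantizer is used regardless of how loud or quiet the band is, and the reconstruction error is bounded by half a quantization step relative to that peak.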
With respect to the frequency splitting width for quantizing each frequency component obtained by frequency band splitting, for example, band splitting in consideration of human auditory characteristics is carried out. Specifically, audio signals are split into a plurality of bands (for example, 25 bands) with a bandwidth broader in higher frequency areas, generally referred to as critical bands. In coding the data of each band in this case, predetermined bit distribution for each band or adaptive bit allocation for each band is carried out. For example, in coding coefficient data obtained by MDCT processing by using bit allocation, the MDCT coefficient data of each band obtained by MDCT processing for each block is coded with an adaptive number of allocated bits. Two techniques for such bit allocation are known.
One technique is disclosed in R. Zelinski and P. Noll, Adaptive Transform Coding of Speech Signals, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977. In this technique, bit allocation is carried out on the basis of the magnitude of signals of each band. In accordance with this technique, the quantization noise spectrum is flat and the noise energy is minimum. However, since the auditory masking effect is not utilized, the noise actually perceived is not optimum.
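A magnitude-based allocation of this kind can be approximated by a greedy loop. This is a common textbook sketch, not necessarily the exact rule of the cited paper: it assumes that each added bit lowers a band's quantization noise power by a factor of 4 (about 6 dB), and the function name is hypothetical.

```python
def allocate_bits(band_energies, total_bits):
    """Greedily give one bit at a time to the band with the most remaining noise.

    Assumes each extra bit divides a band's noise power by 4 (about 6 dB).
    Ties go to the lowest band index.
    """
    bits = [0] * len(band_energies)
    noise = list(band_energies)
    for _ in range(total_bits):
        k = max(range(len(noise)), key=lambda i: noise[i])
        bits[k] += 1
        noise[k] /= 4.0
    return bits
```

For example, `allocate_bits([16.0, 4.0, 1.0], 3)` returns `[2, 1, 0]`, which leaves a remaining noise power of 1.0 in every band, matching the flat noise spectrum described above.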
The other technique is disclosed in M. A. Krasner, The critical band coder: digital encoding of the perceptual requirements of the auditory system, MIT, ICASSP 1980. In this technique, fixed bit allocation is carried out by utilizing the auditory masking effect and thus obtaining a necessary signal-to-noise ratio for each band. In this technique, however, since bit allocation is fixed, a satisfactory characteristic value is not obtained even when characteristics are measured with a sine-wave input.
In order to solve these problems, a high-efficiency coding device has been proposed that divides all the bits usable for bit allocation into two parts: one for a fixed bit allocation pattern predetermined for each subblock, and one for bit distribution depending upon the magnitude of signals of each block. The division ratio is made to depend upon a signal related to the input signals, so that the share given to the fixed bit allocation pattern is increased as the spectrum of the signals becomes smoother.
According to this method, in the case where the energy is concentrated at a specified spectrum as in a sine wave input, a large number of bits are allocated to a block including that spectrum, thereby enabling significant improvement in the overall signal-to-noise characteristic. Since the human auditory sense is generally sensitive to a signal having a steep spectral component, improving the signal-to-noise characteristic by such a method not only improves the measured numerical value but is also effective for improving the sound quality perceived by the auditory sense.
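The split between fixed and signal-dependent allocation might be sketched as below. The text does not say how spectral smoothness is measured, so this sketch assumes the spectral flatness measure (geometric mean over arithmetic mean of band energies) and a simple proportional split; both choices and all names are hypothetical.

```python
import math

def spectral_flatness(energies):
    """Geometric mean over arithmetic mean: near 1.0 for a smooth (flat)
    spectrum, near 0.0 when energy is concentrated in a few bands."""
    eps = 1e-12  # guards against log(0) and division by zero
    log_mean = sum(math.log(e + eps) for e in energies) / len(energies)
    arith_mean = sum(energies) / len(energies) + eps
    return math.exp(log_mean) / arith_mean

def split_bit_budget(energies, total_bits):
    """Return (fixed_bits, adaptive_bits).

    The fixed-pattern share grows as the spectrum becomes smoother;
    the proportional rule used here is an assumption of this sketch.
    """
    fixed = round(total_bits * spectral_flatness(energies))
    fixed = max(0, min(total_bits, fixed))
    return fixed, total_bits - fixed
```

With a flat input such as `[1.0, 1.0, 1.0, 1.0]` nearly the whole budget goes to the fixed pattern, while a sine-like input with one dominant band sends nearly all bits to the magnitude-dependent allocation.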
In addition to the foregoing methods, various other methods for bit allocation are proposed. Therefore, if a fine and precise model with respect to the auditory sense is realized and the capability of the coding device is improved, auditorily more efficient coding can be carried out.
For example, the present Assignee has proposed a method for separating tonal components which are particularly important in terms of the auditory sense from spectral signals and coding these tonal components separately from the other spectral components. Thus, it is possible to efficiently code audio signals at a high compression rate without generating serious deterioration in the sound quality perceived by the auditory sense.
In the case where DFT or DCT is used as a method for converting waveform signals to the spectrum, M units of independent real-number data are obtained by carrying out conversion with a time block consisting of M samples. In general, M1 samples of each of adjacent blocks are caused to overlap each other in order to reduce connection distortion between time blocks. Therefore, in DFT or DCT, M units of real-number data are quantized and coded with respect to (M - M1) samples on the average.
On the other hand, in the case where MDCT is used as a method for conversion to the spectrum, M units of independent real-number data are obtained from 2M samples having M samples caused to overlap M samples of the adjacent period. Therefore, M units of real-number data are quantized and coded with respect to M samples on the average.
In a decoding device, waveform elements obtained by inversely converting each block of codes thus obtained by using MDCT are added to each other while being caused to interfere with each other. Thus, waveform signals can be reconstituted.
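The overlap-and-add reconstruction can be illustrated with a small MDCT round trip. This is a minimal sketch assuming a sine window, an even M, a 2/M inverse scaling, and zero padding of M samples at each end; none of these choices is stated in this specification, and the function names are hypothetical. Quantization is omitted so that the cancellation of time-domain aliasing is visible.

```python
import math

def sine_window(M):
    """Window of length 2M satisfying w[n]**2 + w[n + M]**2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def mdct(block, M):
    """Transform 2M windowed samples into M real coefficients."""
    return [sum(block[n] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]

def imdct(coeffs, M):
    """Transform M coefficients into 2M time-aliased samples."""
    return [2.0 / M * sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                          for k in range(M))
            for n in range(2 * M)]

def analyze_synthesize(signal, M):
    """Code and decode a signal whose length is a multiple of M (M even)."""
    w = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M
    out = [0.0] * len(padded)
    # Blocks of 2M samples advance by M, so each sample lies in two blocks.
    for start in range(0, len(padded) - 2 * M + 1, M):
        block = [padded[start + n] * w[n] for n in range(2 * M)]
        aliased = imdct(mdct(block, M), M)
        # Window again and add: the aliasing terms of adjacent blocks cancel.
        for n in range(2 * M):
            out[start + n] += aliased[n] * w[n]
    return out[M:M + len(signal)]
```

Each block of 2M input samples yields only M coefficients, yet the overlapped sum of adjacent decoded blocks returns the original samples up to rounding, which is why M units of data per M samples suffice.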
In general, by elongating the time block for conversion, the frequency resolution of spectrum is increased and the energy is concentrated at a specified spectra
Imai Kenichi
Koike Takashi
Tsuji Minoru