Reexamination Certificate
1999-06-15
2004-04-20
Dorvil, Richemond (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S500000, C704S200100
active
06725192
ABSTRACT:
BACKGROUND OF THE INVENTION
(1) Field of the Invention
The present invention relates to an audio coding and quantization method which is appropriate for various applications including the fields of audio signal storage, communication and broadcasting applications.
(2) Description of the Related Art
Digital representations of analog waveforms introduce some kind of distortion. A basic problem in the design of source coders is to achieve a given acceptable level of distortion with the smallest possible encoding bit rate. To reach this goal, the encoding algorithm must be adapted both to the changing statistics of the source signal and to auditory perception. Auditory perception is based on critical band analyses in the human ear. The power spectra are represented not on a linear frequency scale but on frequency bands, called critical bands, with bandwidths on the order of 100 Hz below 500 Hz and with increasing bandwidths (up to 500 Hz) at high signal frequencies. Within critical bands, the intensities of individual tones are summed by the ear. Up to a bandwidth of 20,000 Hz, 26 critical bands have to be taken into account. Audio coders that exploit auditory perception must be based on critical-band structured signal processing.
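The mapping from linear frequency to critical bands described above can be sketched as follows. This is only an illustration: the patent text does not give a formula, so the sketch assumes the widely used Zwicker-Terhardt approximation of the critical-band rate (in Bark), and the function names hz_to_bark and critical_band_index are hypothetical.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical-band rate in Bark for a frequency in Hz.

    Assumption: Zwicker-Terhardt closed-form approximation, not a
    formula taken from the patent.
    """
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def critical_band_index(freq_hz: float) -> int:
    """Zero-based index of the critical band containing freq_hz."""
    return int(hz_to_bark(freq_hz))


if __name__ == "__main__":
    for f in (100.0, 500.0, 1000.0, 4000.0, 20000.0):
        print(f"{f:8.0f} Hz -> {hz_to_bark(f):5.2f} Bark, "
              f"band {critical_band_index(f)}")
```

With this approximation 1 kHz falls at roughly 8.5 Bark, and 20 kHz falls between 24 and 25 Bark. The exact number of bands counted up to 20 kHz depends on the band table that is used, so the result need not match the count of 26 stated above.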
Auditory masking describes the effect that a low-level audio signal (called the maskee) can become inaudible when a louder signal (called the masker) occurs simultaneously. The effect of simultaneous masking and temporal masking can be exploited in audio coding by transmitting only those details of the signal which are perceptible by ear. Such coders provide high coding quality without providing high signal-to-noise ratios.
Hereinafter, the sound pressure level below which a signal is inaudible due to the masker is called the masking threshold. It is also known as the threshold of just noticeable distortion in the context of source coding.
Generally, audio signals in the vicinity of 4 kHz are very perceptible by the human ear regardless of whether the masker is present. Hereinafter, the lower limit of a sound pressure level that is audible to the human ear is called an absolute hearing threshold. It is also known as a threshold in quiet.
FIG. 6 shows the relationship between the absolute hearing threshold and the masking threshold in the spectral distribution of an audio signal.
Without a masker, an audio signal (A) (indicated by the solid line in FIG. 6) is inaudible if its sound pressure level is below the absolute hearing threshold (C) (indicated by the two-dot chain line in FIG. 6), which depends on frequency. A sound pressure level of 0 dB corresponds to a sound pressure of 0.02 mN/m². In the presence of a masker, the masking threshold (B) (indicated by the dotted line in FIG. 6) can be measured, below which any signal will not be audible. The masking threshold depends on the sound pressure level, the frequency of the masker, and on the characteristics of masker and maskee.
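As an illustration of the 0 dB reference and the frequency-dependent threshold in quiet, the following minimal sketch converts a sound pressure to a sound pressure level and evaluates an approximate threshold curve. The patent gives no formula for curve (C); the sketch assumes Terhardt's commonly cited closed-form approximation, and all function names are hypothetical.

```python
import math

# Reference pressure for 0 dB SPL: 20 micropascal, i.e. 0.02 mN/m^2.
P_REF_PA = 20e-6


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB relative to 0.02 mN/m^2."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate absolute hearing threshold in dB SPL.

    Assumption: Terhardt's approximation, not a curve from the patent.
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_audible_in_quiet(level_db: float, freq_hz: float) -> bool:
    """True if a tone at level_db exceeds the threshold in quiet."""
    return level_db > threshold_in_quiet_db(freq_hz)


if __name__ == "__main__":
    for f in (100.0, 1000.0, 4000.0, 16000.0):
        print(f"{f:7.0f} Hz: threshold {threshold_in_quiet_db(f):6.1f} dB SPL")
```

Under this approximation the threshold is lowest in the region of 3 to 4 kHz, which is consistent with the statement that signals near 4 kHz are especially perceptible.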
In addition to simultaneous masking of one sound by another one occurring at the same time, temporal masking occurs when two sounds appear within a small interval of time; the stronger one masks the weaker one, regardless of whether the latter one occurs before or after it. Temporal masking can be used to mask pre-echoes caused by the spreading of a sudden large quantization error over the actual coding block.
The effect of simultaneous masking and temporal masking can be exploited in audio coding by transmitting only those details of the signal which are perceptible by ear. This is equivalent to a bit allocation by which the necessary bits for encoding the bitstream are allocated to only the portions of the audio signal (A) which are above the masking threshold (B) and the absolute hearing threshold (C). In the audio coding, the audio signal is divided into a number of spectral subband components (D) (indicated by the one-dot chain lines in FIG. 6), and each component is quantized, whereby the number of quantizer levels for each component is obtained from the bit allocation.
The width of each subband component (D) is equivalent to the bandwidth of the audio signal. In each subband, a signal component whose intensity is below a certain lower limit will not be audible. As long as the difference in intensity between the source signal and the decoded signal is below the lower limit, the decoded signal will be indistinguishable from the source signal. Hereinafter, the lower limit of a sound pressure level for each subband is called an allowed distortion level. In the context of audio coding, if the level of a quantization error produced by the quantization of an audio signal is below the allowed distortion level, the audio coding can provide high coding quality without providing high signal-to-noise ratios. The bit allocation for each subband component (D), as shown in FIG. 6, is equivalent to controlling the quantization of the audio signal such that the quantization error level for each subband is exactly equal to the allowed distortion level.
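The relationship between the allowed distortion level and the bit allocation can be sketched in a few lines. This is a simplified illustration, not the method claimed in the patent: it assumes that the allowed distortion level of a subband is the higher of its masking threshold and its absolute hearing threshold, and that each quantizer bit lowers the quantization error by about 6.02 dB (the usual rule of thumb for a uniform quantizer). All names are hypothetical.

```python
import math
from typing import List, Tuple

# Assumption: approximate SNR gain per bit of a uniform quantizer.
DB_PER_BIT = 6.02


def allowed_distortion_db(masking_db: float, quiet_db: float) -> float:
    """Allowed distortion level: noise below either threshold is inaudible."""
    return max(masking_db, quiet_db)


def bits_for_subband(signal_db: float, masking_db: float,
                     quiet_db: float) -> int:
    """Bits needed to push quantization error down to the allowed level."""
    margin_db = signal_db - allowed_distortion_db(masking_db, quiet_db)
    if margin_db <= 0.0:
        return 0  # the whole subband is inaudible, so no bits are spent
    return math.ceil(margin_db / DB_PER_BIT)


def allocate(subbands: List[Tuple[float, float, float]]) -> List[int]:
    """Allocate bits for (signal_db, masking_db, quiet_db) triples."""
    return [bits_for_subband(s, m, q) for s, m, q in subbands]


if __name__ == "__main__":
    # Illustrative levels only, not measured data.
    print(allocate([(60.0, 40.0, 5.0), (30.0, 40.0, 5.0), (10.0, -100.0, 20.0)]))
```

Subbands whose signal level lies below the allowed distortion level receive no bits, which corresponds to transmitting only the perceptible portions of the signal.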
As disclosed in Japanese Laid-Open Patent Application No. 7-154266, an audio coding and quantization algorithm for digital audio signals is known. In the audio coding method of the above publication, a digital audio signal is converted into blocks of spectral data, and each block is divided into units of normalized coefficients. An upper limit of the number of bits allocated per block is fixed. The bit allocation is controlled by using the fixed upper limit. For the blocks with the number of needed bits that exceeds the upper limit of the number of allocated bits, the normalized coefficients of the related unit are forcefully corrected so that the numbers of needed bits for all the blocks are below the upper limit.
International Standard ISO/IEC 13818-7 provides a generic audio coding and quantization algorithm for digital audio signals. In the audio coding and quantization method of this standard, it is difficult to speedily carry out an iterative process that converges when the total bit count is within some interval surrounding the allocated bit count, while preventing the degradation of coding quality due to nonconvergence. If both a bit rate requirement and a masking requirement are not finally met, it is likely to cause the degradation of coding quality. Further, in the above-described method of International Standard ISO/IEC 13818-7, when the check of the masking requirement is done, the quantization error levels of all the subbands are not always less than the allowed distortion levels. Even if both the bit rate requirement and the masking requirement are finally met, it requires a relatively large computing time until the convergence is reached. As long as the masking requirement is not met, the bit allocation control must be repeated many times. The repeated bit allocation control includes some redundant processes.
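The kind of iterative process discussed above, in which a bit rate requirement and a masking requirement are checked repeatedly, can be illustrated with a toy two-loop sketch. It is not the procedure of ISO/IEC 13818-7 and not the method of this patent: the bit cost model, the step and gain increments, and all function names are assumptions made for the example. The iteration cap shows how non-convergence can leave the masking requirement unmet.

```python
from typing import List, Tuple

STEP_FACTOR = 2.0 ** 0.25  # assumed increment for step size and band gain


def quantize(band: List[float], step: float, gain: float) -> List[int]:
    return [round(x * gain / step) for x in band]


def count_bits(quantized: List[List[int]]) -> int:
    # Toy cost model: magnitude bits plus a sign bit; zeros cost nothing.
    return sum(abs(q).bit_length() + 1 for band in quantized for q in band if q)


def distortion(band: List[float], q_band: List[int],
               step: float, gain: float) -> float:
    # Squared error between the source and the dequantized coefficients.
    return sum((x - q * step / gain) ** 2 for x, q in zip(band, q_band))


def rate_distortion_loop(bands: List[List[float]], allowed: List[float],
                         bit_budget: int, max_outer: int = 32
                         ) -> Tuple[List[List[int]], int, bool]:
    """Return (quantized bands, bits used, whether both requirements are met)."""
    gains = [1.0] * len(bands)
    quantized: List[List[int]] = []
    bits = 0
    for _ in range(max_outer):
        # Inner loop (bit rate requirement): coarsen the step until it fits.
        step = 1.0
        while True:
            quantized = [quantize(b, step, g) for b, g in zip(bands, gains)]
            bits = count_bits(quantized)
            if bits <= bit_budget:
                break
            step *= STEP_FACTOR
        # Outer loop (masking requirement): find bands with audible error.
        noisy = [i for i, b in enumerate(bands)
                 if distortion(b, quantized[i], step, gains[i]) > allowed[i]]
        if not noisy:
            return quantized, bits, True
        for i in noisy:
            gains[i] *= STEP_FACTOR  # amplify so the band is quantized finer
    return quantized, bits, False  # iteration cap reached without convergence
```

Each outer pass repeats the full inner search from the initial step size, which is one example of the redundant work that such repeated bit allocation control can involve.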
In the conventional method of the above publication (Japanese Laid-Open Patent Application No. 7-154266), the same problem remains unresolved. It is difficult to speedily carry out the iterative process that converges when the total bit count is within some interval surrounding the allocated bit count, while preventing the degradation of coding quality due to nonconvergence.
SUMMARY OF THE INVENTION
An object of the present invention is to provide an improved audio coding and quantization method in which the above-described problems are eliminated.
Another object of the present invention is to provide an audio coding and quantization method which is effective in speedily carrying out an iterative process that converges when the total bit count is within some interval surrounding the allocated bit count, while preventing the degradation of coding quality due to nonconvergence.
Still another object of the present invention is to provide an audio coding and quantization method which is effective in providing high coding quality without providing high signal-to-noise ratios.
The above-mentioned objects of the present invention are achieved by an audio coding and quantization method which
Dickstein, Shapiro, Morin & Oshinsky, LLP
Dorvil Richemond
Opsasnick Michael N.
Ricoh & Company, Ltd.