Reexamination Certificate
1998-03-02
2001-07-17
Dorvil, Richemond (Department: 2741)
Data processing: speech signal processing, linguistics, language
Audio signal bandwidth compression or expansion
C704S230000, C704S229000
active
06263312
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to the field of signal processing. More specifically, the invention relates to the field of audio data compression and decompression utilizing subband decomposition (audio is used herein to refer to one or more types of sound such as speech, music, etc.).
2. Background Information
To allow typical signal/data processing devices to process (e.g., store, transmit, etc.) audio signals efficiently, various techniques have been developed to reduce or compress the amount of data required to represent an audio signal. In applications wherein real-time processing is desirable (e.g., telephone conferencing over a computer network, digital (wireless) communications, multimedia over a communications medium, etc.), such compression techniques may be an important consideration, given limited processing bandwidth and storage resources.
In typical audio compression systems, the following steps are generally performed: (1) a segment or frame of an audio signal is transformed into a frequency domain; (2) the transform coefficients representing the frequency domain, or a portion thereof, are quantized into discrete values; and (3) the quantized values are converted (or coded) into a binary format. The encoded/compressed data can be output, stored, transmitted, and/or decoded/decompressed.
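The three steps above can be sketched in a few lines. This is only an illustrative sketch, not the method of the invention: the choice of an orthonormal DCT, the uniform quantizer, the 8-bit signed packing, and the names `dct`, `idct`, `encode_frame`, `decode_frame`, and `step` are all assumptions made for the example.

```python
import math
import struct


def dct(frame):
    """Orthonormal DCT-II of a list of samples (step 1: transform)."""
    n = len(frame)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(frame))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def encode_frame(frame, step):
    """Transform, quantize, and code one frame into bytes."""
    coeffs = dct(frame)                                   # (1) transform
    # (2) uniform quantization, clamped to the signed 8-bit range (assumption)
    levels = [max(-128, min(127, round(c / step))) for c in coeffs]
    return struct.pack(f"{len(levels)}b", *levels)        # (3) binary coding


def decode_frame(payload, step):
    """Unpack the bytes, dequantize, and inverse transform."""
    levels = struct.unpack(f"{len(payload)}b", payload)
    return idct([q * step for q in levels])
```

A frame of 32 samples becomes 32 bytes here; a practical coder would follow the quantizer with an entropy coder rather than fixed-width packing.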
To achieve relatively high compression/low bit rates (e.g., 8 to 16 kbps) for various types of audio signals, some compression techniques (e.g., CELP, ADPCM, etc.) limit the number of components in a segment (or frame) of an audio signal which is to be compressed. Unfortunately, such techniques typically do not take into account relatively substantial components of an audio signal. Thus, such techniques typically result in a relatively poor quality synthesized audio signal due to the loss of information.
One method of audio compression that allows relatively high quality compression/decompression involves transform coding. Transform coding typically involves transforming a frame of an input audio signal into a set of transform coefficients, using a transform such as the discrete cosine transform (DCT), the modified discrete cosine transform (MDCT), the Fourier transform, the Fast Fourier Transform (FFT), etc. Next, a subset of the set of transform coefficients, which typically represents most of the energy of the input audio signal (e.g., over 90%), is quantized and encoded using any number of well-known coding techniques. Transform compression techniques, such as DCT, generally provide a relatively high quality synthesized signal, since a relatively high number of spectral components of an input audio signal are taken into consideration.
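As a hedged illustration of the coefficient-selection step, the sketch below picks the smallest set of coefficients whose cumulative energy reaches a target fraction of the frame energy. The function name `select_dominant`, the greedy largest-first ordering, and the default of 0.9 are assumptions for the example; the source states only that the retained subset typically carries most of the energy.

```python
def select_dominant(coeffs, energy_fraction=0.9):
    """Return sorted indexes of the largest-magnitude coefficients whose
    combined energy reaches energy_fraction of the total frame energy."""
    total = sum(c * c for c in coeffs)
    # Visit coefficients from highest to lowest energy.
    order = sorted(range(len(coeffs)),
                   key=lambda k: coeffs[k] ** 2, reverse=True)
    kept, accumulated = [], 0.0
    for k in order:
        if accumulated >= energy_fraction * total:
            break
        kept.append(k)
        accumulated += coeffs[k] ** 2
    return sorted(kept)
```

Only the coefficients at the returned indexes would then be quantized and encoded; the rest are discarded, which is the source of the quality loss discussed in the next paragraph.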
Past transform audio compression techniques may have some limitations. First, transform techniques typically perform a relatively large amount of computation, and may also use relatively high bit rates (e.g., 32 kbps), which may adversely affect compression ratios. Second, while the selected subset of coefficients may cumulatively contain approximately 90% of the energy of an input audio signal, the discarded coefficients may be needed for relatively high quality reproduction. However, a substantial number of bits may be required to transform encode all of the coefficients representing a frame of the input audio signal. Finally, an audible “echo” or other type of distortion may result in an audio signal that is synthesized from transform coding techniques. One cause of echo is the limited ability of transform coding techniques to approximate satisfactorily a fast-varying signal (e.g., a drum “attack”). As a result, quantization error for one or a few transform coefficients may spread over and adversely affect an entire frame, or portion thereof, of a transform encoded audio signal.
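The spreading of quantization error can be reproduced with a small experiment. The sketch below is an assumption-laden illustration, not the patent's procedure: it uses an orthonormal DCT, a uniform quantizer with a hypothetical step size `step`, and a synthetic frame that is silent until a hypothetical attack index. It measures the energy that appears in the synthesized signal before the attack, where the input is exactly zero.

```python
import math


def _scale(k, n):
    """Orthonormal DCT scaling factor for coefficient k of n."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(frame):
    """Orthonormal DCT-II."""
    n = len(frame)
    return [_scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(frame))
            for k in range(n)]


def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    return [sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def pre_echo_energy(frame, attack, step):
    """Energy of the synthesized signal in samples [0, attack) after the
    frame's DCT coefficients are uniformly quantized with the given step."""
    quantized = [round(c / step) * step for c in dct(frame)]
    synthesized = idct(quantized)
    return sum(s * s for s in synthesized[:attack])


# Silent for 48 samples, then a decaying burst (a crude stand-in for an attack).
frame = [0.0] * 48 + [math.exp(-0.1 * i) * math.cos(1.3 * i) for i in range(16)]
print(pre_echo_energy(frame, attack=48, step=0.1))
```

Because each DCT basis function spans the whole frame, an error in any single coefficient contributes to every sample, including the silent ones before the attack.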
To illustrate distortion, such as echo, in a transform encoded synthesized signal, reference is made to FIGS. 1A and 1B. FIG. 1A is a graphical representation of a frame of an input (i.e., original/unprocessed) audio signal. FIG. 1B depicts a synthesized signal that is generated by transform encoding and synthesizing the input signal of FIG. 1A. In FIGS. 1A and 1B, the horizontal (x) axis represents time, while the vertical (y) axis represents amplitude. As shown, the synthesized signal contains relatively substantial distortion (e.g., echo) from the time period 0 to 175 (sometimes referred to as pre-echo, since the distortion precedes the signal (or harmonic) “attack” at time=~175) and 375 to 475 (sometimes referred to as post-echo, since the distortion follows the signal “attack” at time=~175), relative to the corresponding input signal of FIG. 1A.
While some past systems, such as ISO/MPEG audio codecs, have employed techniques to diminish distortion due to transform coding, such as pre-echo, such techniques typically rely on an increased number of bits to encode the input signal. As such, compression ratios may be diminished as a result of past distortion reduction techniques.
Thus, what is desired is a system that achieves relatively high quality audio data compression, while achieving relatively low bit rates (e.g., high compression ratios). It is further desirable to detect and reduce distortion (e.g., noise, echo, etc.) that may result, for example, from generating a transform encoded synthesized signal, while providing a relatively low bit rate.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus to achieve relatively high quality audio data compression/decompression, while achieving relatively low bit rates (e.g., high compression ratios). According to one aspect of the invention, a residual signal is subband decomposed and adaptively quantized and encoded to capture frequency information that may provide higher quality compression and decompression relative to transform encoding techniques. According to a second aspect of the invention, an input audio signal is compared to an encoded version of that input audio signal to detect and reduce, as necessary, distortion in the encoded signal or portions thereof.
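The two aspects can be pictured with a rough sketch. Everything concrete below is an assumption, since this summary does not specify the filter bank, the adaptive quantizer, or the distortion criterion: a one-level Haar split stands in for the subband decomposition, the residual is taken as the sample-wise difference between the input and a decoded version, and a block is flagged when its error energy exceeds a hypothetical `ratio` of the input energy plus a small `floor`. Adaptive quantization of the subbands is not shown.

```python
import math


def residual(original, decoded):
    """Sample-wise difference between the input and its decoded version."""
    return [o - d for o, d in zip(original, decoded)]


def haar_split(signal):
    """One-level two-band split (Haar), a stand-in for the unspecified
    subband decomposition. Returns (low_band, high_band)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    low = [(signal[i] + signal[i + 1]) * s for i in pairs]
    high = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return low, high


def flag_distorted_blocks(original, decoded, block, ratio=0.01, floor=1e-6):
    """Return indexes of blocks whose error energy exceeds
    ratio * input energy + floor (both thresholds are hypothetical)."""
    error = residual(original, decoded)
    flagged = []
    for start in range(0, len(original), block):
        err_energy = sum(e * e for e in error[start:start + block])
        sig_energy = sum(x * x for x in original[start:start + block])
        if err_energy > ratio * sig_energy + floor:
            flagged.append(start // block)
    return flagged
```

The additive `floor` lets the check fire on blocks where the input is silent, which is where pre-echo is most audible; a purely relative threshold would be undefined there.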
REFERENCES:
patent: 5451954 (1995-09-01), Davis et al.
patent: 5602961 (1997-02-01), Kolesnik et al.
patent: 5627938 (1997-05-01), Johnston
patent: 5632003 (1997-05-01), Davidson et al.
patent: 5634082 (1997-05-01), Shimoyoshi et al.
patent: 5659659 (1997-08-01), Kolesnik et al.
patent: 5661822 (1997-08-01), Knowles et al.
patent: 5819215 (1998-10-01), Dobson et al.
patent: 5832443 (1999-11-01), Kolesnik et al.
patent: 5845243 (1998-12-01), Smart et al.
patent: 5896176 (1999-04-01), Das et al.
patent: 5909518 (1999-06-01), Chui
Boland and Deriche, “New Results In Low Bitrate Audio Coding Using a Combined Harmonic-Wavelet Representation,” 1997 IEEE Int'l Conf on Acoustics, Speech and Signal Processing, pp. 351-354 (Apr. 1997).
K. Brandenburg et al., “ASPEC: Adaptive Spectral Entropy Coding of High Quality Music Signals”, AES Preprint 301, 90th Convention, Paris, Feb. 1991.
K. Tsutsui et al., “ATRAC: Adaptive Transform Acoustic Coding For Minidisc”, AES Preprint 3456, 93rd Conv. Audio Eng. Soc., Oct. 1992.
K. Brandenburg, G. Stoll, “The ISO/MPEG-Audio Codec: A Generic Standard for Coding of High Quality Digital Audio”, AES Preprint 3336, 92nd Convention, Vienna, Mar. 1992.
M.W. Marcellin, T.R. Fischer, “Trellis Coded Quantization of Memoryless and Gauss-Markov Sources”, IEEE Transactions on Communications, vol. 38, No. 1, Jan. 1990.
T. Berger, “Optimum Quantizers and Permutation Codes”, IEEE Transactions on Information Theory, vol. IT-18, No. 6, Nov. 1972.
International Conference on Acoustics, Speech, and Signal Processing, ICASSP-97. Boland et al., “New results in low bitrate audio coding using a combined harmonic-wavelet representation,” vol. I, pp. 351-354, Apr. 1997.
Bocharova Irina E.
Kolesnik Victor D.
Kudryashov Boris D.
Ovsyannikov Eugene
Trofimov Andrei N.
Alaris Inc.
Blakely, Sokoloff, Taylor & Zafman LLP
Dorvil Richemond