Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission
Reexamination Certificate
1999-08-09
2002-09-03
Chawan, Vijay B (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S230000, C704S501000, C704S200100, C704S226000, C375S240000, C375S253000, C381S022000, C381S023000
Reexamination Certificate
active
06446037
ABSTRACT:
TECHNICAL FIELD
The present invention relates to audio coding and decoding and relates more particularly to scalable coding of audio data into a plurality of layers of a standard data channel and scalable decoding of audio data from a standard data channel.
BACKGROUND ART
Due in part to the widespread commercial success of compact disc (CD) technologies over the last two decades, sixteen bit pulse code modulation (PCM) has become an industry standard for distribution and playback of recorded audio. Over much of this time period, the audio industry touted the compact disc as providing superior sound quality to vinyl records and cassette tapes, and many people believed that little audible benefit would be obtained by increasing the resolution of audio beyond that obtainable from sixteen bit PCM.
Over the last several years, this belief has been challenged for various reasons. The dynamic range of sixteen bit PCM is too limited for noise free reproduction of all musical sounds. Subtle detail is lost when audio is quantized to sixteen bit PCM. Moreover, the belief may fail to consider the practice of reducing quantization resolutions to provide additional headroom at the cost of reducing the signal-to-noise ratio and lowering signal resolution. Due to such concerns, there currently is strong commercial demand for audio processes that provide improved signal resolution relative to sixteen bit PCM.
There currently is also strong commercial demand for multi-channel audio. Multi-channel audio provides multiple channels of audio which can improve spatialization of reproduced sound relative to traditional mono and stereo techniques. Common systems provide for separate left and right channels both in front of and behind a listening field, and may also provide for a center channel and subwoofer channel. Recent modifications have provided numerous audio channels surrounding a listening field for reproducing or synthesizing spatial separation of different types of audio data.
Perceptual coding is one variety of techniques for improving the perceived resolution of an audio signal relative to PCM signals of comparable bit rate. Perceptual coding can reduce the bit rate of an encoded signal while preserving the subjective quality of the audio recovered from the encoded signal by removing information that is deemed to be irrelevant to the preservation of that subjective quality. This can be done by splitting an audio signal into frequency subband signals and quantizing each subband signal at a quantizing resolution that introduces a level of quantization noise that is low enough to be masked by the decoded signal itself. Within the constraints of a given bit rate, an increase in perceived signal resolution relative to a first PCM signal of given resolution can be achieved by perceptually coding a second PCM signal of higher resolution to reduce the bit rate of the encoded signal to essentially that of the first PCM signal. The coded version of the second PCM signal may then be used in place of the first PCM signal and decoded at the time of playback.
One example of perceptual coding is embodied in devices that conform to the public ATSC AC-3 bitstream specification as specified in the Advanced Television Standards Committee (ATSC) A52 document (1994). This particular perceptual coding technique as well as other perceptual coding techniques are embodied in various versions of Dolby Digital® coders and decoders. These coders and decoders are commercially available from Dolby Laboratories, Inc. of San Francisco, California. Another example of a perceptual coding technique is embodied in devices that conform to the MPEG-1 audio coding standard ISO 11172-3 (1993).
One disadvantage of conventional perceptual coding techniques is that the bit rate of the perceptually coded signal for a given level of subjective quality may exceed the available data capacity of communication channels and storage media. For example, the perceptual coding of a twenty-four bit PCM audio signal may yield a perceptually coded signal that requires more data capacity than is provided by a sixteen bit wide data channel. Attempts to reduce the bit rate of the encoded signal to a lower level may degrade the subjective quality of audio that can be recovered from the encoded signal. Another disadvantage of conventional perceptual coding techniques is that they do not support the decoding of a single perceptually coded signal to recover an audio signal at more than one level of subjective quality.
Scalable coding is one technique that can provide a range of decoding quality. Scalable coding uses the data in one or more lower resolution codings together with augmentation data to supply a higher resolution coding of an audio signal. Lower resolution codings and the augmentation data may be supplied in a plurality of layers. There is also strong need for scalable perceptual coding, and particularly, for scalable perceptual coding that is backward compatible at the decoding stage with commercially available sixteen bit digital signal transport or storage means.
DISCLOSURE OF INVENTION
Scalable audio coding is disclosed that supports coding of audio data into a core layer of a data channel in response to a first desired noise spectrum. The first desired noise spectrum preferably is established according to psychoacoustic and data capacity criteria. Augmentation data may be coded into one or more augmentation layers of the data channel in response to additional desired noise spectra. Alternative criteria such as conventional uniform quantization may be utilized for coding augmentation data.
Systems and methods for decoding just a core layer of a data channel are disclosed. Systems and methods for decoding both a core layer and one or more augmentation layers of a data channel are also disclosed, and these provide improved audio quality relative to that obtained by decoding just the core layer.
Some embodiments of the present invention are applied to subband signals. As is understood in the art, subband signals may be generated in numerous ways including the application of digital filters such as the quadrature mirror filter, and by a wide variety of time-domain to frequency-domain transforms and wavelet transforms.
Data channels employed by the present invention preferably have a sixteen bit wide core layer and two four bit wide augmentation layers conforming to standard AES3 which is published by the Audio Engineering Society (AES). This standard is also known as standard ANSI S4.40 by the American National Standard Institute (ANSI). Such a data channel is referred to herein as a standard AES3 data channel.
Scalable audio coding and decoding according to various aspects of the present invention can be implemented by discrete logic components, one or more ASICs, program-controlled processors, and by other commercially available components. The manner in which these components are implemented is not important to the present invention. Preferred embodiments use program-controlled processors, such as those in the DSP563xx line of digital signal processors from Motorola. Programs for such implementations may include instructions conveyed by machine readable media, such as, baseband or modulated communication paths and storage media. Communication paths preferably are in the spectrum from supersonic to ultraviolet frequencies. Essentially any magnetic or optical recording technology may be used as storage media, including magnetic tape, magnetic disk, and optical disc.
According to various aspects of the present invention, audio information coded according to the present invention can be conveyed by such machine readable media to routers, decoders, and other processors, and may be stored by such machine readable media for routing, decoding, or other processing at later times. In preferred embodiments, audio information is coded according to the present invention, and stored on machine readable media, such as compact disc. Such data preferably is formatted in accordance with various frame and/or other disclosed data structures. A decoder can
Fielder Louis Dunn
Vernon Stephen Decker
Chawan Vijay B
Dolby Laboratories Licensing Corporation
LandOfFree
Scalable coding method for high quality audio does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Scalable coding method for high quality audio, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Scalable coding method for high quality audio will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2873293