Patent
1996-08-21
2000-08-08
Zele, Krista
Data processing: speech signal processing, linguistics, language
Audio signal time compression or expansion
G10L 21/04
active
061014753
DESCRIPTION:
BRIEF SUMMARY
FIELD OF THE INVENTION
The present invention relates to a method for the cascaded coding and decoding of audio data.
In particular, the present invention relates to a method for the cascaded coding and decoding of audio data that is designed to improve the quality of the sound signal generated from the audio data after cascaded audio coding and decoding.
BACKGROUND OF THE INVENTION
In the cascaded coding and decoding of audio data, each codec stage of the cascade processes its input in data blocks, each containing a certain number of input data, and forms the spectral components of the short-time spectrum associated with each data block. A coded signal is then formed for the data block by quantizing and coding these spectral components, with a psycho-acoustic model determining the bit distribution among them. The decoding part of the codec stage then decodes the coded signal to obtain time output data.
In the last few years, considerable advances have been made in coding sound signals with the least possible loss of sound quality. Such modern coding methods exploit the threshold of hearing of the human ear: they try to shape the quantization noise generated by coding to fit the corresponding perceptual threshold so that, despite considerable data reduction, there is no audible deterioration. Coding and decoding devices that operate on this principle are also known as "perceptual codecs".
Such methods are suitable for many applications. They are advantageous practically anywhere that high-quality sound signals are to be stored or transmitted and the available capacity, e.g. the storage volume or the channel bandwidth, is to be used as effectively as possible.
Examples of such uses are the transmission of music over the ISDN telephone network, the storage of audio announcements or so-called "jingles" on flash ROM storage cards, and the storage of music in music recorders that use the so-called MiniDisc or the DCC method.
Examples of coding methods that use the above principle are those employed by the company Dolby Inc. under the names AC-2 and AC-3, the ATRAC method of the company Sony Corp., and the audio coding methods based on the ISO/MPEG standard (IS 11172-3), Layers 1, 2 and 3.
All these processes are block-oriented: in each case they analyse a certain number of time-domain input audio data, or audio sampling values (in other words, a "data block"), and determine for each block the spectral components present in the short-time spectrum associated with that block. The spectral components are then quantized and coded, with the coder using a psycho-acoustic model to analyse the short-time spectrum and thereby determine the bit distribution for the individual spectral components.
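The block-oriented processing described above can be sketched roughly as follows. This is only an illustrative sketch, not the method of any of the named codecs: the function names are hypothetical, a plain DFT stands in for the filter banks or transforms real codecs use, and `toy_bit_allocation` is a crude placeholder (one bit fewer per roughly 6 dB below the strongest component) for a genuine psycho-acoustic model.

```python
import cmath
import math

def split_into_blocks(samples, block_size):
    """Split the time samples into consecutive, non-overlapping data blocks."""
    return [samples[i:i + block_size]
            for i in range(0, len(samples) - block_size + 1, block_size)]

def short_time_spectrum(block):
    """Plain DFT of one real-valued block, bins 0 .. n/2 (assumption: no windowing)."""
    n = len(block)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block))
            for k in range(n // 2 + 1)]

def toy_bit_allocation(magnitudes, max_bits=8):
    """Placeholder for a psycho-acoustic model: the strongest component gets
    max_bits, and each ~6 dB below it costs one bit (never fewer than 0)."""
    peak = max(magnitudes)
    bits = []
    for m in magnitudes:
        if peak <= 0 or m <= 0:
            bits.append(0)
            continue
        level_db = 20 * math.log10(m / peak)  # 0 dB for the strongest component
        bits.append(max(0, max_bits + math.floor(level_db / 6.02)))
    return bits

def quantize(value, bits, full_scale):
    """Uniform quantizer with 2**bits levels over [-full_scale, full_scale).
    A component that receives 0 bits is dropped entirely."""
    if bits == 0:
        return 0.0
    levels = 2 ** bits
    step = 2 * full_scale / levels
    index = round(value / step)
    index = max(-(levels // 2), min(levels // 2 - 1, index))  # clamp to valid levels
    return index * step

def code_block(block, max_bits=8):
    """Analyse one data block, allocate bits, and quantize its spectral components."""
    spectrum = short_time_spectrum(block)
    magnitudes = [abs(c) for c in spectrum]
    bits = toy_bit_allocation(magnitudes, max_bits)
    peak = max(magnitudes)
    coded = [complex(quantize(c.real, b, peak), quantize(c.imag, b, peak))
             for c, b in zip(spectrum, bits)]
    return bits, coded
```

Calling `code_block` on each block returned by `split_into_blocks` yields, per block, the bit distribution and the quantized spectral components; a decoder would invert the transform to recover time output data.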
To summarize, this method can also be called "perceptual noise shaping": the noise introduced by the quantization process is adapted to the perceptual threshold, and the coder seeks to maintain a safe distance (the "noise-to-mask ratio", NMR) from the estimated perceptual threshold.
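As a rough illustration of the noise-to-mask ratio, the sketch below assumes the commonly used definition of NMR as the ratio of quantization noise power to the estimated masking threshold power, expressed in dB, so that negative values mean the noise lies below the threshold. The function names and the `safety_margin_db` parameter are hypothetical and are not taken from the patent text.

```python
import math

def noise_to_mask_ratio_db(noise_power, mask_threshold_power):
    """NMR in dB (assumed definition): negative when the noise is below the threshold."""
    return 10 * math.log10(noise_power / mask_threshold_power)

def is_noise_masked(noise_power, mask_threshold_power, safety_margin_db=0.0):
    """True if the noise stays at least safety_margin_db below the estimated threshold."""
    return noise_to_mask_ratio_db(noise_power, mask_threshold_power) <= -safety_margin_db
```

For instance, noise that is 20 dB below the threshold passes a 6 dB safety margin, whereas noise exactly at the threshold does not.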
In known methods for the coding and decoding of audio data, the sound quality of the audio signal on the output side deteriorates as the number of codec stages increases.
EP-A-0420745 describes a coding device for generating digital audio signals in which the bandwidth increases progressively for higher frequency ranges of the digital signals, and the coded signals for the various frequency ranges are formed in such a way that the number of sampled values within a block increases for higher frequency ranges. The signals are quantized by assigning a certain number of bits to each of the bands.
The technical publication Alta Frequenza, Vol. XLVI, No. 8, August 1977, Milan, pp. 362-364 discusses the signal-to-noise ratio occurring in cascaded adaptive differential pulse code modulation (ADPCM) codecs. The signal deterioration caused by the multiple coding and decoding processes appears as a monotonically decreasing series.
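To make the idea of a monotonically decreasing series concrete, the following sketch uses a deliberately simplified model that is an assumption of this example and not the analysis from the cited publication: every codec stage is taken to add the same amount of noise power, uncorrelated with the noise of the other stages, so the noise powers simply add up. The function name `cascaded_snr_db` is hypothetical.

```python
import math

def cascaded_snr_db(single_stage_snr_db, stages):
    """SNR after `stages` identical codec stages, assuming equal, uncorrelated
    noise per stage (signal power normalized to 1)."""
    if stages < 1:
        raise ValueError("stages must be at least 1")
    noise_per_stage = 10 ** (-single_stage_snr_db / 10)
    total_noise = stages * noise_per_stage  # uncorrelated noise powers add
    return -10 * math.log10(total_noise)

# SNR for a cascade of one to five stages, each with 30 dB on its own
series = [cascaded_snr_db(30.0, n) for n in range(1, 6)]
```

Under this model the SNR drops by about 3 dB when the number of stages doubles. Real perceptual codecs can degrade faster, because noise that was just masked after one stage may become audible once several stages have accumulated.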
SUMMARY OF THE INVENTION
The present invention furnishe
Brandenburg Karl-Heinz
Eberlein Ernst
Gerhauser Heinz
Keyhl Michael
Popp Harald
Fraunhofer-Gesellschaft zur Forderung der Angewandten Forschung
Opsasnick Michael N.
Zele Krista