Patent
1999-04-27
2000-05-02
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
704/205, 704/211, 704/221, G10L 19/02
active
060583614
DESCRIPTION:
BRIEF SUMMARY
FIELD OF INVENTION
The present invention relates to a system for coding and decoding a signal, especially a digitized audio signal. Such systems are used for low bit-rate transmission of sound signals where the coding/decoding delay must be kept as low as possible, a constraint imposed, for example, by the return of a control voice.
BACKGROUND OF THE INVENTION
When digitized signals are transmitted, they are digitally coded in the transmitter, then decoded in a receiver for reproduction. The present invention addresses the conflict between, on the one hand, the pursuit of transmission quality, which for a given bit rate generally entails a relatively long coding and decoding delay, and, on the other hand, the requirement in some applications that this delay be short.
In the present description, the coding/decoding delay is the time that separates the input of a sample into the coding device from the output of the corresponding sample at the decoding device. So that this definition does not depend on a particular implementation of the coding process or on the structure of the circuits performing it, the computations carried out during these processes are considered infinitely fast in the coder as well as in the decoder. The coding/decoding delay therefore involves only parameters such as the time needed to acquire rasters (frames) of the digital signal, the delay imposed by a filter bank, and/or the time corresponding to a multiplexing of the samples.
In the case of a transform-type coding device, this delay will exceed the duration of one coded raster plus the delay introduced by the transform. In the case of a low-delay coding device of the LD-CELP type, such as that described by J. H. Chen et al in the article titled "A low delay CELP coder for the CCITT 16 kb/s speech coding standard", published in IEEE J. Sel. Areas Commun., vol. 10, pp. 830-849, the delay is linked to the five samples that constitute a basic raster. It will be noted that a coding scheme has a delay expressed as a number of samples. To obtain a time value from it, the sampling frequency at which the coder is used must be taken into account, according to the relation:
delay (in seconds) = delay (in samples) / sampling frequency (in Hz)
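The conversion from a delay counted in samples to a delay in time can be illustrated with a minimal Python sketch. The function name `delay_ms` and the sampling rates used below are assumptions chosen for illustration and do not come from the patent text; only the five-sample raster is taken from the description above.

```python
def delay_ms(delay_samples: int, sampling_rate_hz: float) -> float:
    """Convert a delay expressed in samples into milliseconds.

    Hypothetical helper: time = samples / sampling frequency.
    """
    if sampling_rate_hz <= 0:
        raise ValueError("sampling_rate_hz must be positive")
    return 1000.0 * delay_samples / sampling_rate_hz


if __name__ == "__main__":
    # Five-sample raster, at an assumed sampling rate of 8 kHz.
    print(delay_ms(5, 8000.0))
    # The same raster length at an assumed 48 kHz gives a shorter time delay.
    print(delay_ms(5, 48000.0))
```

The same delay in samples therefore corresponds to different time delays depending on the sampling frequency, which is why the sampling frequency has to be stated alongside a delay quoted in samples.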
Coding quality is a parameter that is difficult to define, since the final receiver, that is to say the listener's ear, cannot give precise quantitative results. Furthermore, measurements such as the signal-to-noise ratio are not relevant because they do not take into account the psychoacoustic masking properties of the auditory system. Statistical techniques such as those recommended by Recommendation ITU-R BS.1116 make it possible to distinguish different coding algorithms with respect to coding quality.
It will be noted, however, that an improvement of the signal-to-noise ratio achieved over the whole frequency range of the sound signal makes it possible to ensure an improvement of the perceived quality.
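For reference, the signal-to-noise ratio mentioned above is commonly computed as the ratio of the energy of the original signal to the energy of the coding error, expressed in decibels. The following Python sketch shows that conventional definition; the function name `snr_db` and the sample values are illustrative assumptions, and the patent does not specify how the ratio is measured.

```python
import math


def snr_db(reference, decoded):
    """Signal-to-noise ratio in dB between a reference and a decoded signal.

    Hypothetical helper using the conventional definition:
    10 * log10(signal energy / error energy).
    """
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")
    signal_energy = sum(x * x for x in reference)
    noise_energy = sum((x - y) ** 2 for x, y in zip(reference, decoded))
    if noise_energy == 0:
        return math.inf  # perfect reconstruction
    if signal_energy == 0:
        return -math.inf  # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)


if __name__ == "__main__":
    original = [1.0, -1.0, 1.0, -1.0]
    reconstructed = [0.9, -0.9, 0.9, -0.9]  # made-up values with a 10 % error
    print(snr_db(original, reconstructed))
```

Such a figure weights all errors equally, whether or not they are audible, which is the limitation pointed out above.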
Coding systems for generic digital audio signals, that is to say systems that make no assumption about how these signals are produced, have not until now seriously treated the signal reconstruction delay as a constraint. One exception, however, is illustrated by the process described by F. Rumsey in the article titled "Hearing both sides - stereo sound for TV in the UK", published in IEE Review, vol. 36, No. 5, pp. 173-176. In this process, however, the compression levels reached cannot compete with those of classical transform coders.
Among the algorithms standardized by ISO (ISO/IEC 13818-3), the minimal reconstruction delays range from 18 ms for the simplest, and therefore least efficient, coder to more than 100 ms for the most complex coder. Other coding processes not standardized by ISO, such as the so-called ASPEC (Adaptive Spectral Perceptual Entropy Coding) process described by K. Brandenburg et al, or the so-called ATRAC (Adaptive Transform Acoustic Coding) process
REFERENCES:
patent: 4956871 (1990-09-01), Swaminathan
patent: 5495552 (1996-02-01), Sugiyama
patent: 5630010 (1997-05-01), Sugiyama
Grant Davidson and Allen Gersho, "Multiple-Stage Vector Excitation Coding of Speech Waveforms," Proc. IEEE ICASSP 88, p. 163-166, Apr. 1988.
Bernhard Grill and Karlheinz Brandenburg, "A Two- or Three-Stage Bit Rate Scalable Audio Coding System," Proc. 99th Convention of the Audio Engineering Society, preprint 4132, p. 1-8, Oct. 1995.
France Telecom (SA)
Hudspeth David R.
Smits Talivaldis Ivars
Telediffusion de France SA
Whitesel J. Warren