Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission
Reexamination Certificate
1999-10-21
2002-03-05
{haeck over (S)}mits, T{overscore (a)}livaldis Ivars (Department: 2741)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S201000, C704S211000
Reexamination Certificate
active
06353808
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to an apparatus and a method for encoding a signal by quantizing an input signal through time base/frequency base conversion as well as to an apparatus and a method for decoding an encoded signal. More particularly, the present invention relates to an apparatus and a method for encoding a signal that can be suitably used for encoding audio signals in a highly efficient way. It also relates to an apparatus and a method for decoding an encoded signal.
2. Prior Art
Various methods for encoding an audio signal are known to date including those adapted to compress the signal by utilizing statistic characteristics of audio signals (including voice signals and music signals) in terms of time and frequency and characteristic traits of the human hearing sense. Such coding methods can be roughly classified into encoding in the time region, encoding in the frequency region and analytic/synthetic encoding.
In the operation of transform coding of encoding an input signal on the time base by orthogonally transforming it into a signal on the frequency base, it is desirable from the viewpoint of coding efficiency that the characteristics of the time base waveform of the input signal are removed before subjecting it to transform coding.
Additionally, when quantizing the coefficient data on the orthogonally transformed frequency base, the data are more often than not weighted for bit allocation. However, it is not desirable to transmit the information on the bit allocation as additional information or side information because it inevitably increases the bit rate.
In view of these circumstances, it is therefore an object of the present invention to provide an apparatus and a method for encoding a signal that are adapted to remove the characteristic or correlative aspects of the time base waveform prior to orthogonal transform in order to improve the coding efficiency and, at the same time, reduce the bit rate by making the corresponding decoder able to know the bit allocation without directly transmitting the information on the bit allocation used for the quantizing operation.
Meanwhile, for the operation of transform coding of encoding an input signal on the time base by orthogonally transforming it into a signal on the frequency base, techniques have been proposed to quantize the coefficient data on the frequency base by dynamically allocating bits in response to the input signal in order to realize a low coding rate. However, cumbersome arithmetic operations are required for the bit allocation particularly when the bit allocation changes for each coefficient in the operation of dividing coefficient data on the frequency base in order to produce sub-vectors for vector quantization.
Additionally, the reproduced sound can become highly unstable when the bit allocation changes extremely for each frame that provides a unit for orthogonal transform.
In view of these circumstances, it is therefore another object of the present invention to provide an apparatus and a method for encoding a signal that are adapted to dynamically allocate bits in response to the input signal with simple arithmetic operations for the bit allocation and reproduce sound without making it unstable if the bit allocation changes remarkably among frames for the operation of encoding the input signal that involves orthogonal transform as well as an apparatus and a method for decoding a signal encoded by such an apparatus and a method.
Additionally, since quantization takes place after the bit allocation for the coefficient on the frequency base such as the MDCT coefficient in the operation of transform coding of encoding an input signal on the time base by orthogonally transforming it into a signal on the frequency base, quantization errors spreads over the entire orthogonal transform block length on the time base to give rise to harsh noises such as pre-echo and post-echo. This tendency is particularly remarkable for sounds that relatively quickly attenuate between pitch peaks. This problem is conventionally addressed by switching the transform window size (so-called window switching). However, this technique of switching the transform window size involves cumbersome processing operations because it is not easy to detect the right window having the right size.
In view of the above circumstances, it is therefore still another object of the present invention to provide an apparatus and a method for encoding a signal adapted to reduce harsh noises such as pre-echo and post-echo without modifying the transform window size as well as an apparatus and a method for decoding a signal encoded by such an apparatus and a method.
SUMMARY OF THE INVENTION
According to a first aspect of the invention, the above objectives are achieved by providing a method for encoding an input signal on the time base through orthogonal transform, said method comprising:
a step of removing the correlation of signal waveform on the basis of the parameters obtained by means of linear predictive coding (LPC) analysis and pitch analysis of the input signal on the time base prior to the orthogonal transform.
Preferably, the input time base signal is transformed to coefficient data on the frequency base by means of modified discrete cosine transform (MDCT) in said orthogonal transform step. Preferably, in said normalization step, the LPC analysis residue of said input signal is output on the basis of the LPC coefficient obtained through LPC analysis of said input signal and the correlation of the pitch of said LPC prediction residue is removed on the basis of the parameters obtained through pitch analysis of said LPC prediction residue. Preferably, said quantization means quantizes according to the number of allocated bits determined on the basis of the outcome of said LPC analysis and said pitch analysis.
According to a second aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform, said method comprising:
a calculating step of calculating weights as a function of said input signal; and
a quantizing step of determining an order for the coefficient data obtained through the orthogonal transform according to the order of the calculated weights and carrying out an accurate quantizing operation according to the determined order.
Preferably, in said quantizing step, a larger number of allocated bits are used for quantization for the coefficient data of a higher order.
Preferably, the coefficient data obtained through said orthogonal transform are divided into a plurality of bands on the frequency base and the coefficient data of each of the bands are quantized according to said determined order of said weights independently from the remaining bands.
Preferably, the coefficient data of each of the bands are divided into a plurality of groups in the descending order of the bands to defines respective coefficient vectors and each of the obtained coefficient vectors is subjected to vector quantization.
According to a third aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform on a frame by frame basis, each frame providing a coding unit, said method comprising:
an envelope extracting step of an extracting envelope within each frame of said input signal; and
a gain smoothing step of carrying out a gain smoothing operation on said input signal on the basis of the envelope extracted by said envelope extracting step and supplying the input signal for said orthogonal transform.
Preferably, the input time base signal is transformed to coefficient data on the frequency base by means of modified discrete cosine transform (MDCT) for said orthogonal transform. Preferably, the information on said envelope is quantized and output. Preferably, said frame is divided into a plurality of sub-frames and said envelope is determined as the root means square (rms) value of each of the divided sub-frames. Preferably, the rms value of each of the divided sub-frames is quanti
Makino Kenichi
Matsumoto Jun
Nishiguchi Masayuki
Maioli Jay B.
Nolan Daniel A
Sony Corporation
{haeck over (S)}mits T{overscore (a)}livaldis Ivars
LandOfFree
Apparatus and method for encoding a signal as well as... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Apparatus and method for encoding a signal as well as..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Apparatus and method for encoding a signal as well as... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2839263