Audio coding and decoding methods and apparatuses and...

Data processing: speech signal processing – linguistics – language – Audio signal bandwidth compression or expansion

Reexamination Certificate

Details

C704S220000, C704S207000, C704S200100, C704S219000, C704S223000

active

06810381

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to a method for encoding an input acoustic signal with a small amount of information by an audio coding scheme which determines codebook indices that will minimize an error between the input acoustic signal and a synthesized signal by its encoding, and a method for decoding the encoded information into the acoustic signal with high quality.
The CELP (Code Excited Linear Prediction) coding is a typical example of conventional low bit rate audio coding through a linear prediction (LP) coding scheme.
FIG. 1 is a block diagram for explaining the general outlines of the CELP coding scheme. An input acoustic signal is applied via an input terminal 11 to an LP coding part 12, which performs an LPC analysis of the acoustic signal for each frame of about 5 to 20 ms to obtain p-th order linear predictive (LP) coefficients $\hat{\alpha}_i$, where i = 1, ..., p. The LP coefficients $\hat{\alpha}_i$ are quantized in a quantization part 13, and the resulting quantized LP coefficients $\hat{\alpha}_i$ are set as filter coefficients in an LP synthesis filter 14. The transfer function of the LP synthesis filter 14 is expressed by the following Equation (1):
$$\frac{1}{A(z)} = \frac{1}{1 + \sum_{i=1}^{p} \hat{\alpha}_i z^{-i}} \qquad (1)$$
An excitation signal for the LP synthesis filter 14 is stored in an adaptive codebook 15. A segment (vector) of the excitation signal is cut out of the adaptive codebook 15 in accordance with input codes from a control part 16, and the cut-out segment is repeatedly duplicated and concatenated to form a pitch component vector of one frame length. The pitch component vector is fed to a multiplier 22, wherein it is multiplied by a gain $g_1$ selected from a gain codebook 17, and the multiplier output is provided as the excitation signal to the synthesis filter 14 via an adder 18. A synthesized signal from the synthesis filter 14 is subtracted by a subtractor 19 from the input acoustic signal to generate an error signal. The error signal is provided to a perceptual weighting filter 20, wherein it is weighted in accordance with the masking effect of auditory perception. The control part 16 searches the adaptive codebook 15 for the index (i.e., the pitch lag) that will minimize the power of the weighted error signal. Thereafter, the control part 16 fetches noise vectors from a fixed codebook 21 in sequential order. Each noise vector is multiplied in a multiplier 23 by a gain $g_2$ selected from the gain codebook 17, the multiplier output is added by the adder 18 to the pitch component vector previously selected from the adaptive codebook 15, and the adder output is applied as an excitation signal to the synthesis filter 14. As in the adaptive codebook search, the noise vector is chosen that minimizes the energy of the perceptually weighted error signal from the perceptual weighting filter 20. Finally, for the respective excitation vectors selected from the adaptive and fixed codebooks 15 and 21, the gain codebook 17 is searched for the gains $g_1$ and $g_2$ that minimize the power of the output from the perceptual weighting filter 20.
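The adaptive codebook search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names (`lp_synthesis`, `pitch_vector`, `search_adaptive_codebook`) are hypothetical, the perceptual weighting filter 20 is omitted, the filter starts from a zero state in each frame, and the gain is the unquantized least-squares optimum rather than an entry of the gain codebook 17.

```python
def lp_synthesis(excitation, a):
    """All-pole filter 1/A(z) with A(z) = 1 + sum_i a[i-1] * z^-i (zero initial state)."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                y -= ai * out[n - i]
        out.append(y)
    return out


def pitch_vector(past_excitation, lag, frame_len):
    """Cut the last `lag` samples of past excitation and repeat them to one frame length."""
    segment = past_excitation[-lag:]
    return [segment[n % lag] for n in range(frame_len)]


def best_gain_and_error(target, synth):
    """Least-squares gain for `synth` against `target`, and the remaining error energy."""
    energy = sum(s * s for s in synth)
    if energy == 0.0:
        return 0.0, sum(t * t for t in target)
    gain = sum(t * s for t, s in zip(target, synth)) / energy
    err = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
    return gain, err


def search_adaptive_codebook(target, past_excitation, a, lags):
    """Try each candidate lag and keep the one with the smallest error energy.

    Assumption: plain (unweighted) squared error stands in for the
    perceptually weighted error of the text.
    """
    best = None
    for lag in lags:
        synth = lp_synthesis(pitch_vector(past_excitation, lag, len(target)), a)
        gain, err = best_gain_and_error(target, synth)
        if best is None or err < best[2]:
            best = (lag, gain, err)
    return best  # (lag, gain, error energy)
```

The fixed codebook search follows the same pattern, with the candidate noise vector added to the already selected pitch component before synthesis.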
FIG. 2 is a block diagram for explaining the general outlines of a decoding scheme for the CELP coded acoustic signal. An LP coefficient code in input codes provided via an input terminal 31 is decoded in a decoding part 32, and the quantized LP coefficients $\alpha_i$ obtained by this decoding are set as filter coefficients in an LP synthesis filter 33. A pitch index in the input codes is used to cut out a pitch component vector from an adaptive codebook 34, and a fixed codebook index is used to select a random component vector from a fixed codebook 35. The pitch component and random component vectors thus provided from the codebooks 34 and 35 are multiplied in multipliers 52 and 53 by gains $g_1$ and $g_2$ selected from a gain codebook 36 in accordance with a gain index in the input codes, and are then added together by an adder 37, whose output is provided as an excitation signal to the LP synthesis filter 33. A post filter processes the synthesized signal from the synthesis filter 33 so as to reduce perceived quantization noise, and provides the processed signal as a decoded acoustic signal to an output terminal 39.
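The decoding steps above can be sketched as follows. This is an illustrative sketch under assumptions, not the patent's decoder: `decode_frame` and its parameters are hypothetical names, the codebooks are plain Python lists, the post filter is omitted, and the synthesis filter starts from a zero state in each frame.

```python
def lp_synthesis(excitation, a):
    """All-pole filter 1/A(z) with A(z) = 1 + sum_i a[i-1] * z^-i (zero initial state)."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                y -= ai * out[n - i]
        out.append(y)
    return out


def decode_frame(pitch_lag, fixed_index, gain_index,
                 past_excitation, fixed_codebook, gain_codebook, a, frame_len):
    """Rebuild one frame from the received indices (hypothetical interface)."""
    # Pitch component: cut the last `pitch_lag` samples and repeat to frame length.
    segment = past_excitation[-pitch_lag:]
    pitch_vec = [segment[n % pitch_lag] for n in range(frame_len)]
    # Random component: the fixed codebook entry named by the index.
    random_vec = fixed_codebook[fixed_index]
    # Gains g1 and g2 are stored as a pair per gain index (an assumption).
    g1, g2 = gain_codebook[gain_index]
    excitation = [g1 * p + g2 * c for p, c in zip(pitch_vec, random_vec)]
    synthesized = lp_synthesis(excitation, a)
    return synthesized, excitation
```

In a running decoder the returned excitation would be appended to the adaptive codebook so that later frames can refer back to it.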
As described above, in CELP or similar time-domain audio coding the conventional synthesis filter is formed by a 10th to 20th order auto-regressive LP filter for modeling the spectral envelope of speech, or by its combination with a comb filter of a single pitch frequency modeled after a glottal source; hence, it cannot express the fine spectral structure of a musical sound, which has many irregularly spaced stationary peaks in the frequency domain. A method for reflecting the fine spectral structure in the synthesis filter was proposed by the inventors of this application in Japanese Patent Application Laid-Open Gazette No. 9-258795 and in the literature “A 16 KBIT/S WIDEBAND CELP CODER WITH A HIGH-ORDER BACKWARD PREDICTOR AND ITS FAST COEFFICIENT CALCULATION,” IEEE, pp.107-108, 1997 (hereinafter referred to as Literature 1). According to the proposed method, the LP synthesis filter in FIG. 1 is formed by a cascade connection of a p-th order (about 10th to 20th order, for instance) LP synthesis filter and an n-th order LP synthesis filter of sufficiently higher order. LP coefficients obtained by a p-th order linear prediction coding (LPC) analysis of the input signal are provided as coefficients of the p-th order LP synthesis filter, and LP coefficients obtained by an n-th order LPC analysis of a residual signal resulting from LP inverse filtering of a synthesized signal are provided as coefficients of the n-th order LP synthesis filter. With such cascade-connected synthesis filters, it is possible to express both the spectral envelope and the fine structure of the input signal.
With the above method, in the coding apparatus of FIG. 1 the LP synthesis filter 14 is formed by a cascade connection of a p-th order LP synthesis filter of relatively low order (a 10th to 20th order synthesis filter commonly used in conventional speech coding, hereinafter referred to as a low-order synthesis filter) and an n-th order LP synthesis filter (a 100th or higher order synthesis filter, hereinafter referred to as a high-order synthesis filter). The low-order synthesis filter is used to define the spectral envelope of the input acoustic signal, and the high-order synthesis filter is used to express the fine spectral structure of the synthesized signal that cannot fully be expressed with the p-th order coefficients. Hence, it is possible to achieve higher audio coding quality.
This method allows expressing the envelope of the fine spectral structure, and hence it permits high quality encoding of a signal, such as a musical sound, whose fine spectral structure contains a plurality of pitches. However, using the high-order synthesis filter amounts to obtaining an average spectrum of the input signal samples over a long analysis window, so it is impossible to detect short-time variations in the spectral structure, for example, the minute pitch changes found in speech. For this reason, when this method is applied to a signal that has a component abruptly changing with time, such as the vibration of human vocal cords or the attack of a musical sound, the audio coding quality is degraded by an echo-like noise.
In literature by the inventors of this application, “Wideband CELP Coding using Higher Order Backward Prediction of Residual,” Technical Report of IEICE, SP97-64, pp.51-56, November, 1997 (hereinafter referred to as Literature 2), there is disclosed a scheme which employs a synthesis filter formed by a cascade connection of high- and low-order synthesis filters as proposed in the afore-mentioned Japanese patent ap
