Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission
Reexamination Certificate
1999-11-15
2002-02-05
Tsang, Fan (Department: 2645)
active
06345247
ABSTRACT:
TECHNICAL FIELD
The present invention relates to an excitation vector generator capable of obtaining a high-quality synthesized speech, and a speech coder and a speech decoder which can code and decode a high-quality speech signal at a low bit rate.
BACKGROUND ART
A CELP (Code Excited Linear Prediction) type speech coder executes linear prediction for each of the frames obtained by segmenting speech at a fixed time interval, and codes the predictive residuals (excitation signals) resulting from the frame-by-frame linear prediction, using an adaptive codebook in which past excitation vectors are stored and a random codebook in which a plurality of random code vectors are stored. For instance, “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rate,” M. R. Schroeder, Proc. ICASSP '85, pp. 937-940, discloses a CELP type speech coder.
FIG. 1 illustrates the schematic structure of a CELP type speech coder. The CELP type speech coder separates vocal information into excitation information and vocal tract information and codes them. With regard to the vocal tract information, an input speech signal 10 is input to a filter coefficients analysis section 11 for linear prediction, and the linear predictive coefficients (LPCs) are coded by a filter coefficients quantization section 12. Supplying the linear predictive coefficients to a synthesis filter 13 allows vocal tract information to be added to excitation information in the synthesis filter 13. With regard to the excitation information, an excitation vector search in an adaptive codebook 14 and a random codebook 15 is carried out for each segment obtained by further segmenting a frame (called a subframe). The search in the adaptive codebook 14 and the search in the random codebook 15 are processes of determining the code number and gain (pitch gain) of an adaptive code vector, and the code number and gain (random code gain) of a random code vector, that minimize the coding distortion in equation 1.
$$\| v - (g_a H p + g_c H c) \|^2 \qquad (1)$$

v: speech signal (vector)
H: impulse response convolution matrix of the synthesis filter,

$$
H = \begin{bmatrix}
h(0) & 0 & \cdots & \cdots & 0 & 0 \\
h(1) & h(0) & 0 & \cdots & 0 & 0 \\
h(2) & h(1) & h(0) & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & h(0) & 0 \\
h(L-1) & \cdots & \cdots & \cdots & h(1) & h(0)
\end{bmatrix}
$$

where h: impulse response (vector) of the synthesis filter
L: frame length
p: adaptive code vector
c: random code vector
g_a: adaptive code gain (pitch gain)
g_c: random code gain
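As a concrete illustration of equation 1, the following minimal Python sketch builds the lower-triangular convolution matrix H from an impulse response and evaluates the coding distortion. The function names and the four-sample toy vectors are assumptions made for illustration only; they do not come from the patent.

```python
def convolution_matrix(h):
    """Lower-triangular Toeplitz matrix with H[i][j] = h[i - j] for i >= j."""
    L = len(h)
    return [[h[i - j] if i >= j else 0.0 for j in range(L)] for i in range(L)]


def matvec(M, x):
    """Plain matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def coding_distortion(v, H, p, c, ga, gc):
    """Equation 1: ||v - (ga*H*p + gc*H*c)||^2."""
    Hp = matvec(H, p)
    Hc = matvec(H, c)
    return sum((vi - (ga * a + gc * b)) ** 2 for vi, a, b in zip(v, Hp, Hc))


# Toy data (hypothetical): a decaying impulse response and unit-pulse code vectors.
h = [1.0, 0.5, 0.25, 0.125]
H = convolution_matrix(h)
p = [1.0, 0.0, 0.0, 0.0]
c = [0.0, 1.0, 0.0, 0.0]

# Construct a "speech" vector that the gains 0.8 and 0.5 reproduce exactly.
v = matvec(H, [0.8 * a + 0.5 * b for a, b in zip(p, c)])
print(coding_distortion(v, H, p, c, 0.8, 0.5))  # essentially zero
print(coding_distortion(v, H, p, c, 0.0, 0.0))  # energy of v itself
```

Because v is built from the same code vectors and gains, the first distortion is essentially zero, while zero gains leave the full energy of v as distortion.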
Because a closed-loop search for the combination of codes that minimizes equation 1 involves a vast amount of computation, however, an ordinary CELP type speech coder first performs the adaptive codebook search to specify the code number of an adaptive code vector, and then executes the random codebook search based on that result to specify the code number of a random code vector.
The random codebook search by the CELP type speech coder will now be explained with reference to FIGS. 2A through 2C. In the figures, x is the target vector for the random codebook search, obtained by equation 2. It is assumed that the adaptive codebook search has already been completed.
$$x = v - g_a H p \qquad (2)$$

where x: target (vector) for the random codebook search
v: speech signal (vector)
H: impulse response convolution matrix of the synthesis filter
p: adaptive code vector
g_a: adaptive code gain (pitch gain)
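The target computation of equation 2 can be sketched as follows. The product Hp is evaluated here as a zero-state convolution with the impulse response, which is equivalent to multiplying by the lower-triangular matrix H. The helper names and the two-sample values are hypothetical.

```python
def synthesize(h, e):
    """Zero-state filtering, equal to H*e: y[n] = sum_k h[k] * e[n - k]."""
    return [sum(h[k] * e[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(e))]


def random_search_target(v, h, p, ga):
    """Equation 2: x = v - ga*H*p, the part of v left for the random codebook."""
    Hp = synthesize(h, p)
    return [vi - ga * a for vi, a in zip(v, Hp)]


# Toy values (hypothetical).
h = [1.0, 0.5]
p = [1.0, 0.0]
v = [2.0, 1.5]
ga = 1.0
x = random_search_target(v, h, p, ga)
print(x)
```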
The random codebook search is a process of specifying a random code vector c which minimizes the coding distortion defined by equation 3 in a distortion calculator 16, as shown in FIG. 2A.

$$\| x - g_c H c \|^2 \qquad (3)$$

where x: target (vector) for the random codebook search
H: impulse response convolution matrix of the synthesis filter
c: random code vector
g_c: random code gain.
The distortion calculator 16 controls a control switch 21 to switch the random code vector to be read from the random codebook 15 until the random code vector c is specified.
An actual CELP type speech coder has the structure shown in FIG. 2B to reduce the computational complexity, and a distortion calculator 16′ carries out a process of specifying the code number which maximizes the distortion measure in equation 4.
$$
\frac{(x^t H c)^2}{\| H c \|^2}
= \frac{((x^t H)\, c)^2}{\| H c \|^2}
= \frac{({x'}^t c)^2}{\| H c \|^2}
= \frac{({x'}^t c)^2}{c^t H^t H c}
\qquad (4)
$$

where x: target (vector) for the random codebook search
H: impulse response convolution matrix of the synthesis filter
H^t: transposed matrix of H
x'^t: time reverse synthesis of x using H (x'^t = x^t H)
c: random code vector.
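The link between equations 3 and 4 can be made explicit. The text does not state how the gain is chosen during the search, so the derivation below assumes that the random code gain is the least-squares optimum for each candidate c. Setting the derivative of equation 3 with respect to the gain to zero gives

$$
\frac{\partial}{\partial g_c} \| x - g_c H c \|^2 = -2\, x^t H c + 2\, g_c \| H c \|^2 = 0
\quad\Longrightarrow\quad
g_c = \frac{x^t H c}{\| H c \|^2}
$$

and substituting this gain back into equation 3 gives

$$
\min_{g_c} \| x - g_c H c \|^2 = \| x \|^2 - \frac{(x^t H c)^2}{\| H c \|^2}
$$

Since the term \| x \|^2 does not depend on c, minimizing equation 3 over the codebook is, under this assumption, equivalent to maximizing the measure in equation 4.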
Specifically, the random codebook control switch 21 is connected to one terminal of the random codebook 15 and the random code vector c is read from the address corresponding to that terminal. The read random code vector c is synthesized with vocal tract information by the synthesis filter 13, producing a synthesized vector Hc. Then, the distortion calculator 16′ computes the distortion measure in equation 4 using a vector x′ obtained by a time reverse process of the target x, the vector Hc resulting from synthesis of the random code vector in the synthesis filter, and the random code vector c. As the random codebook control switch 21 is switched, computation of the distortion measure is performed for every random code vector in the random codebook.
Finally, the number of the terminal to which the random codebook control switch 21 was connected when the distortion measure in equation 4 became maximum is sent to a code output section 17 as the code number of the random code vector.
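The search loop described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the four-entry toy codebook, and the choice to return the least-squares gain alongside the code number are all assumptions.

```python
def synthesize(h, e):
    """Zero-state filtering, equal to H*e: y[n] = sum_k h[k] * e[n - k]."""
    return [sum(h[k] * e[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(e))]


def backward_filter(h, x):
    """Time reverse synthesis x' = H^t x: x'[j] = sum_{i >= j} h[i - j] * x[i]."""
    L = len(x)
    return [sum(h[i - j] * x[i] for i in range(j, min(L, j + len(h))))
            for j in range(L)]


def search_random_codebook(x, h, codebook):
    """Return (code number, gain) maximizing (x'^t c)^2 / ||Hc||^2 (equation 4)."""
    xp = backward_filter(h, x)  # computed once, reused for every candidate
    best_index, best_measure, best_gain = -1, -1.0, 0.0
    for index, c in enumerate(codebook):
        Hc = synthesize(h, c)
        energy = sum(y * y for y in Hc)
        if energy == 0.0:
            continue  # an all-zero candidate cannot be scored
        corr = sum(a * b for a, b in zip(xp, c))
        measure = corr * corr / energy
        if measure > best_measure:
            best_index, best_measure, best_gain = index, measure, corr / energy
    return best_index, best_gain


# Toy data (hypothetical): the target is 0.7 times the synthesized entry 1.
h = [1.0, 0.6, 0.3, 0.1]
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
x = synthesize(h, [0.0, 0.7, 0.0, 0.0])
best, gain = search_random_codebook(x, h, codebook)
print(best, gain)
```

Computing x′ once per subframe replaces an inner product with the synthesized vector Hc by an inner product with c itself, but the energy term still requires synthesizing every candidate.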
FIG. 2C shows a partial structure of a speech decoder. The switching of the random codebook control switch 21 is controlled in such a way as to read out the random code vector that has the transmitted code number. After the transmitted random code gain g_c and filter coefficients are set in an amplifier 23 and a synthesis filter 24, respectively, the random code vector is read out to restore synthesized speech.
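The decoder path just described (codebook lookup, amplifier, synthesis filter) might be sketched as below. This is only a partial illustration matching the partial structure of FIG. 2C: it omits the adaptive codebook contribution, and the function name, the all-pole filter form 1/A(z) with A(z) = 1 + a1·z⁻¹ + … , and the sign convention of the coefficients are assumptions, not details taken from the patent.

```python
def decode_subframe(codebook, code_number, gc, lpc, state=None):
    """Scale the selected code vector by gc and run it through 1/A(z).

    lpc holds [a1, ..., aM] for A(z) = 1 + a1*z^-1 + ... + aM*z^-M (assumed
    convention). state holds past outputs, most recent first.
    """
    order = len(lpc)
    memory = list(state) if state is not None else [0.0] * order
    speech = []
    for sample in codebook[code_number]:
        # Amplifier followed by the recursive (all-pole) synthesis filter.
        y = gc * sample - sum(a * m for a, m in zip(lpc, memory))
        speech.append(y)
        memory = ([y] + memory)[:order]
    return speech, memory


# Toy data (hypothetical): a first-order filter with a pole at 0.5.
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
speech, memory = decode_subframe(codebook, code_number=0, gc=2.0, lpc=[-0.5])
print(speech)
```

Returning the filter memory lets a caller carry the filter state from one subframe to the next.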
In the above-described speech coder/speech decoder, the greater the number of random code vectors stored as excitation information in the random codebook 15, the more likely it is that the search finds a random code vector close to the excitation vector of actual speech. As the capacity of the random codebook (ROM) is limited, however, it is not possible to store countless random code vectors corresponding to all the excitation vectors in the random codebook. This restricts improvement in speech quality.
An algebraic excitation has also been proposed which can significantly reduce the computational complexity of calculating coding distortion in a distortion calculator and can eliminate the random codebook (ROM) (described in “8 KBIT/S ACELP CODING OF SPEECH WITH 10 MS SPEECH-FRAME: A CANDIDATE FOR CCITT STANDARDIZATION”: R. Salami, C. Laflamme, J-P. Adoul, ICASSP '94, pp. II-97 to II-100, 1994).
The algebraic excitation considerably reduces the complexity of computing the coding distortion by computing in advance the convolution of the impulse response of the synthesis filter with a time-reversed target and the autocorrelation of the synthesis filter, and storing the results in memory. Further, the ROM in which random code vectors have been stored is eliminated by generating random code vectors algebraically. CS-ACELP and ACELP, which use the algebraic excitation, have been recommended by the ITU-T as G.729 and G.723.1, respectively.
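The precomputation described above can be sketched as follows. With d = H^t x and Φ = H^t H stored in memory, the measure of equation 4 for a code vector made of a few signed unit pulses reduces to table lookups, and no candidate needs to be passed through the synthesis filter. The function names, the eight-sample subframe, and the two-track pulse layout are hypothetical and are not the layouts used by G.729 or G.723.1.

```python
def backward_filter(h, x):
    """d = H^t x: d[j] = sum_{i >= j} h[i - j] * x[i]."""
    L = len(x)
    return [sum(h[i - j] * x[i] for i in range(j, L)) for j in range(L)]


def autocorrelation_matrix(h, L):
    """Phi = H^t H: Phi[i][j] = sum_{n >= max(i, j)} h[n - i] * h[n - j]."""
    return [[sum(h[n - i] * h[n - j] for n in range(max(i, j), L))
             for j in range(L)] for i in range(L)]


def pulse_measure(d, phi, positions, signs):
    """Equation 4 for c = sum_k signs[k] * (unit pulse at positions[k])."""
    pulses = list(zip(positions, signs))
    corr = sum(s * d[m] for m, s in pulses)          # x'^t c
    energy = sum(phi[m][m] for m, _ in pulses)       # diagonal of c^t Phi c
    for a in range(len(pulses)):
        for b in range(a + 1, len(pulses)):
            (ma, sa), (mb, sb) = pulses[a], pulses[b]
            energy += 2.0 * sa * sb * phi[ma][mb]    # cross terms
    return corr * corr / energy


# Toy data (hypothetical). h must have at least as many samples as x.
h = [1.0, 0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0]
x = [0.2, -1.0, 0.4, 0.9, -0.3, 0.1, 0.5, -0.2]
d = backward_filter(h, x)                # computed once per subframe
phi = autocorrelation_matrix(h, len(x))  # computed once per subframe

# Exhaustive search over one pulse per track, each with sign +1 or -1.
tracks = ([0, 2, 4, 6], [1, 3, 5, 7])
candidates = [((m0, m1), (s0, s1))
              for m0 in tracks[0] for m1 in tracks[1]
              for s0 in (1, -1) for s1 in (1, -1)]
best_positions, best_signs = max(
    candidates, key=lambda ps: pulse_measure(d, phi, ps[0], ps[1]))
print(best_positions, best_signs)
```

Because the code vector is fully described by its pulse positions and signs, only those values need to be transmitted and no table of code vectors has to be stored.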
In the CELP type speech coder/speech decoder equipped with the above-described algebraic excitation in a random codebook section, however, a target for a random codebook search
Ehara Hiroyuki
Morii Toshiyuki
Yasunaga Kazutoshi
Greenblum & Bernstein P.L.C
Matsushita Electric Industrial Co., Ltd.
Opsasnick Michael N.
Tsang Fan
Excitation vector generator, speech coder and speech decoder