Device and method of channel effect compensation for...

Telephonic communications – Audio message storage – retrieval – or synthesis – Voice activation or recognition

Reexamination Certificate


Details

C379S088070, C379S201010, C379S406010, C379S406100, C379S088030

Reexamination Certificate

active

06456697

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the invention
The present invention generally relates to a telephone speech recognition system. More particularly, the present invention relates to a device and method of channel effect compensation for telephone speech recognition.
2. Description of the Related Art
In a speech recognition application over a telephone network, speech signals are input from the handset of a telephone and transmitted through a telephone line to a remote speech recognition system for recognition. The path that the speech signals traverse includes the telephone handset and the telephone line, which are together referred to as a “telephone channel” or “channel”. In terms of signal transmission, the characteristics of the telephone channel affect the speech signals during transmission; this is referred to as a “telephone channel effect” or “channel effect”. Mathematically, the telephone channel introduces a convolved component, determined by its impulse response, into the speech signals.
FIG. 1 is a diagram illustrating a typical telephone speech recognition system. As shown in FIG. 1, a speech signal x(t) sent by the calling party becomes a telephone speech signal y(t) after passing through the telephone channel 1, which comprises the telephone handset and the telephone line, and is input to the recognition system 10 for further processing. The recognition result R is generated by the recognition system 10. Here, assuming the impulse response of the telephone channel 1 to be h(t), the relationship between the speech signal x(t) and the telephone speech signal y(t) can be represented by:
$$y(t) = x(t) \otimes h(t) \tag{1}$$
where the symbol “$\otimes$” represents the convolution operator. Most importantly, the impulse response h(t) of the telephone channel 1 varies with the caller's handset and with the transmission path of the speech signals in the telephone network (the transmission path being determined by the switching equipment). In other words, the same phone call (the same speech signal x(t)) will generate different telephone speech signals y(t) through different telephone channels (different impulse responses h(t)). This environmental variation will affect the recognition rate of the recognition system 10. Therefore, compensation of the telephone channel effect should be performed before telephone speech recognition to reduce such environmental variation.
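As a minimal illustration of equation (1), the sketch below passes the same short signal through two different channels and shows that the received signals differ. It works on discrete-time samples rather than continuous t, and the sample values and the two impulse responses `h_a` and `h_b` are made up for illustration; none of them come from the patent.

```python
def convolve(x, h):
    """Linear discrete convolution: y[n] = sum over m of x[m] * h[n - m]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for m, xm in enumerate(x):
        for k, hk in enumerate(h):
            y[m + k] += xm * hk
    return y


# Hypothetical speech samples and two hypothetical channel impulse responses.
x = [1.0, 2.0, 3.0]
h_a = [1.0, 0.5]       # stands in for one handset and line
h_b = [0.8, 0.0, 0.2]  # stands in for a different handset and switching path

y_a = convolve(x, h_a)
y_b = convolve(x, h_b)

# The same x yields different received signals through different channels.
print(y_a)
print(y_b)
```

The recognizer only ever sees `y_a` or `y_b`, never `x`, which is why the channel has to be compensated before recognition.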
The principle of typical telephone channel effect compensation is briefly described as follows. Equation (1) represents the relationship between the speech signal x(t) and the telephone speech signal y(t) in the time domain. Transformed to the spectral domain, equation (1) can be represented by:
$$Y(f) = X(f) \cdot |H(f)|^{2} \tag{2}$$
where X(f) and Y(f) represent the power spectra of the speech signal x(t) and the telephone speech signal y(t), respectively, and H(f) represents the transfer function of the telephone channel 1.
The following logarithmic spectral relation is obtained by taking the logarithm of both sides of equation (2):
$$\log[Y(f)] = \log[X(f)] + \log\left[|H(f)|^{2}\right] \tag{3}$$
The following is obtained when an inverse Fourier transform is used to project equation (3) onto the cepstral domain:
$$c_y(\tau) = c_x(\tau) + c_h(\tau) \tag{4}$$
where $c_x(\tau)$, $c_y(\tau)$, and $c_h(\tau)$ are the respective cepstral vectors of x(t), y(t), and h(t).
From equations (3) and (4), in the logarithmic spectral and cepstral domains, the influence of the telephone channel upon the speech signals in transmission can be described as an additive bias. Therefore, most current telephone channel effect compensation methods are based upon this principle; they differ in how the bias is estimated and how it is eliminated.
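The additive bias can be checked numerically. The sketch below is an illustration under stated assumptions, not the patent's procedure: it uses a discrete Fourier transform over one short frame and circular convolution, so that the product relation between the spectra holds exactly, and the sample values in `x` and `h` are invented. It computes the log power spectra and cepstra of x, h, and y and confirms that the channel appears as an added term in both domains.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


def log_power_spectrum(x):
    """Log of the squared DFT magnitude, one value per frequency bin."""
    return [math.log(abs(v) ** 2) for v in dft(x)]


def cepstrum(x):
    """Inverse DFT of the log power spectrum (real part)."""
    return [c.real for c in idft(log_power_spectrum(x))]


# Hypothetical frame of speech and hypothetical channel (zero-padded to length 8).
x = [2.0, 0.5, -0.3, 0.2, 0.0, 0.1, -0.2, 0.4]
h = [1.0, 0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]
y = circular_convolve(x, h)

# Log-spectral domain: log Y - log X should equal log |H|^2 in every bin.
spectral_error = max(
    abs(ly - lx - lh)
    for ly, lx, lh in zip(log_power_spectrum(y), log_power_spectrum(x), log_power_spectrum(h))
)

# Cepstral domain: c_y - c_x should equal c_h in every coefficient.
cepstral_error = max(
    abs(cy - cx - ch) for cy, cx, ch in zip(cepstrum(y), cepstrum(x), cepstrum(h))
)

print(spectral_error, cepstral_error)  # both are at rounding-error level
```

With real speech the relation holds only approximately, because frames are finite and the convolution is linear rather than circular.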
FIG. 2 is a block diagram illustrating a conventional telephone speech recognition system. As shown in the figure, the telephone speech recognition system comprises a feature analysis section 100, a channel effect compensation section 102, and a recognizer 104 (comprising a speech recognition section 104a for speech recognition and acoustic models 104b). Feature analysis section 100 first blocks the received telephone speech signal y(t) into frames, performs feature analysis on each telephone speech frame, and generates a corresponding feature vector o(t). In accordance with the description of equations (3) and (4) above, the feature vector o(t) may be a logarithmic spectral vector or a cepstral vector. Channel effect compensation section 102 subsequently performs compensation of the feature vector o(t), and the resulting feature vector ô(t) is input to the recognizer 104. Speech recognition section 104a performs the actual speech recognition according to the acoustic models 104b and generates the desired recognition result R.
The three most popular telephone channel effect compensation techniques are the following: the relative spectral technique (RASTA), cepstral mean normalization (CMN), and signal bias removal (SBR). The first technique adopts a fixed filter type, whereas the last two techniques calculate the bias from feature vectors of a telephone speech signal. These conventional techniques are briefly described below with reference to the following publications, the content of which is expressly incorporated herein by reference.
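The framing and feature analysis performed by feature analysis section 100 might be sketched as follows. This is only an illustration: the frame length, hop size, rectangular window, log floor, and sample values are all assumptions chosen for brevity, and the function names `frame_signal` and `log_spectral_vector` are hypothetical, not taken from the patent.

```python
import cmath
import math


def frame_signal(y, frame_len, hop):
    """Split y into overlapping frames of frame_len samples, hop samples apart."""
    return [y[s:s + frame_len] for s in range(0, len(y) - frame_len + 1, hop)]


def log_spectral_vector(frame, floor=1e-12):
    """Log power spectrum of one frame (bins 0..N/2); floor avoids log(0)."""
    n = len(frame)
    vec = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        vec.append(math.log(max(abs(s) ** 2, floor)))
    return vec


# Hypothetical received telephone speech samples.
y = [0.0, 0.3, 0.9, 0.4, -0.2, -0.7, -0.5, 0.1, 0.6, 0.2]

frames = frame_signal(y, frame_len=4, hop=2)
features = [log_spectral_vector(f) for f in frames]  # one feature vector per frame

print(len(frames), len(features[0]))
```

Each entry of `features` plays the role of a logarithmic spectral vector o(t); a compensation stage would operate on the sequence of these vectors before they reach the recognizer.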
(A) RASTA: Refer to H. Hermansky and N. Morgan, “RASTA processing of speech,” IEEE Trans. on Speech and Audio Processing, vol. 2, pp. 578-589, 1994 for details. RASTA uses filters to eliminate the low-frequency components contained in the logarithmic spectral vectors or cepstral vectors, that is, the bias introduced by the telephone channel, for the purpose of channel effect compensation. According to the aforementioned analysis, bandpass infinite impulse response (IIR) filters expressed by the following equation (5) can perform quite well.
$$H(z) = 0.1 \times \frac{2 + z^{-1} - z^{-3} - 2z^{-4}}{z^{-4}\,(1 - 0.98\,z^{-1})} \tag{5}$$
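A bandpass filter of this kind can be sketched as a difference equation applied to the time trajectory of one feature component. The sketch below makes two assumptions that should be checked against the cited RASTA reference: the numerator coefficients are taken as 2, 1, 0, -1, -2 scaled by 0.1, and the filter is run in causal form, which shifts the output by four frames relative to a transfer function with z^-4 in the denominator. The function name `rasta_bandpass` and the test trajectory are hypothetical.

```python
def rasta_bandpass(track):
    """Causal RASTA-style bandpass filter on one feature trajectory.

    Difference equation (zero initial state):
        y[n] = 0.98 * y[n-1] + 0.1 * (2*x[n] + x[n-1] - x[n-3] - 2*x[n-4])
    """
    padded = [0.0] * 4 + list(track)  # zeros stand in for frames before the start
    out = []
    prev_y = 0.0
    for n in range(len(track)):
        x0, x1, x3, x4 = padded[n + 4], padded[n + 3], padded[n + 1], padded[n]
        fir = 0.1 * (2.0 * x0 + x1 - x3 - 2.0 * x4)
        prev_y = 0.98 * prev_y + fir
        out.append(prev_y)
    return out


# A constant trajectory models a pure channel bias with no speech content.
bias_only = [5.0] * 1000
filtered = rasta_bandpass(bias_only)

# After the start-up transient, the constant bias is filtered out.
print(abs(filtered[-1]))
```

Because the numerator coefficients sum to zero, the filter has no gain at zero frequency, which is what removes a constant bias.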
The purposes of using a bandpass filter are twofold: first, to filter out the bias by highpass filtering; and second, to smooth the rapidly changing spectra by lowpass filtering. If only telephone channel effect compensation is considered, only highpass filtering need be used. In that case, the transfer function of the highpass filter can be represented as follows:
$$H(z) = \frac{1 - z^{-1}}{1 - (1 - \lambda)\,z^{-1}} \tag{6}$$
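Equation (6) corresponds to the difference equation y[n] = x[n] - x[n-1] + (1 - λ) y[n-1]. The following sketch applies it to a feature trajectory with and without a constant bias and checks that the two outputs converge. The value λ = 0.05, the sinusoidal test trajectory, and the function name `highpass` are assumptions made for illustration only.

```python
import math


def highpass(track, lam):
    """First-order highpass: y[n] = x[n] - x[n-1] + (1 - lam) * y[n-1]."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in track:
        prev_y = x - prev_x + (1.0 - lam) * prev_y
        prev_x = x
        out.append(prev_y)
    return out


# Hypothetical trajectory of one feature component, with and without channel bias.
clean = [math.sin(0.3 * n) for n in range(400)]
biased = [c + 2.0 for c in clean]

out_clean = highpass(clean, lam=0.05)
out_biased = highpass(biased, lam=0.05)

# Largest difference over the last 50 frames: the bias has decayed away.
residual = max(abs(a - b) for a, b in zip(out_clean[-50:], out_biased[-50:]))
print(residual)
```

The filter is linear, so the difference between the two outputs is the response to the constant bias alone, which decays by a factor of (1 - λ) per frame. A smaller λ removes less of the slowly varying speech content but takes longer to forget the bias.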
RASTA has the advantage that it can be easily realized without causing response time delay problems. Its disadvantage, however, is that the frequency band of the filter is predetermined and cannot be adjusted to the input telephone speech signal. Therefore, some useful speech information may also be deleted when the bias introduced by the telephone channel effect is filtered out, and the recognition result will then be affected. As a result, the recognition result of a telephone speech recognition system obtained with the RASTA compensation method is less effective than those obtained with the CMN and SBR compensation methods.
(B) CMN: Refer to F. Liu, R. M. Stern, X. Huang, and A. Acero, “Efficient cepstral normalization for robust speech recognition,” Proc. of Human Language Technology, pp. 69-74, 1993 for details. CMN estimates the bias representing the characteristic of the telephone channel and eliminates it from the logarithmic spectral vectors or cepstral vectors of the telephone speech signal. In CMN, the bias is represented by the cepstral mean vector of the telephone speech signal. Since the bias is estimated from the telephone speech signal itself, the telephone channel characteristic can be captured and better compensation can be obtained. However, CMN assumes that the cepstral mean vector of the speech signal before it passes through the telephone channel is a zero vector. Experimental results have demonstrated that suc
