Speech recognition apparatus for AV equipment

Data processing: speech signal processing – linguistics – language – Speech signal processing – Application

Reexamination Certificate


Details

C704S226000, C704S233000

active

06665645

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to speech recognition apparatuses, and more specifically, to a speech recognition apparatus used for AV equipment such as a TV, radio, and audio system that reproduces multichannel audio including two-channel stereo, capable of controlling the AV equipment through voice, inputting information to the AV equipment through voice, and carrying out other operations even if audio is reinforced by loudspeakers.
2. Description of the Background Art
A conventional speech recognition technique with audio reinforced by a loudspeaker is exemplarily disclosed in Japanese Patent Laid-Open Publication No. 5-22779 (1993-22779) (Title: SPEECH RECOGNITION REMOTE CONTROLLER).
FIG. 23 is a block diagram showing the configuration of a conventional speech recognition apparatus for AV equipment using the technique disclosed in the above publication. The speech recognition apparatus of FIG. 23 is used for AV equipment with a single loudspeaker 201. In FIG. 23, the conventional speech recognition apparatus includes a microphone 202, a speech recognition unit 203, and an echo canceller 204.
With reference to FIG. 24, the operation of the above-configured conventional speech recognition apparatus for AV equipment is now described. FIG. 24 is a diagram showing time waveforms of signals inputted to or outputted from the components of the speech recognition apparatus of FIG. 23. In FIG. 24, consider the case where a user utters speech to control the equipment while audio is reinforced by the loudspeaker 201.
When the user speaks without the audio being reinforced by the loudspeaker 201, a speech signal outputted from the microphone 202 is extremely good in S/N ratio, as indicated by a reference numeral 211 in FIG. 24. When an audio signal 212 for a TV program is inputted to the loudspeaker 201, an echo signal 213 that is similar to the loudspeaker input 212 is mixed into an output from the microphone 202.
Therefore, the microphone 202 outputs a signal with the user's speech 211 and the echo signal 213 mixed therein, as indicated by a reference numeral 214 of FIG. 24. This signal is too low in S/N ratio for recognition of the user's speech. Naturally, with such microphone output 214, sufficient speech recognition results by the speech recognition unit 203 cannot be expected.
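As a rough illustration of the S/N degradation described above, the following sketch computes the ratio of speech power to echo power in decibels. The sample values and the function name snr_db are hypothetical and are not taken from the publication; they only show how an echo stronger than the speech yields a negative S/N ratio at the microphone.

```python
import math

def snr_db(speech, interference):
    """Ratio of mean speech power to mean interference power, in decibels."""
    speech_power = sum(s * s for s in speech) / len(speech)
    interference_power = sum(v * v for v in interference) / len(interference)
    return 10 * math.log10(speech_power / interference_power)

# Hypothetical samples: the echo (cf. 213) has twice the amplitude of the speech (cf. 211).
speech = [0.2, -0.2, 0.2, -0.2]
echo = [0.4, 0.4, -0.4, -0.4]

# The microphone output (cf. 214) is the sum of both signals.
mic_output = [s + e for s, e in zip(speech, echo)]

# Twice the amplitude means four times the power, so the ratio is negative in dB.
ratio = snr_db(speech, echo)
```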
Thus, in the speech recognition apparatus of FIG. 23, the echo signal 213 echoed to the microphone 202 from the loudspeaker 201 is estimated by an adaptive digital filter provided in the echo canceller 204. A subtraction circuit in the echo canceller 204 subtracts the estimated echo signal from the microphone output 214 to totally cancel out the echo signal 213, thereby extracting only the user's speech 211.
The echo canceller 204 is provided with the loudspeaker input 212, which is an input signal to the loudspeaker 201. The adaptive digital filter in the echo canceller 204 estimates an echo signal 215 from the waveform of the loudspeaker input 212 and an impulse response from the loudspeaker 201 through the microphone 202 that is stored therein. Then, the subtraction circuit provided in the echo canceller 204 subtracts the estimated echo signal 215 from the microphone output 214 to obtain an echo canceller output 216.
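The estimate-and-subtract operation of the paragraph above can be sketched as follows. The publication says only that an adaptive digital filter is used; the normalized LMS (NLMS) update, the tap count, the step size, and the synthetic room response in this example are assumptions chosen for illustration, not the disclosed implementation.

```python
import random

def nlms_echo_cancel(speaker, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `speaker` from `mic` (NLMS, assumed)."""
    weights = [0.0] * taps   # running estimate of the loudspeaker-to-microphone impulse response
    history = [0.0] * taps   # most recent loudspeaker samples, newest first
    output = []
    for x, d in zip(speaker, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))  # cf. signal 215
        error = d - echo_estimate                                     # cf. output 216
        energy = sum(h * h for h in history) + eps
        # Normalized LMS update: move the weights along the input, scaled by the error.
        weights = [w + mu * error * h / energy for w, h in zip(weights, history)]
        output.append(error)
    return output

def mean_power(signal):
    return sum(s * s for s in signal) / len(signal)

# Synthetic demonstration with a hypothetical 4-tap room response and no user speech.
rng = random.Random(0)
n = 4000
speaker = [rng.uniform(-1.0, 1.0) for _ in range(n)]
room = [0.6, 0.3, -0.2, 0.1]
echo = [
    sum(room[k] * speaker[i - k] for k in range(len(room)) if i - k >= 0)
    for i in range(n)
]
out = nlms_echo_cancel(speaker, echo)

# Compare the residual with the echo over the last 500 samples, after adaptation.
residual_power = mean_power(out[-500:])
echo_power = mean_power(echo[-500:])
```

In this echo-only setting the residual power falls far below the echo power once the filter has adapted. When the user speaks at the same time, the speech acts as a disturbance to the update, and practical designs usually slow or freeze adaptation during such intervals; that refinement is outside this sketch.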
As known from the comparison between the echo canceller output 216 and the waveform of the user's speech 211, the speech recognition unit 203 can be expected to carry out correct speech recognition under the action of echo cancellation by the echo canceller 204 even when audio is reinforced by the loudspeaker 201.
However, the speech recognition apparatus of FIG. 23 supports only monaural AV equipment, and cannot be used for multichannel AV equipment using a plurality of loudspeakers.
FIG. 25 is a block diagram showing the configuration of another conventional speech recognition apparatus for AV equipment. The speech recognition apparatus of FIG. 25 is used for 2-channel AV equipment with two loudspeakers 221 and 222.
In this speech recognition apparatus, sound echoed from the loudspeaker 221 to the microphone 223 and sound echoed from the loudspeaker 222 to the microphone 223 are estimated by adaptive digital filters in the echo cancellers 225 and 226. By subtracting the estimated values from the output signal of the microphone, only the user's speech can be extracted. Unlike the speech recognition apparatus of FIG. 23, the speech recognition apparatus of FIG. 25 is adaptable to stereo AV equipment.
The speech recognition apparatus of FIG. 25, however, requires as many echo cancellers as audio channels. Therefore, it becomes too costly for use in multichannel AV equipment. Moreover, in such a system using a plurality of echo cancellers, mutual interference among the echo cancellers occurs, resulting in major drawbacks such as instability in the adaptive operation of each echo canceller and an increase in echo and oscillation due to failure in adaptation.
It is strongly desired that speech recognition apparatuses for AV equipment should carry out speech recognition while reproducing audio through a loudspeaker, support multichannel audio, ensure high reliability, and have a low price.
However, as described above, the conventional speech recognition apparatuses require as many echo cancellers as audio channels. Therefore, they become too costly for use in multichannel AV equipment.
Furthermore, mutual interference among the echo cancellers makes adaptive operation of each echo canceller extremely unstable, thereby causing an increase in echo and oscillation due to failure in adaptation, and as a result, decreasing speech recognition performance.
SUMMARY OF THE INVENTION
Therefore, an object of the present invention is to achieve a low-cost speech recognition apparatus for multichannel AV equipment capable of speech recognition with high accuracy while multichannel sound is being produced from loudspeakers.
The present invention has the following features to solve the problems above.
A first aspect of the present invention is directed to a speech recognition apparatus used for AV equipment outputting multichannel sound through a plurality of loudspeakers, capable of recognizing user's speech inputted through a microphone and causing the AV equipment to perform a predetermined process, the apparatus comprising:
a monaural conversion part for converting multichannel signals to the plurality of loudspeakers into a monaural signal;
a single echo canceller, provided with an output from the microphone (microphone output) and an output from the monaural conversion part (monaural output), for estimating echo sound of the multichannel sound based on the monaural signal and eliminating the echo sound from the microphone output; and
a speech recognition part for recognizing the user's speech based on an output from the single echo canceller (echo canceller output).
In the first aspect, the multichannel signals are converted into a monaural signal, which is provided to the single echo canceller. The single echo canceller eliminates echo sound of multichannel sound from the microphone output. Therefore, with only a single echo canceller, speech recognition can be carried out while multichannel sound is produced from the loudspeakers, irrespective of the number of channels. Furthermore, unlike the case where a plurality of echo cancellers are provided, the present invention can prevent mutual interference among the echo cancellers that leads to deterioration in speech recognition performance.
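The monaural conversion can be sketched as below. The text here does not state how the channels are combined, so the sample-by-sample average in to_mono is an assumption, as are the room responses h_left and h_right and the sample values. The example checks one special case: when every channel carries the same program signal, the echo reaching the microphone equals the monaural signal filtered by the sum of the per-loudspeaker responses, which is a single response that one adaptive filter could in principle learn.

```python
def to_mono(channels):
    """Average equal-length channel signals sample by sample (assumed downmix rule)."""
    count = len(channels)
    return [sum(samples) / count for samples in zip(*channels)]

def convolve(signal, response):
    """Causal FIR filtering: output[i] = sum over k of response[k] * signal[i - k]."""
    return [
        sum(response[k] * signal[i - k] for k in range(len(response)) if i - k >= 0)
        for i in range(len(signal))
    ]

# Hypothetical impulse responses from each loudspeaker to the microphone.
h_left = [0.5, 0.2, 0.1]
h_right = [0.3, -0.1, 0.05]

# Special case: both channels carry the same program signal.
program = [1.0, -0.5, 0.25, 0.0, 0.75, -1.0]
left, right = program, program

mono = to_mono([left, right])

# Echo actually reaching the microphone: each channel through its own path.
echo_at_mic = [a + b for a, b in zip(convolve(left, h_left), convolve(right, h_right))]

# The same echo predicted from the monaural signal and one combined response.
h_combined = [a + b for a, b in zip(h_left, h_right)]
echo_from_mono = convolve(mono, h_combined)

max_error = max(abs(a - b) for a, b in zip(echo_at_mic, echo_from_mono))
```

When the channels differ, no single response applied to the monaural signal reproduces the echo exactly, so the cancellation obtained from one canceller is only partial.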
According to a second aspect, in the first aspect, the multichannel signals are provided to the plurality of loudspeakers.
In the second aspect, multichannel sound is produced from the plurality of loudspeakers. Therefore, echo sound cannot be completely cancelled out with the monaural signal. However, if a monaural level of the multichannel signals is closer to 1, echo sound can be cancelled out for the most part. At least p
