HMM-based echo model for noise cancellation avoiding the...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S226000, C704S233000

Reexamination Certificate

active

06606595

ABSTRACT:

TECHNICAL FIELD
This invention relates to automatic speech recognition and more particularly to automatic speech recognition methods and systems for interacting with telephone callers across a network.
BACKGROUND OF THE INVENTION
In telephony, especially mobile telephony, speech signals are often degraded by the presence of acoustic background noise as well as by system-introduced interference. Such degradations have an adverse effect on both the perceived quality and the intelligibility of speech, as well as on the performance of speech processing applications in the network. To improve the perceived speech quality, noise reduction algorithms are implemented in cellular handsets, often in conjunction with network echo cancellers, as shown by an article of E. J. Diethorn, “A subband noise-reduction method for enhancing speech in telephony and teleconferencing,” IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1997. The most common methods for noise reduction assume that acoustic noise and speech are picked up by one microphone. These methods are mostly based on spectral magnitude subtraction, where the short-term spectral amplitude of noise is estimated during speech pauses and subtracted from the noisy microphone signal, as shown in the article of J. S. Lim and A. V. Oppenheim, “Enhancement and bandwidth compression of noisy speech,” Proc. IEEE, Vol. 67, pp. 1586-1604, 1979. The spectral magnitude subtraction method inherently implies the use of a voice activity detector (VAD) to determine at every frame whether there is speech present in that frame, such as found in U.S. Pat. No. 5,956,675, which is hereby incorporated by reference, and a related article by A. R. Setlur and R. A. Sukkar, entitled “Recognition-based word counting for reliable barge-in and early endpoint detection in continuous speech recognition,” Proc. ICSLP, pp. 823-826, 1998. The performance of these methods depends a great deal on the efficacy of the VAD. Even though about 12 to 18 dB of noise reduction can be achieved in real-world settings, spectral subtraction can produce musical tones and other artifacts which further degrade speech recognition performance, as indicated in an article by C. Mokbel and G. Chollet, “Automatic word recognition in cars,” IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 5, pp. 346-356, 1995.
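The spectral magnitude subtraction described above can be illustrated with a minimal Python sketch. It is not taken from the cited articles: it assumes the magnitude spectra and the per-frame VAD decisions are already available, and the names `estimate_noise`, `spectral_subtract`, and `floor` are hypothetical. The noise estimate is simply the average magnitude over frames marked as pauses, and negative results are clamped to a floor.

```python
def estimate_noise(frames, is_speech):
    """Average the magnitude spectra of frames the VAD marks as non-speech."""
    pauses = [f for f, speech in zip(frames, is_speech) if not speech]
    if not pauses:
        raise ValueError("need at least one non-speech frame to estimate noise")
    n_bins = len(pauses[0])
    return [sum(f[k] for f in pauses) / len(pauses) for k in range(n_bins)]


def spectral_subtract(frames, is_speech, floor=0.0):
    """Subtract the noise estimate from every frame, clamping at `floor`."""
    noise = estimate_noise(frames, is_speech)
    return [[max(m - n, floor) for m, n in zip(frame, noise)] for frame in frames]


# Hypothetical two-bin magnitude spectra: two pause frames, then one speech frame.
frames = [[1.0, 2.0], [1.0, 2.0], [5.0, 2.5]]
vad = [False, False, True]  # assumed VAD output, one flag per frame
cleaned = spectral_subtract(frames, vad)
```

Because the noise estimate depends entirely on which frames the VAD labels as pauses, a VAD error feeds directly into the subtracted spectrum, which is consistent with the dependence on VAD efficacy noted above.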
When the interference causing the degradation is an echo of the system announcement, then echo cancellers are able to reduce this type of echo by up to about 25 dB and such echo cancellers generate very few artifacts. However, if the echo is loud and the incoming speech is quiet, the residual echo energy following cancellation still begins to approach the level of the incoming speech. Such echoes sometimes cause false triggering of automatic speech recognition, especially in systems that allow users to interrupt prompts with spoken input. It is desirable to reduce or remove any such false triggers and the speech recognition errors they cause. Even when these echoes do not cause automatic speech recognition errors, such echoes do interfere with the recognition of valid input, and it is desirable to reduce any such interference.
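The level arithmetic behind this observation can be sketched in a few lines of Python. The signal levels below are hypothetical values chosen only for illustration; the 25 dB default is the approximate cancellation figure quoted above, and the function names are assumptions.

```python
def residual_echo_db(echo_db, cancellation_db=25.0):
    """Level of the echo left over after the canceller, in dB."""
    return echo_db - cancellation_db


def speech_to_residual_db(speech_db, echo_db, cancellation_db=25.0):
    """How far the caller's speech sits above the residual echo, in dB."""
    return speech_db - residual_echo_db(echo_db, cancellation_db)


def db_to_power_ratio(db):
    """Convert a dB difference to a linear power (energy) ratio."""
    return 10.0 ** (db / 10.0)


# Hypothetical levels: a loud echo and a quiet caller.
loud_echo_db = -10.0
quiet_speech_db = -33.0
margin_db = speech_to_residual_db(quiet_speech_db, loud_echo_db)
margin_ratio = db_to_power_ratio(margin_db)
```

With these assumed levels the speech is only 2 dB above the residual echo, an energy ratio of roughly 1.6, so the two are of the same order of magnitude even after 25 dB of cancellation.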
SUMMARY OF THE INVENTION
Briefly stated, in accordance with one aspect of the invention, the aforementioned shortcomings are addressed and an advance in the art is achieved by providing a method for preventing a false triggering error from an echo of an audible prompt in an interactive automatic speech recognition system which uses a plurality of hidden Markov models of the system's vocabulary, with each of the hidden Markov models corresponding to a phrase that is at least one word long. The method includes the steps of receiving an input which has signals that correspond to a caller's speech and an echo of the audible prompt of the interactive automatic speech response system; and using a hidden Markov model of the audible prompt's echo along with the plurality of hidden Markov models of the system's vocabulary in the automatic speech recognition system to match the input when an energy of the echo of the audible prompt is at most the same order of magnitude as the energy of the signals that correspond to the caller's speech, instead of falsely triggering a match to one of the plurality of hidden Markov models of the vocabulary.
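The idea of letting an echo model compete with the vocabulary models can be sketched as follows. This is a toy illustration under stated assumptions, not the patented implementation: it uses small discrete-observation HMMs with made-up parameters (a real recognizer would typically score continuous acoustic features), it does not model the energy condition in the method, and the names `log_likelihood`, `recognize`, `YES`, `NO`, and `ECHO` are hypothetical. Each model is scored with the forward algorithm, and the input triggers a vocabulary match only if the echo model is not the best-scoring one.

```python
import math


def _log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def logsumexp(values):
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(hmm, obs):
    """Forward algorithm in the log domain for a discrete-observation HMM."""
    start, trans, emit = hmm
    n = len(start)
    alpha = [_log(start[i]) + _log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + _log(trans[i][j]) for i in range(n)])
            + _log(emit[j][o])
            for j in range(n)
        ]
    return logsumexp(alpha)


def recognize(obs, vocabulary, echo_model):
    """Return the best vocabulary entry, or None if the echo model wins."""
    scores = {name: log_likelihood(h, obs) for name, h in vocabulary.items()}
    scores["<echo>"] = log_likelihood(echo_model, obs)
    best = max(scores, key=scores.get)
    return None if best == "<echo>" else best


# Toy two-state left-to-right models: (start, transitions, emissions).
# Symbols (assumed): 0 = echo-like frame, 1 and 2 = two speech-like classes.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
YES = ([1.0, 0.0], LEFT_TO_RIGHT, [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1]])
NO = ([1.0, 0.0], LEFT_TO_RIGHT, [[0.2, 0.1, 0.7], [0.1, 0.1, 0.8]])
ECHO = ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]])

vocabulary = {"yes": YES, "no": NO}
echo_only = recognize([0, 0, 0, 0], vocabulary, ECHO)    # echo model absorbs it
spoken_yes = recognize([1, 1, 0, 1], vocabulary, ECHO)   # vocabulary model wins
```

Without the echo model, the echo-only input would still be assigned to whichever vocabulary model scored highest, which is the false trigger the method is meant to avoid.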
In accordance with another aspect of the invention, the aforementioned shortcomings are addressed and an advance in the art is achieved by a speech recognition system for connection to a telephone network and telephone equipment of a caller, which introduce an echo of a prompt played by the speech recognition system, the system being arranged to reduce false triggering by the echo. The speech recognition system includes a network interface unit connecting the speech recognition system to a telephone line of the telephone network. When the network interface unit receives a call from a caller via said telephone network, a play-prompt unit plays a prompt via the network interface unit to the caller to prompt a response from the caller. At the same time, also in response to the call and in response to the playing of the prompt, a network echo canceller partially cancels the echo of the prompt that is present in the call received by the network interface unit. The echo canceller is connected to an automatic speech recognizer and sends the input from the caller along with the partially cancelled echo of the prompt to the automatic speech recognizer. The automatic speech recognizer has a prompt echo model, which prevents the recognizer from falsely triggering on the partially cancelled echo, and the automatic speech recognizer correctly recognizes the caller's response.
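The echo-cancelling stage of such a system can be sketched in Python. The source does not say which adaptation algorithm the network echo canceller uses; the sketch below assumes a normalized least-mean-squares (NLMS) adaptive filter purely for illustration, with a hypothetical one-tap echo path and hypothetical parameter names (`taps`, `mu`, `eps`). The prompt is the reference signal, and the residual is what would be passed on to the recognizer.

```python
import random


def nlms_cancel(prompt, line_in, taps=4, mu=0.5, eps=1e-8):
    """Return line_in minus an adaptively estimated echo of prompt (NLMS)."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent prompt samples, newest first
    residual = []
    for x, d in zip(prompt, line_in):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # what is left after cancellation
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def energy(samples):
    return sum(s * s for s in samples)


# Hypothetical test signal: a white-noise "prompt" and an echo path that
# delays it by one sample and attenuates it by half. No caller speech.
rng = random.Random(0)
prompt = [rng.uniform(-1.0, 1.0) for _ in range(600)]
echo = [0.0] + [0.5 * p for p in prompt[:-1]]
residual = nlms_cancel(prompt, echo)
```

In this idealized, noise-free setting the residual decays to almost nothing. On a real line the cancellation is only partial, as the text states, so the residual still reaches the recognizer and the prompt echo model is what keeps it from being matched to a vocabulary entry.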


REFERENCES:
patent: 5708704 (1998-01-01), Fisher
patent: 5761090 (1998-06-01), Gross et al.
patent: 5956675 (1999-09-01), Setlur et al.
patent: 5978763 (1999-11-01), Bridges
patent: 6226612 (2001-05-01), Srenger et al.
Chengalvarayan et al., “HMM-Based Echo Model for Noise Cancellation Avoiding the Problem of False Triggers,” ICPACS 2000, Nov. 2000.*
R. Chengalvarayan, “On the use of normalized LPC error towards better large vocabulary speech recognition systems,” Proc. ICASSP, pp. 17-20, 1998.
E.J. Diethorn, “A subband noise-reduction method for enhancing speech in telephony and teleconferencing,” IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1997.
C. Mokbel and G. Chollet, “Automatic word recognition in cars,” IEEE Transactions on Speech and Audio Processing, vol. 3, No. 5, pp. 346-356, 1995.
A.R. Setlur and R.A. Sukkar, “Recognition-based word counting for reliable barge-in and early endpoint detection in continuous speech recognition,” Proc. ICSLP, pp. 823-826, 1998.
J.S. Lim and A.V. Oppenheim, “Enhancement and bandwidth compression of noisy speech,” Proc. IEEE, vol. 67, pp. 1586-1604, 1979.
R. Chengalvarayan, “Hybrid HMM architectures for robust-speech recognition and language identification,” Proc. Systemics, Cybernetics and Informatics, vol. 6, pp. 5-8, 2000.
C-M. Liu, C-C. Chiu and H-Y. Chang, “Design of Vocabulary-Independent Mandarin Keyword Spotters,” IEEE Transactions on Speech and Audio Processing, vol. 8, No. 4, pp. 483-487, 2000.
M.M. Sondhi and D.A. Berkley, “Silencing echoes on the telephone network,” Proc. IEEE, vol. 68, pp. 948-963, 1980.
S.L. Gay and J. Benesty, “Acoustic signal processing for telecommunication,” Kluwer Academic Publishers, pp. 1-19, 2000.
S. Katagiri, B-H. Juang and C-H. Lee, “Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method,” Proc. IEEE, vol. 86, No. 11, 1998, pp. 2345-2373.
