Reexamination Certificate
1999-03-08
2001-08-07
Dorvil, Richemond (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S219000, C704S220000, C704S223000
active
06272460
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to electronic speech recognition systems and relates more particularly to a method for implementing a speech verification system for use in a noisy environment.
2. Description of the Background Art
Implementing an effective and efficient method for system users to interface with electronic devices is a significant consideration of system designers and manufacturers. Voice-controlled operation of electronic devices is a desirable interface for many system users because it allows a user to perform other tasks simultaneously. For instance, a person may drive a vehicle while operating an electronic organizer by voice control. Hands-free operation of electronic systems may also be desirable for users who have physical limitations or other special requirements.
Hands-free operation of electronic devices may be implemented by various speech-activated electronic systems. Speech-activated electronic systems thus advantageously allow users to interface with electronic devices in situations where it would be inconvenient or potentially hazardous to utilize a traditional input device.
Speech-activated electronic systems may be used in a variety of noisy environments, for instance industrial facilities, manufacturing facilities, commercial vehicles, and passenger vehicles. A significant amount of noise in an environment may interfere with and degrade the performance and effectiveness of speech-activated systems. System designers and manufacturers typically seek to develop speech-activated systems that provide reliable performance in noisy environments.
In a noisy environment, sound energy detected by a speech-activated system may contain speech and a significant amount of noise. In such an environment, the speech may be masked by the noise and be undetected. This result is unacceptable for reliable performance of the speech-activated system.
Alternatively, sound energy detected by the speech-activated system may contain only noise. The noise may be of such a character that the speech-activated system identifies the noise as speech. This result reduces the effectiveness of the speech-activated system, and is also unacceptable for reliable performance. Verifying that a detected signal is actually speech increases the effectiveness and reliability of speech-activated systems.
Therefore, for all the foregoing reasons, implementing an effective and efficient method for a system user to interface with electronic devices remains a significant consideration of system designers and manufacturers.
SUMMARY OF THE INVENTION
In accordance with the present invention, a method is disclosed for implementing a speech verification system for use in a noisy environment. In one embodiment, the invention includes the steps of generating a confidence index for an utterance using a speech verifier, and controlling the speech verifier with a processor. The speech verifier includes a noise suppressor, a pitch detector, and a confidence determiner.
The utterance preferably includes frames of sound energy, and a pre-processor generates a frequency spectrum for each frame n in the utterance. The noise suppressor suppresses noise in the frequency spectrum for each frame n in the utterance. Each frame n has a corresponding frame set that includes frame n and a selected number of previous frames. The noise suppressor suppresses noise in the frequency spectrum for each frame by summing together the spectra of frames in the corresponding frame set to generate a spectral sum. Spectra of frames in a frame set are similar, but not identical. Prior to generating the spectral sum, the noise suppressor aligns the frequencies of each spectrum in the frame set with the spectrum of a base frame of the frame set.
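The spectral summation described above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each frame's spectrum is a list of non-negative magnitudes, that frame n serves as the base frame, and that alignment is an integer bin shift chosen to maximize the inner product with the base spectrum. The names `shift_spectrum`, `best_shift`, `spectral_sum`, `set_size`, and `max_shift` are hypothetical.

```python
def shift_spectrum(spectrum, shift):
    """Shift a magnitude spectrum by `shift` bins, zero-filling the gap."""
    n = len(spectrum)
    out = [0.0] * n
    for k, value in enumerate(spectrum):
        j = k + shift
        if 0 <= j < n:
            out[j] = value
    return out


def best_shift(base, other, max_shift):
    """Integer bin shift of `other` that maximizes its inner product with `base`.

    Assumption: the summary does not say how frequencies are aligned; a
    small exhaustive search over shifts is used here for illustration.
    Smaller shifts are tried first, so ties favor the smallest shift.
    """
    best, best_score = 0, float("-inf")
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        shifted = shift_spectrum(other, shift)
        score = sum(b * s for b, s in zip(base, shifted))
        if score > best_score:
            best, best_score = shift, score
    return best


def spectral_sum(spectra, n, set_size=3, max_shift=2):
    """Sum the aligned spectra of frame n and up to set_size - 1 previous frames."""
    base = spectra[n]  # assumption: frame n is the base frame
    start = max(0, n - set_size + 1)
    total = [0.0] * len(base)
    for frame in spectra[start:n + 1]:
        aligned = shift_spectrum(frame, best_shift(base, frame, max_shift))
        total = [t + a for t, a in zip(total, aligned)]
    return total
```

In this sketch a spectral peak that drifts by one bin per frame is moved back onto the base frame's peak before summing, so the peak is reinforced rather than smeared across neighboring bins.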
The pitch detector applies a spectral comb window to each spectral sum to produce correlation values for each frame in the utterance. The frequency that corresponds to the maximum correlation value is selected as the optimum frequency index. The pitch detector also applies an alternate spectral comb window to each spectral sum to produce alternate correlation values for each frame in the utterance. The frequency that corresponds to the maximum alternate correlation value is selected as the optimum alternate frequency index.
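As a rough illustration of the comb-window step, the Python sketch below correlates a spectral sum with an unweighted comb whose teeth sit at integer multiples of a candidate frequency index, then picks the index with the largest correlation. The tooth weighting, the number of teeth, and the search range are assumptions. The alternate comb window is not defined in this summary, so it is not modeled. The names `comb_correlation` and `detect_pitch` are hypothetical.

```python
def comb_correlation(spectral_sum, k, num_teeth=5):
    """Correlate the spectrum with a comb whose teeth sit at bins k, 2k, 3k, ...

    Assumption: every tooth has unit weight; the actual window shape is
    not given in this summary.
    """
    total = 0.0
    for h in range(1, num_teeth + 1):
        idx = h * k
        if idx >= len(spectral_sum):
            break  # remaining teeth fall outside the spectrum
        total += spectral_sum[idx]
    return total


def detect_pitch(spectral_sum, k_min, k_max, num_teeth=5):
    """Return (optimum frequency index, correlation value per candidate index)."""
    correlations = {
        k: comb_correlation(spectral_sum, k, num_teeth)
        for k in range(k_min, k_max + 1)
    }
    optimum = max(correlations, key=correlations.get)
    return optimum, correlations
```

For a spectrum with harmonics at bins 8, 16, 24, 32, and 40, the comb at index 8 lands on every harmonic, while combs at neighboring indices or at the sub-harmonic index 4 catch fewer of them within five teeth.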
The confidence determiner evaluates the correlation values to produce a frame confidence measure for each frame in the utterance. First, the confidence determiner calculates a harmonic index for each frame. The harmonic index indicates whether the spectral sum for each frame contains peaks at more than one frequency. Next, the confidence determiner evaluates a maximum peak of the correlation values for each frame to determine a frame confidence measure for each frame.
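The summary does not give formulas for the harmonic index or the frame confidence measure, so the following Python sketch only shows one plausible shape for them. It assumes non-negative spectra and correlation values. It assumes the harmonic index is 1 when at least two comb teeth stand well above the mean spectral level. It also assumes the frame confidence is the relative height of the maximum correlation peak above the mean correlation. The names `harmonic_index`, `frame_confidence`, and `floor_ratio` are hypothetical.

```python
def harmonic_index(spectral_sum, k, num_teeth=5, floor_ratio=2.0):
    """Return 1 if peaks appear at two or more multiples of index k, else 0.

    Assumption: a "peak" is a bin exceeding floor_ratio times the mean
    spectral level; the source does not define the test.
    """
    mean_level = sum(spectral_sum) / len(spectral_sum)
    strong = 0
    for h in range(1, num_teeth + 1):
        idx = h * k
        if idx < len(spectral_sum) and spectral_sum[idx] > floor_ratio * mean_level:
            strong += 1
    return 1 if strong >= 2 else 0


def frame_confidence(correlations, h_index):
    """Score in [0, 1]: how far the maximum correlation peak rises above the mean.

    Assumption: a frame without multiple harmonic peaks (h_index == 0)
    receives zero confidence.
    """
    values = list(correlations)
    peak = max(values)
    if h_index == 0 or peak <= 0:
        return 0.0
    mean = sum(values) / len(values)
    return (peak - mean) / peak
```

Under these assumptions a pure tone, which has a single spectral peak, yields a harmonic index of 0 and therefore no confidence, while a harmonic series with a sharp correlation maximum scores close to 1.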
The confidence determiner then uses the frame confidence measures to generate the confidence index for the utterance, which indicates whether the utterance is speech or not speech. The present invention thus efficiently and effectively implements a speech verification system for use in a noisy environment.
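How the frame confidence measures are combined is not stated in this summary. As a minimal sketch, the Python function below declares an utterance to be speech when a sufficient fraction of its frames are individually confident. The rule itself, the thresholds, and the names `confidence_index`, `frame_threshold`, and `min_fraction` are all assumptions made for illustration.

```python
def confidence_index(frame_confidences, frame_threshold=0.5, min_fraction=0.5):
    """Return 1 (speech) if enough frames are individually confident, else 0.

    Assumption: a simple fraction-of-frames vote stands in for the
    unspecified combination rule.
    """
    if not frame_confidences:
        return 0  # an empty utterance cannot be verified as speech
    confident = sum(1 for c in frame_confidences if c >= frame_threshold)
    return 1 if confident / len(frame_confidences) >= min_fraction else 0
```

For example, `confidence_index([0.9, 0.8, 0.1, 0.7])` treats three of four frames as confident and reports speech under the default thresholds.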
REFERENCES:
patent: 4737976 (1988-04-01), Borth et al.
patent: 5428707 (1995-06-01), Gould et al.
patent: 5675704 (1997-10-01), Juang et al.
patent: 5778342 (1998-07-01), Erell et al.
patent: 6023674 (2000-02-01), Mekuria
patent: 6052659 (2000-04-01), Mermelstein
patent: 6070135 (2000-05-01), Kim et al.
patent: 6070137 (2000-05-01), Bloebaum et al.
patent: 6084967 (2000-07-01), Kennedy et al.
Martin, Philippe, “Comparison of Pitch Detection By Cepstrum and Spectral Comb Analysis,” Proceedings of ICASSP, 1982, pp. 180-183.
Tucker, R., “Voice Activity Detection Using A Periodicity Measure,” IEEE Proceedings-1, vol. 139, No. 4, Aug. 1992, pp. 377-380.
Hermes, Dik J., “Pitch Analysis,” Visual Representations of Speech Signals, 1993, pp. 3-15.
Olorenshaw Lex
Tanaka Miyuki
Wu Duanpei
Dorvil Richemond
Koerner Gregory J.
McFadden Susan
Simon & Koerner LLP
Sony Corporation