Speech detection device

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Patent

Details

704248, 704253, G10L 506, G10L 900

Patent

active

058262300

DESCRIPTION:

BRIEF SUMMARY
TECHNICAL FIELD

The invention generally relates to a device for the detection of the start and end of a segment containing speech within an input audio signal which contains both speech segments and nonspeech noise or background segments.


BACKGROUND ART

Detection of speech in real time is a necessary component for many devices, including but not limited to voice activated tape recorders, answering machines, automatic speech recognizers, and processors for removing speech from music. Many of these applications have noise inseparably mixed with speech. Detecting speech under these conditions requires a more sophisticated capability than that provided by conventional devices, which simply detect when the energy level rises above or falls below a preset threshold.
In the field of automatic speech recognition, the speech detection component is most critical. In practice, more speech recognition errors arise from errors in speech detection than from errors in pattern matching, which is commonly used to determine the content of the speech signal. One proposed solution is to use a word spotting technique, in which the recognizer is always listening for a particular word. However, if word spotting is not preceded by speech detection, the overall error rate can be high.
Many speech detection devices are based on a single parameter of the input, such as energy, pitch, or zero crossings. The performance of the speech detector depends heavily on the robustness of that parameter to background noise. For real time speech detection, the parameter must also be quickly extracted from the signal.
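
As an illustration of how cheaply two of the parameters named above can be extracted, the following minimal sketch computes the energy and the zero crossing count of a single frame of samples. The function names and the example frames are hypothetical and are not taken from the patent; the sketch shows the kind of per-frame parameter such detectors rely on, not any particular prior art device.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Count sign changes between consecutive samples in the frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


# Hypothetical example frames: same energy, very different crossing counts.
slow = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
fast = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(frame_energy(slow), zero_crossings(slow))
print(frame_energy(fast), zero_crossings(fast))
```

Both frames have the same energy, so an energy-only detector cannot tell them apart, while the zero crossing count separates them. Neither parameter by itself says anything about how robust the decision is to background noise.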


DISCLOSURE OF INVENTION

One of the objects of the present invention is to provide a device for the detection of speech which is capable of operation at a speed fast enough to keep up with the arrival of the input, i.e., real time.
Another object of the present invention is to provide a device for the detection of speech that can be implemented with a conventional digital signal processing circuit board.
Another object of the present invention is to provide a device for the detection of speech which is effective despite various types of noise mixed with the speech.
Another object of the present invention is to provide a speech detection device for various applications, including but not limited to: isolated word automatic speech recognizers, continuous speech recognizers (to detect pauses between phrases or sentences), voice controlled tape recorders, answering machines, and the processing of voice embedded in a recording with background noise or music.
These and other objects of the invention are achieved by the provision of a device for detecting speech in an input signal which includes means for determining a value representative of the smoothed frequency band limited energy within the signal, means for determining a variance of the value representative of the smoothed frequency band limited energy of the signal, and means for determining the beginning and ending points of speech within the signal based on the variance of the smoothed frequency band limited energy and the history of the band limited energy.
The invention exploits the variance in the smoothed frequency band limited energy and the history of the smoothed frequency band limited energy to detect the beginning and end of speech within an input speech signal. Variance of the smoothed frequency band limited energy is employed based on the observation that foreground speech occurring in a difficult background, such as a lead vocalist against a background of music, yields a noticeable fluctuation of the energy level above a "noise floor" of relatively low fluctuation. This effect occurs although the level of the background may be high. Variance quantifies that fluctuation of energy.
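
The observation above can be illustrated with a small sketch that computes the variance of the most recent energy values held in a fixed-length history. The function name, the history length of 4, and the example energy sequences are assumptions made for illustration; the patent text here does not give these values, and this is not presented as the patented procedure.

```python
from collections import deque


def sliding_variance(energies, history=4):
    """Population variance of the most recent `history` energy values.

    One variance is produced per input value once the history is full.
    """
    buf = deque(maxlen=history)  # oldest value drops out as each new one arrives
    out = []
    for e in energies:
        buf.append(e)
        if len(buf) == history:
            mean = sum(buf) / history
            out.append(sum((v - mean) ** 2 for v in buf) / history)
    return out


# Hypothetical energy tracks: a loud but steady background,
# and a foreground that fluctuates above the same floor.
steady_loud = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
fluctuating = [10.0, 14.0, 10.0, 14.0, 10.0, 14.0]

print(sliding_variance(steady_loud))
print(sliding_variance(fluctuating))
```

The steady track yields zero variance even though its level is high, while the fluctuating track yields a clearly nonzero variance. That contrast is the property the paragraph above describes.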
In accordance with the preferred embodiment, the device calculates smoothed frequency band limited energy using a Hamming window and a Fourier transform. The variance is calculated as a function of time from smoothed frequency band limited energy values stored in a shift register. To determine the beginni
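
The band limited energy computation named for the preferred embodiment (a Hamming window followed by a Fourier transform) might be sketched as below. This is only an illustration under stated assumptions: the 8 kHz sample rate, the 64-sample frame, and the 300 to 3400 Hz band are hypothetical values not given in the text, the function names are invented, a direct discrete Fourier transform stands in for whatever fast transform the device uses, and the smoothing step is omitted because its form is not described here.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def band_energy(frame, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins with frequency in [f_lo, f_hi]."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            # Direct DFT of bin k; an FFT would give the same value faster.
            xk = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                     for t in range(n))
            energy += abs(xk) ** 2
    return energy


# Assumed example values (not from the patent).
rate, n = 8000, 64
in_band = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(n)]
out_band = [math.sin(2 * math.pi * 125 * t / rate) for t in range(n)]

print(band_energy(in_band, rate, 300, 3400))
print(band_energy(out_band, rate, 300, 3400))
```

A tone inside the assumed band contributes far more energy than a tone of equal amplitude below it, which is the sense in which the measurement is band limited.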

REFERENCES:
patent: 4441203 (1984-04-01), Fleming
patent: 5579431 (1996-11-01), Reaves
patent: 5617508 (1997-04-01), Reaves
