Reexamination Certificate
1998-09-30
2004-02-10
Chawan, Vijay (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S231000, C704S236000
active
06691087
ABSTRACT:
The present invention generally relates to an apparatus and a concomitant method for processing a signal having two or more signal components. More particularly, the present invention detects the presence of a desired signal component, e.g., a speech component, in a signal using a decision function that is adaptively updated.
BACKGROUND OF THE DISCLOSURE
In real world environments, many observed signals are typically composites of a plurality of signal components. For example, if one records an audio signal within a moving vehicle, the measured audio signal may comprise a plurality of signal components, such as audio signals attributed to the tires rolling on the surface of the road, the sound of wind, sounds from other vehicles, speech signals of people within the vehicle and the like. Furthermore, the measured audio signal is non-stationary, since the signal components vary in time as the vehicle is traveling.
In such real world environments, it is often advantageous to detect the presence of a desired signal component, e.g., a speech component in an audio signal. Speech detection has many practical applications, including but not limited to voice or command recognition. Conventional speech detection methods are usually based on discriminating the total or component-wise signal power. For example, the component-wise signal powers are combined into a predefined ad-hoc decision function, which then generates a decision as to whether or not the current frame contains speech.
However, ad-hoc decision functions present several difficulties. First, they often require the adjustment of a threshold, which is often suboptimal for a time-varying Signal-to-Noise Ratio (SNR). Second, it has been noted that many ad-hoc decision functions tend to falsely detect speech during long non-speech periods.
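As a rough illustration of the threshold problem described above, the following minimal sketch applies a fixed threshold to per-frame log power. The function names, the sample frames, and the threshold value of -6.0 are all hypothetical and are not taken from the patent; the sketch only shows how a single fixed threshold can misclassify a frame once the noise level rises.

```python
import math

def frame_log_power(frame):
    """Log of the mean squared amplitude of one frame (a list of samples)."""
    power = sum(x * x for x in frame) / len(frame)
    return math.log(power + 1e-12)  # small floor avoids log(0)

def fixed_threshold_detector(frames, threshold):
    """Ad-hoc decision function: a frame is 'speech' if its log power exceeds a fixed threshold."""
    return [frame_log_power(f) > threshold for f in frames]

# Hypothetical frames: quiet background noise, speech, then louder background noise.
quiet_noise = [0.01, -0.01, 0.01, -0.01]
speech = [0.5, -0.4, 0.6, -0.5]
loud_noise = [0.1, -0.1, 0.1, -0.1]

# The threshold (an assumed value) separates quiet noise from speech,
# but the louder noise frame also exceeds it and is flagged as speech.
decisions = fixed_threshold_detector([quiet_noise, speech, loud_noise], threshold=-6.0)
print(decisions)
```

Raising the threshold would reject the louder noise frame but would also start to miss quieter speech, which is the trade-off that motivates an adaptively updated decision function.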
Therefore, a need exists in the art for detecting the presence of a desired signal component, e.g., a speech component, in a non-stationary signal using a decision function that is adaptively updated.
SUMMARY OF THE INVENTION
The present signal processing system detects the presence of a desired signal component by applying a probabilistic description to the classification and tracking of the various signal components (e.g., desired versus non-desired signal components) in an input signal. Namely, an N-component mixture model (e.g., a dual mixture where N=2) is used, where the model densities capture N signal components, e.g., two signal components having speech and non-speech features that were observed in the past, e.g., in past audio frames. Classification of a new frame is then simply a matter of computing the likelihood that the new frame corresponds to each class. In turn, an optimal threshold can be adaptively generated and updated.
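The dual mixture idea can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented method: it assumes a single scalar feature per frame (for example, log power), Gaussian component densities, and a simple recursive update with an assumed forgetting rate. The class name, the initial parameters, and the feature values are hypothetical.

```python
import math

class DualMixtureDetector:
    """Two-component 1-D Gaussian mixture over a per-frame feature (illustrative assumption)."""

    def __init__(self, means=(-8.0, -2.0), variances=(1.0, 1.0),
                 weights=(0.5, 0.5), rate=0.05):
        self.means = list(means)          # index 0: non-speech, index 1: speech
        self.variances = list(variances)
        self.weights = list(weights)
        self.rate = rate                  # assumed forgetting rate for the online update

    def _density(self, k, x):
        var = self.variances[k]
        return math.exp(-0.5 * (x - self.means[k]) ** 2 / var) / math.sqrt(2 * math.pi * var)

    def posteriors(self, x):
        """Posterior probability of each class given the feature x."""
        joint = [self.weights[k] * self._density(k, x) for k in range(2)]
        total = sum(joint) + 1e-300       # guard against division by zero
        return [j / total for j in joint]

    def step(self, x):
        """Classify one frame, then nudge the model toward it."""
        post = self.posteriors(x)
        is_speech = post[1] > post[0]
        for k in range(2):
            g = self.rate * post[k]       # each component moves in proportion to its posterior
            delta = x - self.means[k]
            self.means[k] += g * delta
            self.variances[k] = max((1 - g) * self.variances[k] + g * delta * delta, 1e-3)
            self.weights[k] = (1 - self.rate) * self.weights[k] + self.rate * post[k]
        return is_speech

# Hypothetical log-power features for six consecutive frames.
features = [-9.0, -8.5, -9.2, -1.5, -2.0, -8.8]
detector = DualMixtureDetector()
decisions = [detector.step(x) for x in features]
print(decisions)
```

In this sketch there is no explicit threshold parameter. The decision boundary is the feature value at which the two weighted densities are equal, so it shifts on its own as the means, variances, and weights follow the incoming frames.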
REFERENCES:
patent: 4837831 (1989-06-01), Gillick et al.
patent: 5598507 (1997-01-01), Kimber et al.
patent: 5799276 (1998-08-01), Komissarchik et al.
patent: 5839105 (1998-11-01), Ostendorf et al.
patent: 5884261 (1999-03-01), de Souza et al.
patent: 5946656 (1999-08-01), Rahim et al.
“Sequential Algorithms for Parameter Estimation Based on the Kullback-Leibler Information Measure”, Weinstein et al., IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, No. 9, Sep. 1990, pp. 1652-1654.
“Frequency Domain Noise Suppression Approaches in Mobile Telephone Systems”, J. Yang, IEEE 1993, pp. II-363-II-366.
“Robust Speech Pulse Detection Using Adaptive Noise Modelling”, N. B. Yoma et al., Electronics Letters, Jul. 18, 1996, vol. 32, No. 15, pp. 1350-1352.
“Perceptual Wavelet-Representation of Speech Signals and its Application to Speech Enhancement”, I. Pinter, Computer Speech and Language (1996) 10, pp. 1-22.
“A New View of the EM Algorithm That Justifies Incremental and Other Variants”, R. M. Neal and G. E. Hinton, pp. 1-11. Feb. 12, 1993.
“The Study of Speech/Pause Detectors for Speech Enhancement Methods”, P. Sovka and P. Pollak, EUROSPEECH'95.
“Cepstral Speech/Pause Detectors,” P. Pollak et al., IEEE Workshop on Nonlinear Signal and Image Processing, 1995.
de Vries Aalbert
Parra Lucas
Burke William J.
Chawan Vijay
Opsasnick Michael N.
Sarnoff Corporation