Speech detection with noise suppression based on principal...

Filed: 1998-10-21
Issued: 2001-05-08
Examiner: Korzuch, William R. (Department: 2641)
Classification: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: For storage or transmission

U.S. classes: C704S204000, C704S233000
Type: Reexamination Certificate
Status: active
Patent number: 06230122

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to electronic speech detection systems, and relates more particularly to a method for suppressing background noise in a speech detection system.
2. Description of the Background Art
Implementing an effective and efficient method for system users to interface with electronic devices is a significant consideration of system designers and manufacturers. Human speech detection is one promising technique that allows a system user to effectively communicate with selected electronic devices, such as digital computer systems. Speech generally consists of one or more spoken utterances which each may include a single word or a series of closely-spaced words forming a phrase or a sentence. In practice, speech detection systems typically determine the endpoints (the beginning and ending points) of a spoken utterance to accurately identify the specific sound data intended for analysis.
Conditions with significant ambient background-noise levels present additional difficulties when implementing a speech detection system. Examples of such noisy conditions may include speech recognition in automobiles or in certain manufacturing facilities. In such user applications, in order to accurately analyze a particular utterance, a speech recognition system may be required to selectively differentiate between a spoken utterance and the ambient background noise.
Referring now to FIG. 1(a), an exemplary waveform diagram for one embodiment of noisy speech 112 is shown. In addition, FIG. 1(b) depicts an exemplary waveform diagram for one embodiment of speech 114 without noise. Similarly, FIG. 1(c) shows an exemplary waveform diagram for one embodiment of noise 116 without speech 114. In practice, noisy speech 112 of FIG. 1(a) is therefore typically comprised of several components, including speech 114 of FIG. 1(b) and noise 116 of FIG. 1(c). In FIGS. 1(a), 1(b), and 1(c), waveforms 112, 114, and 116 are presented for purposes of illustration only. The present invention may readily function and incorporate various other embodiments of noisy speech 112, speech 114, and noise 116.
An important measurement in speech detection systems is the signal-to-noise ratio (SNR) which specifies the amount of noise present in relation to a given signal. For example, the SNR of noisy speech 112 in FIG. 1(a) may be expressed as the ratio of noisy speech 112 divided by noise 116 of FIG. 1(c). Many speech detection systems tend to function unreliably in conditions of high background noise when the SNR drops below an acceptable level. For example, if the SNR of a given speech detection system drops below a certain value (for example, 0 decibels), then the accuracy of the speech detection function may become significantly degraded.
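As a rough illustration of the SNR measurement described above, the following Python sketch expresses the ratio of noisy speech to noise in decibels. The use of mean signal power as the measure, and the function name `snr_db`, are assumptions made for this example; the text does not specify how the ratio is computed.

```python
import math

def snr_db(noisy_speech, noise):
    """Ratio of noisy-speech power to noise power, in decibels.

    Assumption: mean squared amplitude is used as the power measure.
    """
    speech_power = sum(x * x for x in noisy_speech) / len(noisy_speech)
    noise_power = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(speech_power / noise_power)
```

Under this definition, a result of 0 decibels means the noisy speech carries no more power than the noise alone, which is the kind of condition in which detection accuracy is said to degrade.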
Various methods have been proposed for speech enhancement and noise suppression. A spectral subtraction method, due to its simplicity, has been widely used for speech enhancement. Another known method for speech enhancement is Wiener filtering. Inverse filtering based on all-pole models has also been reported as a suitable method for noise suppression. However, the foregoing methods are not entirely satisfactory in certain relevant applications, and thus they may not perform adequately in particular implementations. From the foregoing discussion, it therefore becomes apparent that suppressing ambient background noise to improve the signal-to-noise ratio in a speech detection system is a significant consideration of system designers and manufacturers of speech detection systems.
SUMMARY OF THE INVENTION
In accordance with the present invention, a method is disclosed for suppressing background noise in a speech detection system. In one embodiment, a feature extractor in a speech detector initially receives noisy speech data that is preferably generated by a sound sensor, an amplifier and an analog-to-digital converter. In the preferred embodiment, the speech detector processes the noisy speech data in a series of individual data units called “windows” that each include sub-units called “frames”.
The feature extractor responsively filters the received noisy speech into a predetermined number of frequency sub-bands or channels using a filter bank, and thereby generates filtered channel energy that is provided to a noise suppressor. The filtered channel energy is therefore preferably comprised of a series of discrete channels which the noise suppressor operates on concurrently.
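The filter-bank step can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how the filter bank is built, so the sketch uses a naive discrete Fourier transform and groups the bins into equal-width bands. The function name `channel_energies` and its parameters are hypothetical.

```python
import cmath
import math

def channel_energies(frame, num_channels):
    """Energy in each frequency sub-band (channel) of one frame.

    Assumptions: a naive DFT stands in for the filter bank, and the
    lower half of the spectrum is split into equal-width bands.
    Leftover bins that do not fill a whole band are ignored.
    """
    n = len(frame)
    half = n // 2
    if num_channels < 1 or num_channels > half:
        raise ValueError("num_channels must be between 1 and len(frame) // 2")
    power = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2)
    width = half // num_channels
    return [sum(power[c * width:(c + 1) * width])
            for c in range(num_channels)]
```

Calling `channel_energies` once per frame yields one energy vector per frame, with one entry per channel, which is the form of input the later noise-suppression sketches assume.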
Next, a subspace module in the noise suppressor preferably performs a Karhunen-Loeve transformation (KLT) to generate a KLT subspace that is based on the background noise from the filtered channel energy received from the filter bank. A projection module in the noise suppressor then projects the filtered channel energy onto the KLT subspace previously created by the subspace module to generate projected channel energy.
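The subspace and projection steps might look roughly like the Python sketch below. It is not the patented procedure: the text does not state how many basis vectors are kept or how the transformation is computed. The sketch assumes the KLT basis is taken from the eigenvectors of the covariance of noise-only channel-energy vectors, found here by power iteration with deflation. All function names are hypothetical.

```python
import math

def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def covariance(vectors):
    """Mean vector and covariance matrix of a list of equal-length vectors."""
    count, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / count for i in range(dim)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / count
            for j in range(dim)] for i in range(dim)]
    return mean, cov

def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector, by power iteration.

    Assumption: the start vector is not orthogonal to the leading axis.
    """
    dim = len(matrix)
    vec = [float(i + 1) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    eigval = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))
    return eigval, vec

def klt_basis(noise_vectors, max_axes, tol=1e-12):
    """Up to max_axes principal axes of the background-noise vectors."""
    mean, cov = covariance(noise_vectors)
    dim = len(mean)
    basis = []
    for _ in range(max_axes):
        eigval, vec = leading_eigenpair(cov)
        if eigval <= tol:  # remaining variance is negligible
            break
        basis.append(vec)
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[i][j] - eigval * vec[i] * vec[j] for j in range(dim)]
               for i in range(dim)]
    return mean, basis

def project(vector, mean, basis):
    """Coordinates of a channel-energy vector along each basis axis."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(c * b for c, b in zip(centered, axis)) for axis in basis]
```

In this sketch `klt_basis` would be run on frames known to contain only background noise, and `project` would then be applied to every incoming channel-energy vector. Subtracting the noise mean before projecting is a design choice of the example, not something the text specifies.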
Then, a weighting module in the noise suppressor advantageously calculates individual weighting values for each channel of the projected channel energy. In a first embodiment, the weighting module calculates weighting values whose various channel values are directly proportional to the signal-to-noise ratio (SNR) for the corresponding channel. For example, the weighting values may be equal to the corresponding channel's SNR raised to a selectable exponential power.
In a second embodiment, in order to achieve an implementation of reduced complexity and computational requirements, the weighting module calculates the individual weighting values as being equal to the reciprocal of the background noise for the corresponding channel. The weighting module therefore generates a total noise-suppressed channel energy that is the summation of each channel's projected channel energy value multiplied by that channel's calculated weighting value.
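The two weighting schemes and the final summation can be sketched in Python as follows. The function names are hypothetical, and the per-channel SNR and background-noise values are taken as given inputs, since the text does not describe how they are estimated.

```python
def snr_weights(snr_per_channel, exponent=1.0):
    """First embodiment: each weight is the channel SNR raised to a
    selectable exponential power."""
    return [snr ** exponent for snr in snr_per_channel]

def reciprocal_noise_weights(noise_per_channel):
    """Second embodiment: each weight is the reciprocal of the channel's
    background noise (assumed here to be nonzero)."""
    return [1.0 / noise for noise in noise_per_channel]

def noise_suppressed_energy(projected_energy, weights):
    """Sum of each channel's projected energy times that channel's weight."""
    return sum(e * w for e, w in zip(projected_energy, weights))
```

For example, with projected channel energies `[4.0, 2.0]` and background noise `[2.0, 4.0]`, the second embodiment gives weights `[0.5, 0.25]` and a total noise-suppressed energy of `2.5`. The reciprocal form avoids the exponentiation of the first embodiment, which is consistent with the stated aim of reduced computation.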
An endpoint detector then receives the noise-suppressed channel energy, and responsively detects corresponding speech endpoints. Finally, a recognizer receives the speech endpoints from the endpoint detector, and also receives feature vectors from the feature extractor, and responsively generates a recognition result using the endpoints and the feature vectors between the endpoints. The present invention thus efficiently and effectively suppresses background noise in a speech detection system.
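To make the notion of endpoints concrete, the Python sketch below finds the beginning and ending frames of an utterance from a sequence of noise-suppressed energy values. The text does not describe how the endpoint detector decides, so the fixed threshold, the function name `find_endpoints`, and its parameters are all assumptions of this example.

```python
def find_endpoints(energies, threshold):
    """Return (start, end) frame indices of the first run of frames whose
    energy exceeds the threshold, or None if no frame does.

    Assumption: a single fixed threshold separates speech from noise.
    """
    start = None
    for index, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = index
        elif energy <= threshold and start is not None:
            return (start, index - 1)
    if start is not None:
        return (start, len(energies) - 1)
    return None
```

A recognizer in this sketch would then use only the feature vectors for frames between `start` and `end`, inclusive.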

REFERENCES:
patent: 4592085 (1986-05-01), Watari et al.
patent: 4630304 (1986-12-01), Borth et al.
patent: 4910716 (1990-03-01), Kirlin et al.
patent: 4951266 (1990-08-01), Hsu et al.
patent: 5003601 (1991-03-01), Watari et al.
patent: 5093899 (1992-03-01), Hiraiwa
patent: 5212764 (1993-05-01), Ariyoshi
patent: 5301257 (1994-04-01), Tani
patent: 5485524 (1996-01-01), Kuusama et al.
patent: 5513298 (1996-04-01), Stanford et al.
patent: 5615296 (1997-03-01), Stanford et al.
patent: 5699480 (1997-12-01), Martin
patent: 5715367 (1998-02-01), Gillick et al.
patent: 5806025 (1998-09-01), Vis et al.
Haykin, Simon, “Neural Networks,” 1994, pp. 363-370.
Ephraim et al., “A Signal Subspace Approach For Speech Enhancement,” IEEE Trans. Speech and Audio Proc., vol. 3, iss. 4, Jul. 1995, pp. 251-266.
Lee et al., “Image Enhancement Based On Signal Subspace Approach,” IEEE Trans. Image Proc., vol. 8, iss. 8, Aug. 1999, pp. 1129-1134.
Ephraim et al., “A Spectrally-Based Signal Subspace Approach For Speech Enhancement,” 1995 Int. Conf. Acoust. Speech Sig. Proc., ICASSP-95, vol. 1, May 1995, pp. 804-807.

Affiliated with

Amador-Hernandez Mariscela, Inventor
Tanaka Miyuki, Inventor
Wu Duanpei, Inventor

Also associated with

Koerner Gregory J., Attorney
Korzuch William R., Examiner
Simon & Koerner LLP, Law Firm
Sony Corporation, Corporate Assignee
Storm Donald L., Examiner
