Reexamination Certificate
1999-11-23
2004-06-15
McFadden, Susan (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S234000, C704S226000, C704S244000
active
06751588
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to electronic speech recognition systems, and relates more particularly to a method for performing microphone conversions in a speech recognition system.
2. Description of the Background Art
Implementing an effective and efficient method for system users to interface with electronic devices is a significant consideration of system designers and manufacturers. Automatic speech recognition is one promising technique that allows a system user to effectively communicate with selected electronic devices, such as digital computer systems. Speech typically consists of one or more spoken utterances which each may include a single word or a series of closely-spaced words forming a phrase or a sentence.
An automatic speech recognizer typically builds a comparison database for performing speech recognition when a potential user “trains” the recognizer by providing a set of sample speech. Speech recognizers tend to degrade significantly in performance when a mismatch exists between training conditions and actual operating conditions. Such a mismatch may result from various types of acoustic distortion. One source of acoustic distortion is the convolutive distortion that arises when different microphones are used during the training process and the actual speech recognition process.
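As general signal-processing background (not stated in the text above), a microphone is commonly modeled as a linear filter, which is why its effect is called convolutive. The symbols below are assumptions introduced for illustration: x(t) is the speech reaching the microphone, h(t) is the microphone's impulse response, and y(t) is the recorded signal. Under that model, the distortion is multiplicative in the frequency domain and additive in the log-energy domain:

```latex
y(t) = (x \ast h)(t)
\quad\Longrightarrow\quad
Y(f) = X(f)\,H(f)
\quad\Longrightarrow\quad
\log |Y(f)|^{2} = \log |X(f)|^{2} + \log |H(f)|^{2}
```

If the model holds, two microphones with responses H_1(f) and H_2(f) recording the same speech differ, in log energy, by a term that depends only on the microphones and not on the speech.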
Referring now to FIG. 1(a), an exemplary waveform diagram for one embodiment of speech 112 recorded with an original training microphone is shown. In addition, FIG. 1(b) depicts an exemplary waveform diagram for one embodiment of speech 114 recorded with a final microphone used in the actual speech recognition process. In practice, speech 112 of FIG. 1(a) and speech 114 of FIG. 1(b) typically exhibit mismatched characteristics, even when recording an identical utterance. This mismatch typically results in significantly degraded performance of a speech recognizer. In FIGS. 1(a) and 1(b), waveforms 112 and 114 are presented for purposes of illustration only. A speech recognition process may readily incorporate various other embodiments of speech waveforms.
From the foregoing discussion, it therefore becomes apparent that compensating for various different microphones is a significant consideration of designers and manufacturers of contemporary speech recognition systems.
SUMMARY OF THE INVENTION
In accordance with the present invention, a method is disclosed for performing microphone conversions in a speech recognition system. In one embodiment of the present invention, a speech module preferably first captures an input signal with an original microphone and simultaneously captures the same input signal with a final target microphone. In certain embodiments, the foregoing two recorded versions of the same input signal may be stored as speech data in a memory device.
The speech module preferably then accesses the recorded input signals using a feature extractor that separately processes the recorded input signals as recorded by the original microphone, and also as recorded by the final target microphone. A characterization module may preferably then perform a characterization process by analyzing the two versions of the same recorded input signal, and then responsively generating characterization values corresponding to the original microphone and the final microphone.
In certain embodiments, the characterization module may perform the foregoing characterization process by accessing the recorded input data as it is processed by the feature extractor in a frequency-energy domain following a fast Fourier transform procedure. In certain other embodiments, the characterization module may perform the foregoing characterization process further downstream by accessing the recorded input data as it is processed by the feature extractor in a cepstral domain following a frequency cosine transform process.
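The text does not give a formula for the characterization values, so the following is only a minimal sketch of one plausible frequency-energy-domain reading: for each frequency bin, average the log-energy difference between the final-microphone and original-microphone recordings of the same frames. The function names `frame_energies` and `characterize`, the `floor` parameter, and the use of a plain discrete Fourier transform in place of an optimized FFT are all assumptions made for illustration.

```python
import cmath
import math


def frame_energies(frame):
    """Return the energy |X[k]|^2 of each non-negative-frequency DFT bin."""
    n = len(frame)
    energies = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        energies.append(abs(coeff) ** 2)
    return energies


def characterize(original_frames, final_frames, floor=1e-12):
    """Average per-bin log-energy difference (final minus original).

    Hypothetical characterization: both inputs are lists of equal-length
    sample frames, paired so that frame i of each list holds the same
    stretch of the same utterance as captured by each microphone.
    """
    if not original_frames or len(original_frames) != len(final_frames):
        raise ValueError("need the same non-zero number of frames from both microphones")
    n_bins = len(original_frames[0]) // 2 + 1
    totals = [0.0] * n_bins
    for orig, final in zip(original_frames, final_frames):
        e_orig = frame_energies(orig)
        e_final = frame_energies(final)
        for k in range(n_bins):
            # floor keeps the logarithm finite for empty bins
            totals[k] += math.log(e_final[k] + floor) - math.log(e_orig[k] + floor)
    return [total / len(original_frames) for total in totals]
```

A cepstral-domain variant would apply the same averaging to cepstral coefficients instead of log energies; which domain is preferable is left open by the text.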
The speech module preferably then utilizes the feature extractor to process an original training database that was initially recorded using the original microphone. Next, a conversion module preferably may convert the original training database into a final training database by utilizing the characterization values that were previously generated by the characterization module.
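As a rough illustration of the conversion step, the sketch below assumes that the characterization values are additive per-channel offsets in a log-energy or cepstral feature domain, where a fixed microphone response appears as an additive term. The text does not confirm this form, and the function name `convert_training_features` is hypothetical.

```python
def convert_training_features(training_frames, characterization):
    """Shift each log-domain feature vector by per-channel offsets.

    Assumption: each frame is a list of log-energy (or cepstral) features
    extracted from the original-microphone training database, and
    characterization holds one additive offset per feature channel.
    """
    converted = []
    for frame in training_frames:
        if len(frame) != len(characterization):
            raise ValueError("feature vector and characterization lengths differ")
        converted.append(
            [value + offset for value, offset in zip(frame, characterization)]
        )
    return converted
```

Under these assumptions the original recordings never need to be re-captured with the final microphone; only the extracted features are shifted before the recognizer is trained.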
A recognizer training program may then utilize the final training database to train a recognizer in the speech module. Finally, the speech module may advantageously utilize the trained recognizer in a speech recognition system that utilizes the final microphone to capture input data for optimized speech recognition, in accordance with the present invention. The present invention thus efficiently and effectively performs microphone conversions in a speech recognition system.
Menendez-Pidal Xavier
Tanaka Miyuki
Wu Duanpei
Koerner Gregory J.
McFadden Susan
Simon & Koerner LLP