Optimized local feature extraction for automatic speech...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S249000, C704S236000

active

06513004

ABSTRACT:

BACKGROUND AND SUMMARY OF THE INVENTION
The present invention relates generally to speech recognition systems and more particularly to a wavelet-based system for extracting features for recognition that are optimized for different classes of sounds (e.g. fricatives, plosives, other consonants, vowels, and the like).
When analyzing a speech signal, the first step is to extract features that represent the useful information characterizing the signal. Conventionally, this feature extraction process involves chopping the speech signal into overlapping windows of a predetermined frame size and then computing the Fast Fourier Transform (FFT) of each signal window. A finite set of cepstral coefficients is then extracted by discarding higher order terms in the Fourier transform of the log spectrum. The resulting cepstral coefficients may then be used to construct speech models, typically Hidden Markov Models.
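The conventional pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the patent: the function names, the Hamming window, the frame size of 256 samples, the hop of 128 samples, and the choice of 12 coefficients are all assumptions made for the example, and a naive DFT stands in for the FFT so that the block needs only the standard library.

```python
import cmath
import math


def frames(signal, frame_size, hop):
    """Split a signal into overlapping frames (hop < frame_size gives overlap)."""
    return [signal[i:i + frame_size]
            for i in range(0, len(signal) - frame_size + 1, hop)]


def dft(x):
    """Naive O(N^2) discrete Fourier transform; an FFT yields the same values faster."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def cepstral_coefficients(frame, n_coeffs):
    """Real cepstrum of one frame, truncated to its first n_coeffs terms."""
    n = len(frame)
    # Hamming window (an assumption here) to reduce leakage at the frame edges.
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = dft(windowed)
    # A small floor avoids log(0) for empty bins.
    log_mag = [math.log(abs(c) + 1e-12) for c in spectrum]
    # The log-magnitude spectrum of a real frame is real and symmetric, so its
    # inverse transform equals its forward transform divided by n.
    cepstrum = [c.real / n for c in dft(log_mag)]
    # Discarding the higher order terms leaves a finite feature vector.
    return cepstrum[:n_coeffs]


# Toy input: 0.1 s of a 200 Hz tone sampled at 8 kHz (hypothetical values).
rate = 8000
signal = [math.sin(2 * math.pi * 200 * t / rate) for t in range(800)]
features = [cepstral_coefficients(f, 12) for f in frames(signal, 256, 128)]
```

Each entry of `features` is one feature vector per frame, which is the kind of input a Hidden Markov Model trainer would consume.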
A significant disadvantage of conventional FFT analysis is its fixed time-frequency resolution. When analyzing speech, it would be desirable to use a plurality of different time-frequency resolutions, to better spot the non-linearly distributed speech information in the time-frequency plane. In other words, it would be desirable to provide sharper time resolution for rapidly changing fricatives or other consonants while providing less time resolution for slower changing structures such as vowels. Unfortunately, current technology makes this difficult to achieve. While it is possible to construct and use in parallel a set of recognizers that are each designed for a particular speech feature, such a solution carries a heavy computational burden.
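The fixed resolution follows from simple arithmetic: a window of N samples at sample rate fs spans N/fs seconds and produces frequency bins spaced fs/N hertz apart, so the product of the two is always 1. The short sketch below tabulates this trade-off; the function name `resolution` and the 16 kHz sample rate are assumptions chosen for illustration.

```python
def resolution(frame_size, sample_rate):
    """Return (window duration in seconds, bin spacing in Hz) for one frame size."""
    time_span = frame_size / sample_rate
    bin_spacing = sample_rate / frame_size
    return time_span, bin_spacing


# Hypothetical 16 kHz sample rate; once a frame size is chosen, every sound
# in the signal is analyzed with the same pair of resolutions.
for n in (64, 256, 1024):
    t, f = resolution(n, 16000)
    print(f"{n:5d} samples: {t * 1000:.1f} ms window, {f:.1f} Hz bin spacing")
```

With 256 samples, for instance, the window lasts 16 ms and the bins are 62.5 Hz apart. Shortening the window to follow a plosive more closely coarsens the frequency resolution by the same factor.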
The present invention employs wavelet technology that provides a single analytical technique covering a wide assortment of different classes of sounds. Using the wavelet technology of the invention, a single recognizer can be constructed and used in which the speech models have already been optimized for different classes of sounds through a unique feature extraction process. Thus the recognizer of the invention is optimized for different classes of sounds without increasing the complexity of the recognition analysis process.
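To illustrate how a wavelet analysis offers several time-frequency resolutions at once, the sketch below runs a plain Haar wavelet decomposition. It is a generic textbook transform chosen for brevity and is not the feature extraction process of the invention, which this summary does not detail; the function names and the toy signal are hypothetical.

```python
import math


def haar_step(x):
    """One Haar analysis step: scaled pairwise sums (low band) and differences (high band)."""
    s = 1 / math.sqrt(2)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_dwt(x, levels):
    """Return the detail bands, finest first, followed by the final approximation."""
    bands = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


# Toy signal: a slow ramp with a sudden click at sample 5.
signal = [0.1 * i for i in range(16)]
signal[5] += 4.0
bands = haar_dwt(signal, 3)

# The finest band has 8 coefficients, each covering only 2 samples, so the
# click is localized sharply in time; the coarsest bands have 2 coefficients,
# each covering 8 samples, which suits slowly varying structure.
click_position = max(range(len(bands[0])), key=lambda i: abs(bands[0][i]))
```

Here `click_position` is the index of the sample pair that contains the click, while the coarser bands summarize the ramp over longer spans.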
For a more complete understanding of the invention, its objects and advantages, refer to the following specification and to the accompanying drawings.


REFERENCES:
patent: 4805219 (1989-02-01), Baker et al.
patent: 5321776 (1994-06-01), Shapiro
patent: 5715367 (1998-02-01), Gillick et al.
patent: 5852806 (1998-12-01), Johnston et al.
patent: 5898798 (1999-04-01), Bouchard et al.
patent: 5926791 (1999-07-01), Ogata et al.
patent: 6058205 (2000-05-01), Bahl et al.
patent: 6289131 (2001-09-01), Ishikawa
patent: 0 831 461 (1998-03-01), None
Long et al 1 (“Discriminant Wavelet Basis Construction for Speech Recognition”, 5th Int'l Conference on Spoken Language Processing, ©Nov. 1998).*
Long et al2 (“Wavelet Based Feature Extraction For Phoneme Recognition”, Fourth International Conference on Spoken Language Proceedings, Oct. 1996, pp. 264-267 vol. 1).*
Kryse et al.; “A New Noise-Robust Subband Front-End and its Comparison to PLP”; Keystone, Colorado; ASRU, 1999; Dec. 12-15, 1999. Entire Document.
Long et al.; “Discriminant Wavelet Basis Construction for Speech Recognition”; ICSLP, 1998; Nov. 30, 1998-Dec. 4, 1998; Sydney, Australia; pp. 1047-1049; entire document.
Chang et al.; “Speech Feature Extracted from Adaptive Wavelet for Speech Recognition”; Electronics Letters, IEE Stevenage, GB, vol. 34, No. 23; Nov. 12, 1998; pp. 2211-2213; entire document.
Long et al.; “Wavelet Based Feature Extraction for Phoneme Recognition”; Proceedings ICSLP, 1996; Fourth International Conference on Spoken Language Processing; Philadelphia, Pennsylvania; Oct. 3-6, 1996; pp. 264-267; vol. 1; entire document.
Tan et al.; “The Use of Wavelet Transforms in Phoneme Recognition”; Proceedings ICSLP, 1996; Fourth International Conference on Spoken Language Processing; Philadelphia, Pennsylvania; Oct. 3-6, 1996; pp. 2431-2434; vol. 4; entire document.
Erzin et al.; “Subband Decomposition Based Speech Recognition in the Presence of Car Noise”; Turkish Journal Electrical Engineering and Computer Sciences, Elektrik; 1997; Sci. & Tech. Res. Council; Turkey, Turkey; vol. 5, No. 3; pp. 297-305.
C.J. Long and S. Datta, Department of Electrical and Electronic Engineering, Loughborough University, “Discriminant Wavelet Basis Construction for Speech Recognition”. Nov. '98.
