1992-09-01
1994-12-27
MacDonald, Allen R.
395 24, 395 26, G10L 900
Patent
active
053773020
ABSTRACT:
A pattern recognition system particularly useful for recognizing speech or handwriting. An input signal is first filtered by a two-stage filter bank in which the outputs of the first stage are fed forward to the second stage for a significant number of the filters, and the outputs of the second stage are fed back to the first stage for a significant number of the filters. Such feedback enhances the signal-to-noise ratio and resembles the coupling between the different sections of the basilar membrane of the cochlea. The output of the filter bank is a two-dimensional frequency-time representation of the original signal. A second set of filters, which takes two-dimensional signals as input, detects the presence of elementary tonotopic features such as the onset, rise, fall and frequency of any significant tones in a speech signal. A third set of filters detects any contrasts in the elementary features at various levels of resolution. After such filtering, a neural network is employed to learn patterns formed from the multi-resolution contrasts in the identified features, so that the system recognizes symbols from an input signal that is continuous in time. In the case of speech, the system recognizes continuous speech in a speaker-independent manner and is also tolerant of noise.
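The second and third filtering steps described in the abstract can be illustrated with a toy sketch. This is not the patented implementation. Every function name, the rectified frame-difference "rise" detector, the block-averaging scheme and the adjacent-channel contrast are assumptions chosen only to show the idea of computing feature contrasts at more than one resolution from a frequency-time grid. The feedback filter bank and the neural network are omitted.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: rows are frequency channels, columns are time frames.
Grid = List[List[float]]


def temporal_rise(spec: Grid) -> Grid:
    """Half-wave rectified frame-to-frame increase per channel (a crude onset/rise cue)."""
    return [[max(row[t] - row[t - 1], 0.0) if t > 0 else 0.0
             for t in range(len(row))] for row in spec]


def downsample(grid: Grid, factor: int) -> Grid:
    """Average non-overlapping blocks of `factor` frames to get a coarser time resolution."""
    return [[sum(row[i:i + factor]) / factor
             for i in range(0, len(row) - factor + 1, factor)] for row in grid]


def channel_contrast(grid: Grid) -> Grid:
    """Difference between each pair of adjacent frequency channels at every frame."""
    return [[upper - lower for lower, upper in zip(grid[k], grid[k + 1])]
            for k in range(len(grid) - 1)]


def multires_contrasts(spec: Grid, factors: Sequence[int] = (1, 2)) -> Dict[int, Grid]:
    """Contrast of the rise feature at each assumed resolution factor."""
    rise = temporal_rise(spec)
    return {f: channel_contrast(downsample(rise, f)) for f in factors}


if __name__ == "__main__":
    # Two channels, four frames: a tone starts at frame 2 in channel 0 and frame 1 in channel 1.
    toy = [[0.0, 0.0, 1.0, 1.0],
           [0.0, 1.0, 1.0, 0.0]]
    for factor, contrast in multires_contrasts(toy).items():
        print(factor, contrast)
```

In a system of the kind the abstract describes, grids like these would be the input patterns presented to the neural network; here they are simply printed.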
REFERENCES:
patent: 4422459 (1983-12-01), Simson
patent: 4536844 (1985-08-01), Lyon
patent: 4905285 (1990-02-01), Allen
patent: 5058179 (1991-10-01), Denker et al.
patent: 5067164 (1991-11-01), Denker et al.
patent: 5105468 (1992-04-01), Guyon et al.
Chong et al. "Classification and Regression Tree Neural Networks For Automatic Speech Recognition", IEEE, Jul. 1990, pp. 187-190.
Le Cun et al. "Handwritten Digit Recognition: Applications of Neural Network Chips and Automatic Learning", IEEE, Nov. 1989, pp. 41-46.
"Fundamentals of Hearing--An Introduction," by Yost et al., Holt, Rinehart and Winston, Second Edition, Chapter 6, pp. 52-70, New York, N.Y., 1977/12.
"Exploring the Space-Time Structure at the Output of a Cochlear Model," by Monderer, Columbia University, 1988, pp. 24-62.
"A Theory for Multiresolution Signal Decomposition: The Wavelet Representation," by S. Mallat, IEEE Trans. on Patt. Anal. and Mach. Intell., vol. 11, No. 7, pp. 674-693, Jul. 1989.
"An Analog Neural Network Processor and its Application to High-Speed Character Recognition," by Boser et al., IEEE, pp. I415-I420, Jul. 1991.
"Modularity and Scaling in Large Phonemic Neural Networks," by Waibel et al., IEEE Trans. on Acoustics, Speech, and Signal Proc., vol. 37, No. 12, pp. 1888-1898, Dec. 1989.
"The Meta-Pi Network: Connectionist Rapid Adaptation for High-Performance Multi-Speaker Phoneme Recognition," by Hampshire, II et al., IEEE, pp. 165-168, 1990.
"Frequency-Time-Shift-Invariant for Robust Continuous Speech Recognition," by Sawai, IEEE, pp. 45-48, 1991.
"Neocognitron: A Hierarchical Neural Network Capable of Visual Pattern Recognition," by Fukushima, Neural Networks, vol. 1, pp. 119-130, 1988.
"Handwritten Digit Recognition with a Back-Propagation Network," by Le Cun et al., Advances in Neural Information Processing Systems 2, Morgan Kaufmann Publishers, San Mateo, Calif., pp. 396-404, 1990.
"Methods for Enhancing Neural Network Handwritten Character Recognition," by Garris et al., IEEE, pp. 1695-1700, 1991.
"Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression," by Daugman, IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 36, No. 7, pp. 1169-1179, Jul. 1988.
"Uncertainty Relation for Resolution in Space, Spatial Frequency, and Orientation Optimized by Two-Dimensional Visual Cortical Filters," by Daugman, J. Opt. Soc. Am., vol. 2, No. 7, pp. 1160-1169, Jul. 1985.
"Readings in Speech Recognition," by Waibel et al., Morgan Kaufmann Publishers Inc., San Mateo, Calif., 1990, pp. 1-5.
"Voice and Speech Processing," by Parsons, McGraw-Hill Book Company, 1987, pp. 291-292.
"Continuous-Time Temporal Back-Propagation," by Day et al., IEEE, pp. I195-I1100, 1991.
"Connectionist Approaches," Chapter 7, pp. 371-392, 1989, Massachusetts Institute of Technology.
"Method for Computing Motion in a Two-Dimensional Cochlear Model," by Sondhi, J. Acoust. Soc. Am., vol. 63, No. 5, pp. 1468-1477, May 1978.
"A Computational Cochlear Nonlinear Preprocessing Model with Adaptive Q Circuits," by Hirahara et al., IEEE, pp. 496-499, 1989.
"A Computationally Efficient Basilar-Membrane Model," by Strube, pp. 207-214, Federal Republic of Germany, 1985, vol. 96.
"Implementation of Nonlinear Wave-Digital-Filter Cochlear Model," by Friedman, IEEE, pp. 397-400, 1990.
"DARPA Neural Network Study," Oct. 1987-Feb. 1988, pp. 111-156.
"Image Coding Using Lattice Vector Quantization of Wavelet Coefficients," by Antonini et al., IEEE, 1991, pp. 2273-2276.
"A Complete Parametrization of 2D Nonseparable Orthonormal Wavelets," by Basu et al., Proceedings IEEE-SP, Time-Frequency and Time-Scale Analysis, 1992, pp. 55-58.
"Character Recognition with Selective Attention," by Fukushima et al., IEEE, pp. I593-I598, 1991.
"TDNN-LR Continuous Speech Recognition System Using Adaptive Incremental TDNN Training," by Inui-Dani et al., IEEE, pp. 53-56, 1991.
Doerrler, Michelle
MacDonald, Allen R.
Monowave Corporation L.P.
Profile ID: LFUS-PAI-O-924916