Method for additive and convolutional noise adaptation in...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

: 2000-07-31
: 2004-02-10
: Abebe, Daniel (Department: 2655)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Recognition

: C704S244000
: Reexamination Certificate
: active
: 06691091
: ABSTRACT:

BACKGROUND AND SUMMARY OF THE INVENTION
The present invention relates generally to automatic speech recognition systems. More particularly, the invention relates to techniques for adapting the recognizer to perform better in the presence of noise.
Current automatic speech recognition systems perform reasonably well in laboratory conditions, but degrade rapidly when used in real world applications. One of the important factors influencing recognizer performance in real world applications is the presence of environmental noise that corrupts the speech signal. A number of methods, such as spectral subtraction or parallel model combination, have been developed to address the noise problem. However, these solutions are either too limited or too computationally expensive.
Recently, a Jacobian adaptation method has been proposed to deal with additive noise, where the noise changes from noise A to noise B. For example, U.S. Pat. No. 6,026,359 to Yamaguchi describes such a scheme for model adaptation in pattern recognition, based on storing Jacobian matrices of a Taylor expansion that expresses model parameters. However, for this method to perform well, noise A and noise B must be close to one another in character and level. For example, the Jacobian adaptation technique is likely to work well where noise A is measured within the passenger compartment of a given vehicle travelling on a smooth road at 30 miles per hour, and where noise B is of a similar character, such as the noise measured inside the same vehicle on the same road travelling at 45 miles per hour.
The known Jacobian adaptation technique begins to fail where noise A and B lie farther apart from one another, such as where noise A is measured inside the vehicle described above at 30 miles per hour and noise B is measured in the vehicle with windows down or at 60 miles per hour.
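The behavior described above can be illustrated with a minimal sketch of first-order Jacobian adaptation. It assumes the commonly used formulation in which the cepstrum is the DCT of the log spectrum and the Jacobian is F diag(N/(S+N)) F^-1; the toy spectra, the function names and the error measure are illustrative assumptions, not taken from the patent text.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix F; its inverse is its transpose."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
             for i in range(n)] for k in range(n)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def cepstrum(spectrum, F):
    """Cepstrum = DCT of the log spectrum."""
    return matvec(F, [math.log(x) for x in spectrum])

def jacobian(speech, noise, F):
    """J = F diag(N / (S + N)) F^T, the derivative of C(S+N) w.r.t. C(N)."""
    n = len(speech)
    w = [nz / (s + nz) for s, nz in zip(speech, noise)]
    return [[sum(F[i][k] * w[k] * F[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

def adapt(c_noisy_a, J, c_noise_a, c_noise_b):
    """First-order update: C(S+N_B) ~ C(S+N_A) + J (C(N_B) - C(N_A))."""
    delta = [b - a for a, b in zip(c_noise_a, c_noise_b)]
    return [c + d for c, d in zip(c_noisy_a, matvec(J, delta))]

def adaptation_error(speech, noise_a, noise_b):
    """Squared error between the adapted and the exactly recomputed cepstrum."""
    F = dct_matrix(len(speech))
    noisy_a = cepstrum([s + n for s, n in zip(speech, noise_a)], F)
    exact_b = cepstrum([s + n for s, n in zip(speech, noise_b)], F)
    approx_b = adapt(noisy_a, jacobian(speech, noise_a, F),
                     cepstrum(noise_a, F), cepstrum(noise_b, F))
    return sum((x - y) ** 2 for x, y in zip(exact_b, approx_b))

speech = [10.0, 8.0, 6.0, 4.0]   # toy power spectrum (illustrative values)
noise_a = [1.0, 1.0, 1.0, 1.0]   # noise A, seen at training time
near = adaptation_error(speech, noise_a, [1.1, 1.05, 1.0, 0.95])
far = adaptation_error(speech, noise_a, [10.0, 10.0, 10.0, 10.0])
print(near, far)  # compare the error for a nearby and a distant noise B
```

Because the update is a first-order Taylor approximation around noise A, its error grows as noise B moves away from the expansion point, which is the failure mode the paragraph above describes.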
This shortcoming of the proposed Jacobian noise adaptation method limits its usefulness in many practical applications because it is often difficult to anticipate at training time the noise that may be present at testing time (when the system is in use). Moreover, improvements to Jacobian noise adaptation techniques are of limited use in many applications because their computational expense (processing time, memory requirements, or both) makes them impractical.
The present invention addresses the foregoing shortcoming. Instead of using Jacobian matrices, the invention uses transformed matrices, which resemble Jacobian matrices in form but comprise different values. The transformed matrices compensate for the fact that the respective noises at training time and at recognition time may be far apart. The presently preferred embodiment of the inventive method effects a linear or non-linear transformation of the Jacobian matrices using an α-adaptation parameter to develop the transformed matrices. The transformation process can alternatively be effected through other linear or non-linear transformation means, such as using a neural network or other artificial intelligence mechanism. To speed computation, the resulting transformed matrices may be reduced through a dimensionality reduction technique such as principal component analysis.
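The summary does not give the formula for the transformation, so the following is only a sketch under a stated assumption: that the α parameter scales the training noise inside the diagonal term of the Jacobian, giving F diag(αN/(S+αN)) F^-1. The function names, the toy spectra and the value α = 3 are hypothetical choices for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix F; its inverse is its transpose."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
             for i in range(n)] for k in range(n)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def cepstrum(spectrum, F):
    """Cepstrum = DCT of the log spectrum."""
    return matvec(F, [math.log(x) for x in spectrum])

def alpha_jacobian(speech, noise, alpha, F):
    """ASSUMED form: F diag(alpha*N / (S + alpha*N)) F^T.

    With alpha = 1 this reduces to the plain Jacobian F diag(N/(S+N)) F^T.
    """
    n = len(speech)
    w = [alpha * nz / (s + alpha * nz) for s, nz in zip(speech, noise)]
    return [[sum(F[i][k] * w[k] * F[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

def adaptation_error(speech, noise_a, noise_b, alpha):
    """Squared error of the first-order update against the exact cepstrum."""
    F = dct_matrix(len(speech))
    noisy_a = cepstrum([s + n for s, n in zip(speech, noise_a)], F)
    exact_b = cepstrum([s + n for s, n in zip(speech, noise_b)], F)
    delta = [b - a for a, b in
             zip(cepstrum(noise_a, F), cepstrum(noise_b, F))]
    shift = matvec(alpha_jacobian(speech, noise_a, alpha, F), delta)
    return sum((e - (c + d)) ** 2
               for e, c, d in zip(exact_b, noisy_a, shift))

speech = [10.0, 8.0, 6.0, 4.0]   # toy power spectrum (illustrative values)
noise_a = [1.0] * 4              # training-time noise
noise_b = [10.0] * 4             # recognition-time noise, far from noise_a
err_plain = adaptation_error(speech, noise_a, noise_b, alpha=1.0)
err_alpha = adaptation_error(speech, noise_a, noise_b, alpha=3.0)
print(err_plain, err_alpha)  # compare plain and alpha-transformed matrices
```

In this toy setting the transformed matrices keep the same form as the Jacobian but weight the noise change more heavily, which is one way the different values could compensate for a large gap between training and recognition noise.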
Another concern relates to compensation of convolutional noise. Specifically, convolutional noise can be distinguished from the above-discussed additive noise in that convolutional noise results from the speech channel. For example, changes in the distance from the speaker to the microphone, microphone imperfections, and even the telephone line over which the signal is transmitted all contribute to convolutional noise. Additive noise, on the other hand, typically results from the environment in which the speaker is speaking.
An important characteristic of convolutional noise is that it is multiplicative with the speech signal in the spectral domain, whereas additive noise is additive in the spectral domain. This causes particular difficulties with respect to noise compensation. In fact, most conventional approaches deal either with convolutional noise or additive noise, but not both.
The above advantages of α-Jacobian (and Jacobian) adaptation can be applied to joint compensation of additive and convolutional noise. The present invention provides a method and system for performing noise adaptation primarily in the cepstral domain. This is significant because convolutional noise is additive in this domain. The method includes the step of generating a reference model based on a training speech signal. The reference model is then compensated for both additive and convolutional noise in the cepstral domain.
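The claim that convolutional noise becomes additive in the cepstral domain can be checked numerically. The sketch below assumes the cepstrum is computed as the DCT of the log spectrum; the speech and channel values are made up for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix F."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
             for i in range(n)] for k in range(n)]

def cepstrum(spectrum, F):
    """Cepstrum = DCT of the log spectrum."""
    logs = [math.log(x) for x in spectrum]
    return [sum(a * b for a, b in zip(row, logs)) for row in F]

F = dct_matrix(4)
speech = [10.0, 8.0, 6.0, 4.0]    # toy speech spectrum
channel = [0.5, 0.8, 1.2, 2.0]    # toy channel response (convolutional noise)

# In the spectral domain the channel multiplies the speech, bin by bin.
filtered = [h * s for h, s in zip(channel, speech)]

# In the cepstral domain the same effect is a simple vector addition.
lhs = cepstrum(filtered, F)
rhs = [c + s for c, s in zip(cepstrum(channel, F), cepstrum(speech, F))]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)  # difference between the two sides, up to rounding
```

The logarithm turns the product into a sum, and the DCT is linear, so the channel contributes a fixed cepstral offset that can be added to model parameters directly.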
One approach to compensating the reference model for convolutional noise includes the step of estimating a convolutional bias between the training speech signal and a target speech signal. The estimated convolutional bias is then transformed with a channel adaptation matrix. The method further provides for adding the transformed convolutional bias to the reference model in the cepstral domain. Thus, the present invention transforms and adapts the reference models as opposed to the signals themselves. In general, the compensation for additive and convolutional noise is done on the means of the Gaussian distributions.
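The three steps above (estimate a bias, transform it, add it to the model) can be sketched as follows. The summary does not specify the bias estimator or the channel adaptation matrix, so two assumptions are made here and labeled in the code: the bias is taken as the difference of average cepstra, and the channel adaptation matrix is taken as the identity minus a noise adaptation matrix J. All names and numbers are hypothetical.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def mean_vector(frames):
    """Average of a list of equal-length cepstral vectors."""
    n = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(n)]

def estimate_bias(target_frames, training_frames):
    """ASSUMED estimator: mean target cepstrum minus mean training cepstrum."""
    t = mean_vector(target_frames)
    r = mean_vector(training_frames)
    return [a - b for a, b in zip(t, r)]

def channel_matrix(J):
    """ASSUMED channel adaptation matrix: identity minus noise matrix J."""
    n = len(J)
    return [[(1.0 if i == j else 0.0) - J[i][j] for j in range(n)]
            for i in range(n)]

def compensate_means(means, bias, Jc):
    """Add the transformed bias to every Gaussian mean of the reference model."""
    shift = matvec(Jc, bias)
    return [[m + s for m, s in zip(mean, shift)] for mean in means]

# Toy two-dimensional example with made-up values.
J = [[0.25, 0.0], [0.0, 0.5]]                 # hypothetical noise adaptation matrix
training = [[0.0, 1.0], [2.0, 1.0]]           # training speech cepstra
target = [[1.0, 2.0], [3.0, 4.0]]             # target speech cepstra
model_means = [[0.0, 0.0], [1.0, 1.0]]        # reference model Gaussian means

bias = estimate_bias(target, training)
adapted = compensate_means(model_means, bias, channel_matrix(J))
print(bias, adapted)
```

Note that only the model means are shifted; the incoming signal is left untouched, in line with the statement that the models are adapted rather than the signals themselves.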
In another aspect of the invention, a noise adaptation system for a speech recognition system has a reference model generator, an additive noise module and a convolutional noise module. The reference model generator generates a reference model based on a training speech signal. The additive noise module is coupled to the reference model generator and compensates the reference model for additive noise in the cepstral domain. The convolutional noise module is also coupled to the reference model generator and compensates the reference model for convolutional noise in the cepstral domain.
For a more complete understanding of the invention, its objects and advantages, refer to the following specification and to the accompanying drawings.

REFERENCES:
patent: 6026359 (2000-02-01), Yamaguchi et al.
S. Sagayama, Y. Yamaguchi and S. Takahashi, “Jacobian Adaptation of Noisy Speech Models,” NTT Human Interface Laboratories, Musashino-shi, Tokyo 180 Japan.
Shigeki Sagayama, Yoshikazu Yamaguchi, Satoshi Takahashi, and Jun-ichi Takahashi, “Jacobian Approach to Fast Acoustic Model Adaptation,” NTT Human Interface Laboratories, 1-1, Hikari-no-Oka, Yokosuka-shi, Kanagawa, 239 Japan.
M.J.F. Gales, “Predictive Model-Based Compensation Schemes for Robust Speech Recognition,” Elsevier, Speech Communication 25 (1998) 49-74.
Pedro J. Moreno, Bhiksha Raj and Richard M. Stern, “A Vector Taylor Series Approach for Environment-Independent Speech Recognition,” 1995 IEEE, pp. 733-736.
Cerisara et al.; “Environmental Adaptation Based on First Order Approximation”; Panasonic Speech Technology Laboratory; Santa Barbara, California.
Sagayama et al.; “Jacobian Approach to Fast Acoustic Model Adaptation”; NTT Human Interface Laboratories; IEEE, 1997; pp. 835-838.
Leggetter et al.; “Flexible Speaker Adaptation Using Maximum Likelihood Linear Regression”; Cambridge University Engineering Department; Cambridge, United Kingdom.
M.J.F. Gales and S. J. Young, “Robust Speech Recognition in Additive and Convolutional Noise Using Parallel Model Combination”, Computer Speech and Language (1995) 9, 289-307.
Y.H. Chang, Y.J. Chung, S. U. Park, “Improved Model Parameter Compensation Methods for Noise-Robust Speech Recognition”, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing.
M. Afify, Y. Gong, J.P. Haton, “A General Joint Additive and Convolutive Bias Compensation Approach Applied to Noisy Lombard Speech Recognition”, IEEE Transactions on Speech and Audio Processing, vol. 6, No. 6, Nov. 1998.

Affiliated with

Boman Robert

Inventor

Cerisara Christophe

Inventor

Junqua Jean-Claude

Inventor

Rigazio Luca

Inventor

Also associated with

Abebe Daniel

Examiner

Harness Dickey & Pierce PLC

Law Firm

Matsushita Electric Industrial Co., Ltd.

Corporate Assignee
