Reexamination Certificate
2002-03-15
2004-02-03
Chawan, Vijay (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S233000, C704S226000, C704S219000, C704S216000, C381S094300
active
06687672
BACKGROUND OF THE INVENTION
The present invention relates to methods and apparatus for processing speech signals, and more particularly to methods and apparatus for removing channel distortion in speech systems such as speech and speaker recognition systems.
Cepstral mean normalization (CMN) is an effective technique for removing communication channel distortion in automatic speaker recognition systems. To work effectively, the speech processing windows in CMN systems must be very long to preserve phonetic information. Unfortunately, non-stationary channels call for shorter windows, which CMN systems cannot handle as effectively. Furthermore, CMN techniques are based on an assumption that the speech mean either carries no phonetic information or is constant during a processing window. When short windows are used, however, the speech mean may carry significant phonetic information.
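The trade-off described above can be illustrated with a minimal sketch of CMN. It assumes, as is standard but not stated in this text, that a stationary channel shows up as a constant offset added to every cepstral frame. The function name, the toy frames, and the `channel` vector are hypothetical and chosen only for illustration.

```python
from typing import List

Frame = List[float]


def cepstral_mean_normalize(frames: List[Frame]) -> List[Frame]:
    """Subtract the per-coefficient mean over the window from every frame."""
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]


# Toy window of three 2-dimensional cepstral frames (made-up values).
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
# Assumed constant channel offset in the cepstral domain.
channel = [0.5, -1.0]
noisy = [[c + h for c, h in zip(f, channel)] for f in clean]

# The channel offset is removed, but so is the speech mean of the window.
normalized_noisy = cepstral_mean_normalize(noisy)
normalized_clean = cepstral_mean_normalize(clean)
```

Under these assumptions the normalized noisy and clean frames coincide, which shows both the appeal of CMN and its cost: the window's own speech mean is discarded together with the channel, and for a short window that mean may carry phonetic information.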
The problem of estimating a communication channel affecting a speech signal falls into a category known as blind system identification. When only one version of the speech signal is available (i.e., the “single microphone” case), the estimation problem has no general solution. Oversampling may be used to obtain the information necessary to estimate the channel, but if only one version of the signal is available and no oversampling is possible, it is not possible to solve each particular instance of the problem without making assumptions about the signal source. For example, it is not possible to perform channel estimation for telephone speech recognition, when the recognizer does not have access to the digitizer, without making assumptions about the signal source.
SUMMARY OF THE INVENTION
One configuration of the present invention therefore provides a method for blind channel estimation of a speech signal corrupted by a communication channel. The method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a temporal correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal over a processing window.
Another configuration of the present invention provides an apparatus for blind channel estimation of a speech signal corrupted by a communication channel. The apparatus is configured to convert a noisy speech signal into either a cepstral representation or a log-spectral representation; estimate a temporal correlation of the representation of the noisy speech signal; determine an average of the noisy speech signal; construct and solve, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and select a sign of the solution of the system of linear equations to estimate an average clean speech signal over a processing window.
Yet another configuration of the present invention provides a machine readable medium or media having recorded thereon instructions configured to instruct an apparatus including at least one of a programmable processor and a digital signal processor to: convert a noisy speech signal into a cepstral representation or a log-spectral representation; estimate a temporal correlation of the representation of the noisy speech signal; determine an average of the noisy speech signal; construct and solve, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and select a sign of the solution of the system of linear equations to estimate an average clean speech signal over a processing window.
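The steps recited in these configurations can be pictured with a scalar toy model. This sketch is not the system of equations of the invention, which is not reproduced in this text. It assumes an additive constant channel in the log-spectral or cepstral domain, a single coefficient per frame, a circular lag-1 correlation (so that the identities hold exactly), and a training statistic `a` equal to the ratio of lag-1 to lag-0 correlation of clean speech. The sign rule based on `reference_mean` is likewise a hypothetical stand-in, since the selection criterion is not given here.

```python
import math
from typing import List, Tuple


def circular_moments(x: List[float]) -> Tuple[float, float, float]:
    """Return (mean, lag-0 correlation, lag-1 correlation) of x.

    The lag-1 term wraps around the window (circular lag); this is a
    simplification assumed here so the toy identities hold exactly.
    """
    n = len(x)
    mean = sum(x) / n
    r0 = sum(v * v for v in x) / n
    r1 = sum(x[t] * x[(t + 1) % n] for t in range(n)) / n
    return mean, r0, r1


def estimate_window_mean(noisy: List[float], a: float,
                         reference_mean: float) -> float:
    """Estimate the clean-speech average of one window (scalar toy model).

    a: assumed training statistic r_x(1) / r_x(0) of clean speech.
    reference_mean: hypothetical prior used only to pick the sign.
    """
    mean_y, r0_y, r1_y = circular_moments(noisy)
    # Centered correlations are unchanged by an additive constant channel.
    c0 = r0_y - mean_y ** 2
    c1 = r1_y - mean_y ** 2
    # Linear equation in m2 = m**2:  c1 + m2 = a * (c0 + m2)
    m2 = (a * c0 - c1) / (1.0 - a)
    m = math.sqrt(max(m2, 0.0))
    # Sign selection: keep the root closer to the reference mean.
    return m if abs(m - reference_mean) <= abs(-m - reference_mean) else -m


# Made-up clean window and channel offset, for illustration only.
clean = [1.0, 3.0, 2.0, 4.0, 0.5, 2.5]
channel = -0.75
noisy = [v + channel for v in clean]

clean_mean, r0, r1 = circular_moments(clean)
a = r1 / r0  # stands in for a statistic learned from clean training speech
mean_estimate = estimate_window_mean(noisy, a, reference_mean=2.0)
# Under the additive model, the channel is the noisy average minus the clean average.
channel_estimate = sum(noisy) / len(noisy) - mean_estimate
```

In this toy setting the equation is linear in the squared mean, so the mean itself is recovered only up to a sign, which is why a separate sign selection step is needed. A real implementation would work with vectors and matrices and with a correlation structure learned from separate training data rather than from the window itself.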
Configurations of the present invention provide effective and efficient estimations of speech communication channels without removal of phonetic information.
Further areas of applicability of the present invention will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
REFERENCES:
patent: 4897878 (1990-01-01), Boll et al.
patent: 5487129 (1996-01-01), Paiss et al.
patent: 5625749 (1997-04-01), Goldenthal et al.
patent: 5839103 (1998-11-01), Mammone et al.
patent: 5864810 (1999-01-01), Digalakis et al.
patent: 5913192 (1999-06-01), Parthasarathy et al.
patent: 6278970 (2001-08-01), Milner
patent: 6430528 (2002-08-01), Jourjine et al.
patent: 6496795 (2002-12-01), Malvar
patent: WO 99/59136 (1999-11-01), None
Tong et al., “Blind Channel Estimation by Least Squares Smoothing”, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '98, May 1998, vol. 4, pp. 2121-2124.*
“Blind Channel Estimation By Least Squares Smoothing”, Lang Tong and Qing Zhao, Acoustics, Speech, and Signal Processing, ICASSP '98, Proceedings of the 1998 IEEE International Conference on May 12, 1998 to May 15, 1998, Seattle, Washington, vol. 4, 0-7803-4428-6/98, pp. 2121-2124.
“Pole-Filtered Cepstral Subtraction”, D. Naik, 1995 International Conference on Acoustics, Speech, and Signal Processing, May, 1995, vol. 1, pp. 157-160, particularly 160.
International Search Report for International Application No. PCT/US99/10038, Jun. 16, 1999, by Martin Lerner.
Junqua Jean-Claude
Nguyen Patrick
Rigazio Luca
Souilmi Younes