Reexamination Certificate
2001-03-16
2001-12-11
Korzuch, William (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S240000
active
06330536
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of Invention
This invention relates to determining the identity of a speaker as one of a small group, based on a sentence-length password utterance.
2. Description of Related Art
Security systems have long used passwords as a means to limit access to a single individual or groups of individuals. Passwords are commonplace for computer systems, building entry, etc., so that valuable information or materials may be protected.
Most secure systems requiring a password for entry require the user to enter alphanumeric text via a keyboard or keypad. However, with the advent of high-quality speech recognition systems, there is a need for an accurate and reliable speaker identification system to allow entry to computer systems, buildings, etc., using spoken passwords.
SUMMARY OF THE INVENTION
A speaker identification system is provided that constructs speaker models using a discriminant analysis technique in which the data in each class is modeled by Gaussian mixtures. The speaker identification method and apparatus determine the identity of a speaker, as one of a small group, based on a sentence-length password utterance.
A speaker's utterance is received, and a sequence of a first set of feature vectors is computed based on the received utterance. The first set of feature vectors is then transformed into a second set of feature vectors using transformations specific to a particular segmentation unit, and likelihood scores of the second set of feature vectors are computed using speaker models trained using mixture discriminant analysis. The likelihood scores are then combined to determine an utterance score, and the speaker's identity is validated based on the utterance score.
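The scoring steps in the preceding paragraph can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes diagonal-covariance Gaussian mixtures, one linear transform per phone shared by all speakers, and a plain per-frame average as the way of combining likelihood scores; none of these details are given in the summary. The names `gmm_log_likelihood`, `utterance_score`, and `identify` are hypothetical.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one vector x under a diagonal-covariance Gaussian mixture.

    weights: (K,), means: (K, d), variances: (K, d). Diagonal covariance is an
    assumption made for this sketch.
    """
    diff = x - means
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=1))
    m = np.max(log_comp)                      # log-sum-exp for numerical stability
    return m + np.log(np.sum(np.exp(log_comp - m)))

def utterance_score(segments, transforms, speaker_model):
    """Average per-frame log-likelihood of an utterance for one speaker.

    segments: list of (phone, frames), frames of shape (T, D) (first feature set)
    transforms: dict phone -> (D, d) matrix (segmentation-unit-specific transform)
    speaker_model: dict phone -> (weights, means, variances) in transformed space
    """
    total, count = 0.0, 0
    for phone, frames in segments:
        transformed = frames @ transforms[phone]   # second set of feature vectors
        w, mu, var = speaker_model[phone]
        for v in transformed:
            total += gmm_log_likelihood(v, w, mu, var)
            count += 1
    return total / count                           # assumed combination rule: mean

def identify(segments, transforms, models):
    """Return the best-scoring speaker in the group and all utterance scores."""
    scores = {spk: utterance_score(segments, transforms, model)
              for spk, model in models.items()}
    return max(scores, key=scores.get), scores
```

A threshold on the winning score could be added where the utterance must also be accepted or rejected; the summary does not state how validation is decided, so that step is left out here.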
The speaker identification method and apparatus also include training and enrollment phases. In the enrollment phase, the speaker's password utterance is received multiple times. A transcription of the password utterance as a sequence of phones is obtained, and the phone string is stored in a database containing phone strings of other speakers in the group.
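As a rough illustration of the enrollment record described above, the following Python sketch stores one phone string per speaker in an in-memory table. The class name `EnrollmentDB`, its methods, and the example phone labels are all hypothetical; the summary does not specify how the database is organized.

```python
from dataclasses import dataclass, field

@dataclass
class EnrollmentDB:
    """Hypothetical store mapping each speaker in the group to a password phone string."""
    phone_strings: dict = field(default_factory=dict)

    def enroll(self, speaker_id, phones):
        # Keep the transcription as an immutable sequence of phone labels.
        self.phone_strings[speaker_id] = tuple(phones)

    def group(self):
        # Speakers currently enrolled, in a stable order.
        return sorted(self.phone_strings)

db = EnrollmentDB()
db.enroll("alice", ["k", "ae", "t"])   # illustrative phone labels only
db.enroll("bob", ["d", "ao", "g"])
```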
In the training phase, the first set of feature vectors is extracted from each password utterance, and the phone boundaries for each phone in the password transcription are obtained using a speaker-independent phone recognizer. A mixture model is developed for each phone of a given speaker's password. Then, using the feature vectors from the password utterances of all of the speakers in the group, transformation parameters and transformed models are generated for each phone and speaker, using mixture discriminant analysis.
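The first part of the training phase, grouping frames by phone boundaries and fitting a model per phone, can be sketched in Python as follows. This is a deliberately simplified stand-in: it fits a single diagonal Gaussian per phone rather than a mixture, and it does not perform mixture discriminant analysis or estimate any transformation. The boundary format `(phone, start, end)` and the function names are assumptions made for illustration.

```python
import numpy as np

def segment_frames(frames, boundaries):
    """Group the frames of one utterance by phone.

    frames: (T, D) array of feature vectors
    boundaries: list of (phone, start, end) frame indices, end exclusive
                (assumed output format of the phone recognizer)
    """
    by_phone = {}
    for phone, start, end in boundaries:
        by_phone.setdefault(phone, []).append(frames[start:end])
    return {phone: np.vstack(chunks) for phone, chunks in by_phone.items()}

def fit_phone_models(utterances):
    """Fit one diagonal Gaussian per phone from a speaker's repeated utterances.

    utterances: list of (frames, boundaries) pairs for a single speaker.
    Returns a dict phone -> (mean, variance). A single Gaussian is used here
    only as a placeholder for the per-phone mixture model.
    """
    pooled = {}
    for frames, boundaries in utterances:
        for phone, seg in segment_frames(frames, boundaries).items():
            pooled.setdefault(phone, []).append(seg)
    models = {}
    for phone, chunks in pooled.items():
        data = np.vstack(chunks)
        # Small floor keeps variances strictly positive.
        models[phone] = (data.mean(axis=0), data.var(axis=0) + 1e-6)
    return models
```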
These and other features and advantages of this invention are described in or are apparent from the following detailed description of the preferred embodiments.
REFERENCES:
patent: 5054083 (1991-10-01), Naik et al.
patent: 5615299 (1997-03-01), Bahl et al.
patent: 5687287 (1997-11-01), Gandhi et al.
patent: 5754681 (1998-05-01), Watanabe et al.
patent: 5839103 (1998-11-01), Mammone et al.
patent: 5913192 (1999-06-01), Parthasarathy et al.
patent: 5995927 (1999-11-01), Li
patent: 6029124 (2000-02-01), Gillick et al.
patent: 6233555 (2001-05-01), Parthasarathy et al.
K. Fukunaga, Introduction to Statistical Pattern Recognition, Chapter 10, “Non Linear Mapping”, pp. 288-322, Academic Press, Inc. 1990.
L. Breiman and R. Ihaka, “Nonlinear Discriminant Analysis Via Scaling and ACE,” Technical Report, University of California, Berkeley, 1984.
T. Hastie, R. Tibshirani and A. Buja, “Flexible Discriminant Analysis by Optimal Scoring,” Journal of the American Statistical Association, 89, pp. 1255-1270, 1994.
T. Hastie and R. Tibshirani, “Discriminant Analysis by Gaussian Mixtures,” Journal of the Royal Statistical Society (Series B), 58, pp. 155-176, 1996.
D.X. Sun, “Feature Dimension Reduction Using Reduced-Rank Maximum Likelihood Estimation For Hidden Markov Models,” Proc. Int. Conference on Spoken Language Processing, pp. 244-247, 1996.
A. E. Rosenberg, O. Siohan, and S. Parthasarathy, “Small Group Speaker Identification with Common Password Phrases,” submitted to RLA2C, 1998.
Joseph T. Buck, David K. Burton, and John E. Shore, “Text-Dependent Speaker Recognition Using Vector Quantization,” Proc. IEEE Int. Conf. Acoust. Speech, and Sig. Proc. ICASSP 85, Mar. 26-29, 1985, pp. 391-394.
A.E. Rosenberg, C.H. Lee, and F.K. Soong, “Sub-Word Unit Talker Verification using Hidden Markov Models,” Proc. 1990 Int. Conf. Acoust. Speech, and Sig. Proc. ICASSP 90, Apr. 3-6, 1990, pp. 269-272.
M. Sharma and R. Mammone, “Subword-based Text-dependent speaker Verification System with User-selectable Passwords,” Proc. 1996 Int. Conf. Acoust. Speech, and Sig. Proc. ICASSP 96, May 7-10, 1996, pp. 93-96.
Parthasarathy Sarangarajan
Rosenberg Aaron E.
AT&T Corp.
Korzuch William
Oliff & Berridge, PLC
Storm Donald L.