Method and apparatus for multi-environment speaker verification


Filed: 1999-01-29
Issued: 2001-06-26
Examiner: Korzuch, William R. (Department: 2741)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Recognition

U.S. classification code: C704S246000
Document type: Reexamination Certificate
Status: active
Patent number: 06253179
ABSTRACT:

TECHNICAL FIELD
The present invention relates generally to the field of speaker verification.
BACKGROUND OF THE INVENTION
The use of speaker verification systems for security and other purposes has been growing in recent years. In a conventional speaker verification system, speech samples of known speakers are obtained and used to develop some sort of speaker model for each speaker. Each speaker model typically contains clusters or distributions of audio feature data derived from the associated speech sample. In operation of a speaker verification system, a person (the claimant) wishing to, e.g., access certain data, enter a particular building, etc., claims to be a registered speaker who has previously submitted a speech sample to the system. The verification system prompts the claimant to speak a short phrase or sentence. The speech is recorded and analyzed to compare it to the stored speaker model with the claimed identification (ID). If the speech is within a predetermined distance (closeness) to the corresponding model, the speaker is verified.
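The accept-or-reject decision described above can be illustrated with a minimal sketch. The model representation (a list of mean and variance pairs), the variance-normalized distance, the threshold value, and every name below are assumptions made for illustration; the text does not specify a particular distance measure.

```python
# Hypothetical representation: a speaker model is a list of (mean, variance)
# pairs, one per distribution, with diagonal variances. The distance used here
# is an illustrative assumption, not the measure defined by the patent.

def frame_distance(frame, mean, var):
    """Variance-normalized squared distance from one feature frame to one distribution."""
    return sum((x - m) ** 2 / v for x, m, v in zip(frame, mean, var))

def model_distance(frames, model):
    """Average, over all frames, of the distance to the closest distribution in the model."""
    total = 0.0
    for frame in frames:
        total += min(frame_distance(frame, mean, var) for mean, var in model)
    return total / len(frames)

def verify(frames, models, claimed_id, threshold):
    """Accept the claimant if the test speech is within the threshold of the claimed model."""
    return model_distance(frames, models[claimed_id]) <= threshold

# Toy enrolled models and toy test frames (made-up numbers).
models = {
    "alice": [([0.0, 0.0], [1.0, 1.0]), ([2.0, 2.0], [1.0, 1.0])],
    "bob": [([10.0, 10.0], [1.0, 1.0])],
}
test_frames = [[0.1, -0.1], [1.9, 2.2]]

accepted_as_alice = verify(test_frames, models, "alice", threshold=1.0)
accepted_as_bob = verify(test_frames, models, "bob", threshold=1.0)
```

With these toy values the frames lie close to the "alice" distributions and far from the "bob" distribution, so only the claim to be "alice" falls within the threshold.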
The environment in which speech is sampled influences the characteristics of the recorded speech data, for both training data and test data. Thus, one design issue for a speaker verification system is how to account for the different environments in which the training data and a claimant's test data are taken. Varying channels, e.g., different types of microphones, telephones or communication links, affect the parameters of a person's speech on the receiving end. In many speaker verification systems, it must be assumed that any source of speech can be received over any one of a number of channels. Thus, any modifications that the channels cause in the source data must be accounted for, a procedure referred to as environment normalization.
Current approaches to channel (environment) normalization involve, in one form or another, a supervised training phase to separate and group the training and/or testing data according to a predetermined set of “models” corresponding to each of the channels. Channel-dependent background models and statistics are then derived from these groups. A number of existing techniques compare received data to the claimed source model in light of the various background models. A different approach tries to make the data received over any of the channels look as if it had been received over some canonical channel, thus mitigating the influence of the channel. Here again, the channels must be known so that they can be inverted. A shortcoming of these supervised training techniques is that, in some applications, they are unrealistic because every channel that may be used must be known and modeled ahead of time.
For pattern matching problems other than speaker verification, environment normalization is likewise a problem that must be addressed. The general problem, which includes the speaker verification situation, is how to accept two patterns as being similar when the comparisons are (or may be) performed under mismatched conditions. The mismatched conditions may be, for example, different lighting conditions or shadows for face recognition; different noise conditions for image recognition; different foreground and lighting noise for background texture recognition; and different reception channels for speaker recognition.
SUMMARY OF THE DISCLOSURE
The present disclosure relates to a method for unsupervised environmental normalization for speaker verification using hierarchical clustering. In an illustrative embodiment, training data (speech samples) are taken from T enrolled (registered) speakers over any one of M channels, e.g., different microphones, communication links, etc. A speaker model is generated for each speaker, each model containing a collection of distributions of audio feature data derived from the speech sample of that speaker. A hierarchical speaker model tree is created, e.g., by merging similar speaker models on a layer-by-layer basis. Each speaker is also grouped into a cohort of similar speakers. For each cohort, one or more complementary speaker models are generated by merging speaker models outside that cohort. The complementary speaker models are used to reduce false acceptances during a subsequent speaker verification operation.
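The layer-by-layer merging and the complementary models can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: each speaker model is reduced to a single mean vector (the text describes a collection of distributions), similarity is plain squared Euclidean distance, nodes are merged greedily in pairs, and the names Node, merge, build_layer, build_tree and complementary_model are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based equality, so list.remove() targets the exact node
class Node:
    mean: list       # simplified model: one mean vector (an assumption)
    speakers: list   # enrolled speaker IDs covered by this node
    children: list = field(default_factory=list)

def distance(a, b):
    """Squared Euclidean distance between two node means."""
    return sum((x - y) ** 2 for x, y in zip(a.mean, b.mean))

def merge(nodes):
    """Merge nodes into a parent whose mean is weighted by speaker count."""
    total = sum(len(n.speakers) for n in nodes)
    dim = len(nodes[0].mean)
    mean = [sum(n.mean[i] * len(n.speakers) for n in nodes) / total for i in range(dim)]
    speakers = [s for n in nodes for s in n.speakers]
    return Node(mean, speakers, list(nodes))

def build_layer(nodes):
    """Greedily pair each node with its closest unpaired neighbor and merge the pair."""
    remaining = list(nodes)
    parents = []
    while remaining:
        node = remaining.pop(0)
        if not remaining:  # odd node out is promoted on its own
            parents.append(merge([node]))
            break
        nearest = min(remaining, key=lambda other: distance(node, other))
        remaining.remove(nearest)
        parents.append(merge([node, nearest]))
    return parents

def build_tree(leaves):
    """Return the list of layers, from the leaves up to a single root."""
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        layers.append(build_layer(layers[-1]))
    return layers

def complementary_model(cohort, layer):
    """Merge every node in the layer except the given cohort."""
    return merge([n for n in layer if n is not cohort])

# Toy enrollment data (made-up numbers): two pairs of similar speakers.
leaves = [
    Node([0.0, 0.0], ["A"]),
    Node([1.0, 0.0], ["B"]),
    Node([10.0, 10.0], ["C"]),
    Node([11.0, 10.0], ["D"]),
]
layers = build_tree(leaves)
cohorts = layers[1]  # treat the first merged layer as the cohort level
cohort_of_a = next(c for c in cohorts if "A" in c.speakers)
comp = complementary_model(cohort_of_a, cohorts)
```

Here speakers A and B end up in one cohort, and the complementary model for that cohort is built only from C and D. Treating the first merged layer as the cohort level is a choice made for this example.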
When training data from a new speaker to be enrolled is received over a new channel, the speaker model tree as well as the complementary models are updated. Thus, adaptation to data from new environments is possible by incorporating such data into the verification model whenever it is encountered.
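One simple way to picture such an update is sketched below. The text does not say how the tree and complementary models are updated, so the strategy shown (assign the new speaker to the nearest cohort, then recompute every complementary model) is an assumption, as are the flat cohort dictionary, the single mean vector per speaker, and all function names.

```python
def mean_of(vectors):
    """Component-wise average of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def complementary_models(cohorts):
    """For each cohort, average the models of all speakers outside it.

    Assumes at least two non-empty cohorts, so every cohort has outsiders.
    """
    comps = {}
    for name in cohorts:
        outside = [model
                   for other, members in cohorts.items() if other != name
                   for model in members.values()]
        comps[name] = mean_of(outside)
    return comps

def enroll(cohorts, speaker_id, model):
    """Add a new speaker to the closest cohort, then rebuild the complementary models."""
    closest = min(
        cohorts,
        key=lambda c: sq_dist(mean_of(list(cohorts[c].values())), model),
    )
    cohorts[closest][speaker_id] = model
    return closest, complementary_models(cohorts)

# Toy cohorts (made-up numbers), each mapping speaker ID to a mean vector.
cohorts = {
    "c0": {"A": [0.0, 0.0], "B": [1.0, 0.0]},
    "c1": {"C": [10.0, 10.0], "D": [11.0, 10.0]},
}
before = complementary_models(cohorts)
assigned, after = enroll(cohorts, "E", [12.0, 11.0])
```

In this toy run the new speaker E joins cohort "c1", so the complementary model for "c0" shifts to reflect E's data while the complementary model for "c1" is unchanged.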

REFERENCES:
patent: 5687287 (1997-11-01), Gandhi et al.
patent: 5806029 (1998-09-01), Buhrke et al.
patent: 5963906 (1999-10-01), Turin
patent: 6006184 (1999-12-01), Yamada et al.
patent: 6038528 (2000-03-01), Mammone et al.
patent: 6058205 (2000-05-01), Bahl et al.
patent: 6073096 (2000-06-01), Gao et al.
patent: 6073101 (2000-06-01), Maes
patent: 6081660 (2000-06-01), Macleod et al.
patent: 6107935 (2000-08-01), Comerford et al.
Rosenberg et al., “Speaker background models for connected digit password speaker verification,” 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, May 1996, pp. 81 to 84.*
Li et al., “Normalized discriminant analysis with application to a hybrid speaker-verification system,” 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, May 1996, pp. 681 to 684.

Affiliated with

Beigi Homayoon S. (Inventor)
Chaudhari Upendra V. (Inventor)
Maes Stephane H. (Inventor)
Sorensen Jeffrey S. (Inventor)

Also associated with

F. Chau & Associates LLP (Law Firm)
International Business Machines Corporation (Corporate Assignee)
Korzuch William R. (Examiner)
Lerner Martin (Examiner)
