Speaker recognition using a hierarchical speaker model tree

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

active

06684186

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates generally to the field of speaker recognition, which includes speaker verification and speaker identification.
The use of speaker verification systems for security purposes has been growing in recent years. In a conventional speaker verification system, speech samples of known speakers are obtained and used to develop a speaker model for each speaker. Each speaker model typically contains clusters or distributions of audio feature data derived from the associated speech sample. In operation of a speaker verification system, a person (the claimant) who wishes, for example, to access certain data or enter a particular building claims to be a registered speaker who has previously submitted a speech sample to the system. The verification system prompts the claimant to speak a short phrase or sentence. The speech is recorded, analyzed, and compared with the stored speaker model for the claimed identification (ID). If the speech is within a predetermined distance (closeness) of the corresponding model, the speaker is verified.
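The conventional threshold test described above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes each speaker model is reduced to a single mean feature vector and that closeness is Euclidean distance. The names `mean_vector`, `verify`, and `threshold`, and the sample data, are hypothetical.

```python
import math

def mean_vector(frames):
    """Average a list of equal-length feature vectors into one vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]

def euclidean(a, b):
    """Euclidean distance between two feature vectors (an assumed closeness measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(claimed_id, test_frames, models, threshold):
    """Accept the claimant if the test speech lies within `threshold` of the claimed model."""
    test_vec = mean_vector(test_frames)
    return euclidean(test_vec, models[claimed_id]) <= threshold

# Hypothetical stored models: one mean feature vector per registered speaker.
models = {"alice": [1.0, 2.0], "bob": [5.0, 5.0]}
sample = [[0.9, 2.1], [1.1, 1.9]]  # feature frames from the claimant's utterance

accepted = verify("alice", sample, models, threshold=0.5)  # claim matches the sample
rejected = verify("bob", sample, models, threshold=0.5)    # claim does not match
```

A real system would compare full collections of distributions rather than single vectors, and the threshold would be tuned to trade false acceptances against false rejections.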
Speaker identification systems are also growing considerably. These systems likewise develop and store speaker models for known speakers based on speech samples. Subsequently, to identify an unknown speaker, the speaker's speech is analyzed and compared to the stored models. If the speech closely matches one of the models, the speaker is positively identified. One of the many useful applications of such speaker identification systems is speech recognition. Some speech recognition systems achieve more accurate results by developing unique speech prototypes for each speaker registered with the system. The unique prototype is used to analyze only the speech of the corresponding person. Thus, when the speech recognition system must recognize the speech of a speaker who has not been identified, such as in a conference situation, a speaker identification process can be carried out to determine the correct prototype for the recognition operation.
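Identification differs from verification in that no ID is claimed, so the test speech is compared against every stored model. The sketch below assumes single-vector models and a Euclidean closeness measure; `identify` and `max_distance` are hypothetical names, and the cutoff for a "close match" is an assumption, since the text does not specify one.

```python
import math

def identify(test_vec, models, max_distance):
    """Return the ID of the closest stored model, or None if nothing matches closely."""
    best_id = min(models, key=lambda sid: math.dist(test_vec, models[sid]))
    if math.dist(test_vec, models[best_id]) <= max_distance:
        return best_id
    return None

# Hypothetical stored models: one mean feature vector per known speaker.
models = {"alice": [1.0, 2.0], "bob": [5.0, 5.0]}

known = identify([4.8, 5.1], models, max_distance=1.0)      # close to bob's model
unknown = identify([20.0, 20.0], models, max_distance=1.0)  # far from every model
```

In the speech recognition application described above, the returned ID would then select the speaker-specific prototype used for recognition.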
SUMMARY OF THE DISCLOSURE
The present disclosure relates to a method for generating a hierarchical speaker model tree. In an illustrative embodiment, a speaker model is generated for each of a number of speakers from whom speech samples have been obtained. Each speaker model contains a collection of distributions of audio feature data derived from the speech sample of the associated speaker. The hierarchical speaker model tree is created by merging similar speaker models on a layer-by-layer basis. Each time two or more speaker models are merged, a corresponding parent speaker model is created in the next higher layer of the tree. The tree is useful in applications such as speaker verification and speaker identification.
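The layer-by-layer merging can be illustrated with a simplified sketch. The assumptions here are not taken from the disclosure: each speaker model is a single mean vector with a sample count, similarity is Euclidean distance, models are merged in pairs, and a parent is the count-weighted average of its children. `Node`, `build_layer`, and `build_tree` are hypothetical names.

```python
import math

class Node:
    """A speaker model in the tree: a mean vector, a sample count, and child models."""
    def __init__(self, mean, count=1, children=(), label=None):
        self.mean = mean
        self.count = count
        self.children = list(children)
        self.label = label

def merge(a, b):
    """Create a parent model as the count-weighted average of two child models."""
    total = a.count + b.count
    mean = [(x * a.count + y * b.count) / total for x, y in zip(a.mean, b.mean)]
    return Node(mean, total, [a, b])

def build_layer(nodes):
    """Pair each model with its nearest remaining neighbor (a greedy simplification)."""
    remaining = list(nodes)
    parents = []
    while len(remaining) > 1:
        first = remaining.pop(0)
        partner = min(remaining, key=lambda n: math.dist(first.mean, n.mean))
        remaining.remove(partner)
        parents.append(merge(first, partner))
    if remaining:  # an unpaired model is carried up under a single-child parent
        lone = remaining[0]
        parents.append(Node(lone.mean, lone.count, [lone]))
    return parents

def build_tree(leaves):
    """Merge layer by layer until a single root model remains."""
    layer = list(leaves)
    while len(layer) > 1:
        layer = build_layer(layer)
    return layer[0]

leaves = [Node([0.0, 0.0], label="A"), Node([0.0, 1.0], label="B"),
          Node([10.0, 10.0], label="C"), Node([10.0, 11.0], label="D")]
root = build_tree(leaves)
first_pair = [child.label for child in root.children[0].children]
```

With these four hypothetical speakers, A and B merge into one parent, C and D into another, and the two parents merge into the root.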
A speaker verification method is disclosed in which a claimed ID is received from a claimant, where the claimed ID represents a speaker corresponding to a particular one of the speaker models. A cohort set of similar speaker models associated with the particular speaker model is established. Then, a speech sample from the claimant is received and a test speaker model is generated therefrom. The test model is compared to all the speaker models of the cohort set, and the claimant is verified only if the particular speaker model is closest to the test model. False acceptance rates can be reduced by computing one or more complementary speaker models and adding the complementary model(s) to the cohort set for comparison to the test model. In a cumulative complementary model (CCM) approach, one merged complementary model is generated from speaker models outside the original cohort set and then added to the cohort set. In a graduated complementary model (GCM) approach, a complementary model is defined for each of a number of levels of the tree, with each complementary model being added to the cohort set.
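The cohort comparison and the CCM approach can be sketched as below. This is an illustration under stated assumptions rather than the disclosed implementation: models are single mean vectors, closeness is Euclidean distance, the cohort set is supplied directly instead of being read from the tree, and the merged complementary model is a plain average of the models outside the cohort. `verify_with_cohort` and `use_ccm` are hypothetical names.

```python
import math

def average(vectors):
    """Merge several model vectors into one by averaging (an assumed merge rule)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def verify_with_cohort(claimed_id, test_vec, models, cohort_ids, use_ccm=True):
    """Verify only if the claimed model is the closest candidate to the test model."""
    candidates = {sid: models[sid] for sid in cohort_ids}
    candidates[claimed_id] = models[claimed_id]
    if use_ccm:
        # CCM: one merged model built from every speaker outside the cohort set.
        outside = [vec for sid, vec in models.items() if sid not in candidates]
        if outside:
            candidates["__ccm__"] = average(outside)
    closest = min(candidates, key=lambda sid: math.dist(test_vec, candidates[sid]))
    return closest == claimed_id

# Hypothetical models; alice's cohort contains her own model and amy's.
models = {"alice": [2.0, 2.0], "amy": [0.0, 0.0],
          "bob": [6.0, 6.0], "carol": [8.0, 6.0]}
cohort = ["alice", "amy"]

genuine = verify_with_cohort("alice", [2.1, 1.9], models, cohort)
impostor_no_ccm = verify_with_cohort("alice", [5.0, 5.0], models, cohort, use_ccm=False)
impostor_ccm = verify_with_cohort("alice", [5.0, 5.0], models, cohort, use_ccm=True)
```

In this example an impostor whose speech resembles speakers outside the cohort is accepted when only the cohort is compared, because alice's model is still the nearest of the two, but is rejected once the complementary model is added. A GCM variant would add one complementary model per tree level instead of a single merged one.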


