Low complexity speaker verification using simplified hidden...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S243000, C704S245000

active

06556969

ABSTRACT:

BACKGROUND
1. Technical Field
The present invention relates generally to speaker verification; and, more particularly, it relates to speaker verification employing a combination of universal cohort modeling and automatic score thresholding.
2. Description of Related Art
Conventional systems employing speaker recognition and other automatic speaker verification (ASV) provide a means to ensure secure access to various facilities. The ability to control the flow of personnel to various portions within a facility, without the intervention of man-occupied stations, is also very desirable for many applications. For example, many businesses use card-controlled access or numerical keypads to control the flow of personnel into various portions of a facility. Facility management, when controlling a single building having a number of businesses occupying various portions of the building, often uses such means of facility access control to monitor and ensure that various portions of the facility are safe and secure from intruders and other unauthorized personnel. Such personnel recognition systems and automatic speaker verification (ASV) systems provide the ability to control the flow of personnel using speech utterances of the personnel. A claimant seeking to pass through a speaker recognition or other automatic speaker verification (ASV) system provides a verbal submission of a predetermined word or phrase, or simply a sample of the claimant's speaking of a randomly selected word or phrase. An authentic claimant is one of the personnel who is authorized to gain access to the facility.
A trend for many of these speaker recognition and other automatic speaker verification (ASV) systems is to employ unsupervised training methods to prepare the speaker verification system to operate properly in real time. However, many of the conventional systems require substantial training and processing resources, including memory, to perform adequately. Within such systems, a claimant provides a speech sample or speech utterance that is scored against a model corresponding to the claimant's claimed identity, and a claimant score is then computed. There are two conventional methods known to those having skill in the art of speaker verification to decide whether to accept or reject the claimant; that is to say, whether to permit the claimant to pass through the speaker verification system or to deny the claimant access, i.e., to confirm whether the claimant is in fact an authorized member of the personnel of the facility.
A first conventional method to perform speaker verification compares a score that is derived from the claimant-provided utterance to a predetermined threshold level. The claimant is declared to be a true speaker solely upon the determination of whether the claimant's score exceeds the predetermined threshold level. Alternatively, if the claimant's score falls below the predetermined threshold level, the claimant is rejected and denied access through the speaker verification system. Deficiencies in this first conventional method of performing speaker verification are many. Although this first conventional method has relatively low computational and storage requirements, it is substantially unreliable. A predominant reason for this unreliability is that the method is highly biased to the training data, and consequently highly biased to the conditions that existed during its training.
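The fixed-threshold decision described above can be sketched in a few lines. This is only an illustration: the function name, the score values, and the threshold are hypothetical, and the scores are assumed to be log-likelihood-style numbers where larger means a better match. The second call hints at why a threshold fixed under the training conditions can reject a true speaker when conditions change.

```python
def verify_fixed_threshold(claimant_score: float, threshold: float) -> bool:
    """Accept only when the claimant score exceeds the preset threshold."""
    # Assumption: a score exactly equal to the threshold is rejected;
    # the text only covers "exceeds" and "falls below".
    return claimant_score > threshold


# Hypothetical values chosen for illustration only.
THRESHOLD = -45.0      # fixed once, under the training conditions
clean_score = -41.0    # true speaker, conditions similar to training
noisy_score = -48.5    # same true speaker, mismatched conditions

accepted_clean = verify_fixed_threshold(clean_score, THRESHOLD)
accepted_noisy = verify_fixed_threshold(noisy_score, THRESHOLD)
print(accepted_clean, accepted_noisy)
```

The decision costs one comparison and one stored number per speaker, which matches the low computational and storage requirements noted above.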
A second conventional method used to perform speaker verification compares the score that is derived from the claimant's utterance to a plurality of scores that are computed during the speaker verification process, i.e., when the claimant claims to be a true speaker or member of the personnel of the facility, namely, an individual speaker authorized to gain access through the speaker identification system. The plurality of scores used in this second conventional method are generated by scoring the claimant's utterance against a set of models of known cohort speakers. One difficulty, among others, with using cohort modeling is that the set of cohort models required to perform speaker verification is different for every speaker; consequently, a large amount of processing must be performed to determine the proper cohort model or models for a given claimant. A relatively significant amount of memory is also required to store all of the various cohort models to accommodate all of the speakers of the system. In addition, the method of training the conventional speaker verification system requires access to a relatively large pool of speaker cohort models to select the proper cohort set; the accompanying data storage requirements are typically very large, as described above. A problem for speaker verification systems having relatively constrained memory and processing resources is that their reliability suffers greatly using such conventional methods. The memory management and data processing needs are also great, in that several cohort scores must be computed for proper verification; these cohort scores are in addition to the claimant's score in the instant case.
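A minimal sketch of the cohort comparison follows. The text does not say how the claimant's score is combined with the cohort scores, so subtracting the mean cohort score and comparing the result to a margin is an assumption made here for illustration; the function name and all score values are hypothetical as well.

```python
from statistics import mean


def verify_with_cohorts(claimant_score, cohort_scores, margin=0.0):
    """Accept when the claimant score beats the average cohort score.

    Assumption: mean-based normalization; other combinations (for example
    the best cohort score) would fit the description equally well.
    """
    if not cohort_scores:
        raise ValueError("at least one cohort score is required")
    normalized = claimant_score - mean(cohort_scores)
    return normalized > margin


# Hypothetical scores. Each cohort score requires scoring the same
# utterance against one more stored model, which is the cost noted above.
noisy_true_speaker = verify_with_cohorts(-48.5, [-55.0, -53.0, -57.0])
impostor = verify_with_cohorts(-48.5, [-47.0, -49.0, -46.0])
print(noisy_true_speaker, impostor)
```

Because the cohort scores shift along with the claimant's score when conditions change, the same raw score of -48.5 can be accepted in one case and rejected in the other.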
Conventional speaker verification systems suffer from relatively large memory requirements, undesirably high complexity, and the unreliability associated with each of the first conventional method and the second conventional method of performing speaker verification.
Further limitations and disadvantages of conventional and traditional systems will become apparent to one of skill in the art through comparison of such systems with the present invention as set forth in the remainder of the present application with reference to the drawings.
SUMMARY OF THE INVENTION
Various aspects of the present invention can be found in an integrated speaker training and speaker verification system that generates a speaker model and a speaker authenticity using a speech utterance provided by a claimant. The integrated speaker training and speaker verification system contains a training circuitry, a memory, a pattern classification circuitry, and a decision logic circuitry. The training circuitry generates the speaker model and a speaker threshold using the speech utterance provided by the claimant. The memory stores the speaker model and the speaker threshold corresponding to the speech utterance provided by the claimant. The memory also stores a number of cohort models. The pattern classification circuitry processes the speech utterance provided by the claimant. The speech utterance is scored against a selected cohort model chosen from the number of cohort models and the speaker model. The decision logic circuitry processes the speech utterance provided by the claimant, and the speech utterance is scored against the speaker threshold. The pattern classification circuitry and the decision logic circuitry operate cooperatively to generate a speaker authenticity.
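The data flow among the training circuitry, memory, pattern classification circuitry, and decision logic circuitry might be sketched as follows. Everything specific here is an assumption for illustration and not the patented method: the one-dimensional Gaussian stands in for whatever model type the system uses, the cohort model is selected by best match to the enrollment data, and the speaker threshold is taken as a fixed fraction of the enrollment-time score. All names and feature values are hypothetical.

```python
import math
from dataclasses import dataclass
from statistics import mean, pvariance


@dataclass
class GaussianModel:
    """Toy 1-D Gaussian stand-in for a speaker or cohort model (assumption)."""
    mu: float
    var: float

    def avg_log_likelihood(self, frames):
        # Mean per-frame log-likelihood of the feature frames under this model.
        return mean(
            -0.5 * (math.log(2 * math.pi * self.var) + (x - self.mu) ** 2 / self.var)
            for x in frames
        )


def normalized_score(frames, speaker, cohort):
    # Pattern classification: speaker-model score relative to the cohort model.
    return speaker.avg_log_likelihood(frames) - cohort.avg_log_likelihood(frames)


def train(frames, cohorts, threshold_fraction=0.5):
    """Return (speaker model, selected cohort name, speaker threshold)."""
    speaker = GaussianModel(mean(frames), max(pvariance(frames), 1e-3))
    # Assumption: pick the stored cohort model that best fits the enrollment data.
    cohort_name = max(cohorts, key=lambda n: cohorts[n].avg_log_likelihood(frames))
    # Assumption: threshold is a fixed fraction of the enrollment-time score.
    threshold = threshold_fraction * normalized_score(frames, speaker, cohorts[cohort_name])
    return speaker, cohort_name, threshold


def verify(frames, speaker, cohort, threshold):
    # Decision logic: accept when the score exceeds the speaker threshold.
    return normalized_score(frames, speaker, cohort) > threshold


# Hypothetical stored cohort models and feature frames.
COHORTS = {
    "male": GaussianModel(-1.0, 1.0),
    "female": GaussianModel(1.0, 1.0),
    "mixed": GaussianModel(0.0, 2.0),
}
speaker, cohort_name, threshold = train([1.8, 2.1, 2.0, 2.3, 1.9], COHORTS)
genuine_ok = verify([1.9, 2.2, 2.0], speaker, COHORTS[cohort_name], threshold)
impostor_ok = verify([0.9, 1.1, 1.0], speaker, COHORTS[cohort_name], threshold)
print(cohort_name, genuine_ok, impostor_ok)
```

In this sketch only one speaker model, one threshold, and the name of one shared cohort model need to be kept per speaker, and each verification attempt scores the utterance against two models.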
In certain embodiments of the invention, the integrated speaker training and speaker verification system contains an offline cohort model generation circuitry that generates three cohort models. One of the cohort models is generated using speech utterances of male speakers. Another of the cohort models is generated using speech utterances of female speakers. A third of the cohort models is generated using speech utterances of both male and female speakers. The pattern classification circuitry of the integrated speaker training and speaker verification system is any unsupervised classifier. In certain embodiments of the invention, the integrated speaker training and speaker verification system contains a switching circuitry that selects between a training operation and a testing operation. The speech utterance provided by th
