Computer voice recognition method verifying speaker identity...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S232000

active

06298323

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates generally to methods and apparatus for verifying speakers such as voice recognition systems.
Methods for verifying speakers (hereafter also “speaker verification”) generally use person-specific properties of the human voice as biometric features. The identity check of a person becomes possible with them on the basis of a brief voice (or: speech) sample of the person. In such methods, speaker-specific features are usually extracted from at least one digital voice (or: speech) sample. Acoustic features that reflect the person-specific dimensions of the vocal tract and the typical time sequence of the articulation motions are particularly suitable as such features.
In speech recognition methods, there generally are two different phases, a training phase and a test phase.
In a training phase, expressions prescribable by a user are spoken into an arrangement that implements the method for speaker verification in what are referred to as text-dependent speaker verification methods. Reference feature vectors that contain speaker-specific features extracted from the digital reference voice (or: speech) sample are formed for these reference voice (or: speech) samples. For determining the individual reference feature vectors or, respectively, feature vectors from the voice (or: speech) signals, the respective voice (or: speech) signal is usually divided into small pseudo-stationary sections, which are referred to as frames. The voice (or: speech) signal is assumed to be stationary for the pseudo-stationary sections. The pseudo-stationary sections typically exhibit a time length of about 10 to 20 ms.
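The division into frames described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the non-overlapping framing, and the energy value used as a stand-in feature are hypothetical and are not taken from the cited documents, which use speaker-specific acoustic features.

```python
import math


def split_into_frames(samples, sample_rate_hz, frame_ms=20.0):
    """Split a sampled signal into consecutive, non-overlapping frames.

    frame_ms is the assumed pseudo-stationary section length (about 10 to 20 ms).
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    # Drop a trailing partial frame so that every frame has the same length.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def frame_energy(frame):
    """Mean squared amplitude of one frame (placeholder feature, assumption)."""
    return sum(x * x for x in frame) / len(frame)


if __name__ == "__main__":
    sample_rate_hz = 8000  # assumed sampling rate
    # One second of a 200 Hz tone as a stand-in for a voice signal.
    signal = [math.sin(2 * math.pi * 200 * n / sample_rate_hz)
              for n in range(sample_rate_hz)]
    frames = split_into_frames(signal, sample_rate_hz, frame_ms=20.0)
    features = [frame_energy(f) for f in frames]
    print(len(frames), len(frames[0]))
```

In practice each frame would be mapped to a multi-dimensional feature vector rather than a single energy value; the sketch only shows where the 10 to 20 ms framing enters.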
In the test phase, at least one feature vector, and usually a plurality of feature vectors, is formed from a recent voice (or: speech) sample, i.e., the voice (or: speech) sample just spoken by the person to be verified, and is compared to at least one reference feature vector. Given an adequately small difference, i.e., given great similarity between the feature vector and the reference feature vector, the speaker is accepted as the speaker to be verified. The tolerance range for the decision as to when a speaker is to be accepted or rejected as the speaker to be verified is usually determined in the training phase. However, this range is also freely prescribable during the test phase, depending on the security demands to be made of the verification method.
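A threshold decision of this kind might look like the following sketch. The Euclidean distance, the averaging over test vectors, and all names are assumptions chosen for illustration; the text above does not prescribe a particular distance measure.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def accept_speaker(feature_vectors, reference_vectors, threshold):
    """Accept when the test vectors are, on average, close to the references.

    For each test vector the distance to its nearest reference vector is
    taken; the mean of these distances is compared to the threshold
    (the tolerance range, which is freely prescribable).
    """
    nearest = [min(euclidean_distance(f, r) for r in reference_vectors)
               for f in feature_vectors]
    return sum(nearest) / len(nearest) <= threshold


if __name__ == "__main__":
    references = [[1.0, 2.0], [1.2, 1.9]]  # from the training phase (made-up values)
    print(accept_speaker([[1.0, 2.1]], references, threshold=0.5))
    print(accept_speaker([[5.0, 5.0]], references, threshold=0.5))
```

A smaller threshold makes false acceptance less likely at the cost of rejecting the true speaker more often, which is why the range is tied to the required security level.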
The above-described method wherein a decision as to whether the speaker is accepted as the speaker to be verified is made on the basis of a comparison of the at least one feature vector to the reference feature vector is known from the document: S. Furui, Cepstral Analysis Technique for Automatic Speaker Verification, IEEE Transactions ASSP, Vol. ASSP-29, No. 2, pp. 254-272, April 1981, fully incorporated herein by reference.
A considerable disadvantage of the method described by S. Furui is that it exhibits considerable uncertainty in the verification of the speaker. The uncertainty arises because a decision threshold for the acceptance or rejection of the speaker must be defined, and this threshold is defined only on the basis of voice (or: speech) samples of the user to be verified.
A method for the pre-processing of spoken voice (or: speech) signals in the voice processing as well as basics about feature extraction and feature selection, i.e. basics about the formation of feature vectors for the voice signals, is also known, for example from the document: G. Ruske, Automatische Spracherkennung, Methoden der Klassifikation und Merkmalsextraction, Oldenbourg-Verlag, ISBN 3-486-20877-2, pp. 11-22 and pp. 69-105, 1988, fully incorporated herein by reference.
In addition, B. Kammerer and W. Kupper, Experiments for Isolated Word Recognition, Single-and Two-Layer-Perceptrons, Neural Networks, Vol. 3, pp. 693-706, 1990, fully incorporated herein by reference, discloses that a plurality of voice (or: speech) samples can be derived by time distortion from a reference voice (or: speech) sample in order to form a plurality of reference feature vectors in speaker recognition.
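Deriving several samples from one reference sample by time distortion could be sketched as below. The uniform stretching by linear interpolation and the particular stretch factors are assumptions made for illustration; the cited document may use a different form of time distortion.

```python
def time_warp(samples, factor):
    """Stretch (factor > 1) or compress (factor < 1) a signal in time.

    Uses linear interpolation between neighboring samples; uniform
    stretching is an assumed, simple form of time distortion.
    """
    if factor <= 0 or len(samples) < 2:
        raise ValueError("need a positive factor and at least two samples")
    out_len = max(2, int(round(len(samples) * factor)))
    last = len(samples) - 1
    warped = []
    for i in range(out_len):
        pos = i * last / (out_len - 1)  # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        warped.append((1 - frac) * samples[lo] + frac * samples[hi])
    return warped


def derive_variants(reference, factors=(0.9, 1.0, 1.1)):
    """Derive several time-distorted samples from one reference sample."""
    return [time_warp(reference, f) for f in factors]


if __name__ == "__main__":
    reference = [float(n) for n in range(10)]  # stand-in for a voice sample
    for variant in derive_variants(reference):
        print(len(variant))
```

Each derived sample would then be passed through the same feature extraction as the original, yielding several reference feature vectors from a single spoken reference.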
SUMMARY OF THE INVENTION
The present invention concerns a method for speaker verification that enables a more dependable speaker verification than the method described by, e.g., S. Furui.
In an embodiment, the invention provides a method for recognizing a speaker with a computer on the basis of at least one voice signal spoken by a speaker comprising the steps of:
a) forming at least one feature vector for the voice signal,
b) comparing the feature vector to at least one reference feature vector that was formed from at least one voice signal of a speaker to be verified,
c) comparing the feature vector to at least one anti-feature vector that was formed from at least one voice signal of another speaker who is not the speaker to be verified,
d) forming a similarity value from the comparisons, a similarity of the feature vector to the reference feature vector and a similarity of the feature vector with the anti-feature vector being described by said similarity value, and
e) classifying the speaker as the speaker to be verified when the similarity value deviates within a prescribed range from a prescribed value.
In such a method at least one feature vector is formed for a voice (or: speech) signal spoken by a speaker. On the one hand, the feature vector is compared to at least one reference feature vector that was formed from a voice (or: speech) sample spoken previously by the speaker to be verified. Further, the feature vector is compared to at least one anti-feature vector that was formed from a voice (or: speech) sample spoken by a speaker not to be verified. Herein, the term “anti-feature vector” is meant to convey that such a vector is formed from the voice (or: speech) of someone not the speaker to be verified so that such a vector has features recognizable as different from those of the speaker to be verified. A similarity value is determined from the comparisons of the feature vector to the reference feature vector and to the anti-feature vector, and the speaker is classified or not classified as the speaker to be verified dependent on the similarity value.
A considerably more exact, simpler and, thus, faster speaker verification is achieved by considering at least one “anti-example” for the reference feature vector that was spoken by the speaker to be verified. As a result, the spoken voice (or: speech) signal is compared not only to a voice (or: speech) sample previously spoken by the speaker to be verified but also to voice (or: speech) samples that derive from other speakers who are not the speaker to be verified. This yields what is referred to as a two-class classification problem, which leads to enhanced precision and, thus, dependability of the verification result.
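One way to picture a single similarity value that reflects both comparisons is the sketch below. The distance ratio, the prescribed value of 1.0, and the prescribed range of 0.3 are assumptions introduced here for illustration only; the text above does not fix how the similarity value is computed.

```python
import math


def similarity_value(feature, references, anti_features):
    """Combine both comparisons into one value in [0, 1].

    Values near 1 mean the feature vector lies closer to a reference
    feature vector than to any anti-feature vector; values near 0 mean
    the opposite. The ratio used here is an assumed, illustrative choice.
    """
    d_ref = min(math.dist(feature, r) for r in references)
    d_anti = min(math.dist(feature, a) for a in anti_features)
    total = d_ref + d_anti
    if total == 0.0:
        return 0.5  # indistinguishable from both classes
    return d_anti / total


def is_speaker_to_be_verified(similarity, prescribed_value=1.0,
                              prescribed_range=0.3):
    """Accept when the similarity deviates from the prescribed value
    by no more than the prescribed range (both values are assumptions)."""
    return abs(similarity - prescribed_value) <= prescribed_range


if __name__ == "__main__":
    references = [[0.0, 0.0]]      # speaker to be verified (made-up values)
    anti_features = [[3.0, 4.0]]   # another speaker (made-up values)
    for feature in ([0.2, 0.1], [1.5, 2.0], [2.9, 4.1]):
        s = similarity_value(feature, references, anti_features)
        print(round(s, 3), is_speaker_to_be_verified(s))
```

Because the value depends on the distance to the anti-feature vectors as well, a sample that is merely "somewhat close" to the reference but equally close to another speaker is no longer accepted, which is the point of treating verification as a two-class problem.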
In an embodiment, the invention provides a method for recognizing a speaker with a computer on the basis of at least one voice signal spoken by a speaker comprising the steps of:
a) forming at least one feature vector for the voice signal,
b) comparing the feature vector to at least one reference feature vector that was formed from at least one voice signal of a speaker to be verified,
c) comparing the feature vector to at least one anti-feature vector that was formed from at least one voice signal of another speaker who is not the speaker to be verified,
d) reiterating steps a) through c) for a spoken sequence of voice signals,
e) forming an overall similarity value by means of an overall comparison of at least a part of the sequence of voice signals to the corresponding feature vectors and the corresponding anti-feature vectors, and
f) classifying the speaker as the speaker to be verified when the overall similarity value deviates within a prescribed range from a prescribed value.
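The extension to a spoken sequence of voice signals can be sketched as follows. Summing distances over the sequence and forming a single ratio is an assumed way of performing the overall comparison; the function names, the example values, and the one-to-one pairing of each spoken signal with a reference feature vector and an anti-feature vector are likewise hypothetical.

```python
import math


def overall_similarity(features, references, anti_features):
    """Overall similarity in [0, 1] for a sequence of voice signals.

    features[i], references[i] and anti_features[i] belong to the i-th
    voice signal of the sequence. Distances are accumulated over the
    sequence before the ratio is formed (illustrative assumption).
    """
    if not (len(features) == len(references) == len(anti_features)):
        raise ValueError("sequences must have the same length")
    d_ref = sum(math.dist(f, r) for f, r in zip(features, references))
    d_anti = sum(math.dist(f, a) for f, a in zip(features, anti_features))
    total = d_ref + d_anti
    if total == 0.0:
        return 0.5  # no evidence either way
    return d_anti / total


def classify(overall, prescribed_value=1.0, prescribed_range=0.3):
    """Accept when the overall value lies within the prescribed range
    of the prescribed value (both values are assumptions)."""
    return abs(overall - prescribed_value) <= prescribed_range


if __name__ == "__main__":
    spoken = [[1.0], [2.0]]         # made-up one-dimensional features
    references = [[0.0], [2.0]]
    anti_features = [[2.0], [4.0]]
    value = overall_similarity(spoken, references, anti_features)
    print(value, classify(value))
```

Accumulating over several voice signals means that one poorly matching signal does not decide the outcome by itself.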
In such a method the previously describ
