Reexamination Certificate
2007-06-29
2011-11-01
Jackson, Jakieda (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S231000, C704S230000
active
08050919
ABSTRACT:
A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.
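The flow described in the abstract (per-classifier codebooks, per-classifier scores, a combined score, and a recognition criterion) might be sketched roughly as follows. This is only an illustration, not the patented method: it substitutes the common vector-quantization distortion score (average distance from each feature vector to its nearest code vector), combines classifiers with a weighted sum, and uses a simple threshold as the criterion. The classifier names, weights, threshold, and all function names are hypothetical.

```python
import math

def distortion(features, codebook):
    """Average distance from each feature vector to its nearest code vector."""
    total = 0.0
    for vec in features:
        total += min(math.dist(vec, code) for code in codebook)
    return total / len(features)

def combined_score(sample, codebooks, weights):
    """Weighted sum of per-classifier distortions (assumed combination rule).

    sample and codebooks map a classifier name to that classifier's
    feature vectors and code vectors, respectively.
    """
    return sum(w * distortion(sample[name], codebooks[name])
               for name, w in weights.items())

def authenticate(sample, codebooks, weights, threshold):
    """Accept the claimed speaker if the combined score is small enough."""
    return combined_score(sample, codebooks, weights) <= threshold

def identify(sample, store, weights):
    """Return the trainer whose codebooks give the lowest combined score."""
    return min(store, key=lambda who: combined_score(sample, store[who], weights))

# Hypothetical codebook store: two trainers, two classifiers each.
store = {
    "alice": {"spectral": [(0.0, 0.0)], "pitch": [(5.0,)]},
    "bob":   {"spectral": [(3.0, 4.0)], "pitch": [(7.0,)]},
}
weights = {"spectral": 0.5, "pitch": 0.5}

# A sample whose features coincide with bob's code vectors.
sample = {"spectral": [(3.0, 4.0)], "pitch": [(7.0,)]}
```

With this toy data, the sample scores 0.0 against bob's codebooks and 3.5 against alice's, so `identify(sample, store, weights)` picks "bob", and `authenticate(sample, store["alice"], weights, 1.0)` rejects the claim to be alice.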
REFERENCES:
patent: 5412738 (1995-05-01), Brunelli et al.
patent: 5666466 (1997-09-01), Lin et al.
patent: 5794190 (1998-08-01), Linggard et al.
patent: 6094632 (2000-07-01), Hattori
patent: 6182037 (2001-01-01), Maes
patent: 6330536 (2001-12-01), Parthasarathy et al.
patent: 6477500 (2002-11-01), Maes
patent: 6519561 (2003-02-01), Farrell et al.
patent: 6529871 (2003-03-01), Kanevsky et al.
patent: 6760701 (2004-07-01), Sharma et al.
patent: 2003/0182118 (2003-09-01), Obrador et al.
patent: 2005/0096906 (2005-05-01), Barzilay
patent: 2005/0187916 (2005-08-01), Levin et al.
patent: 2006/0111904 (2006-05-01), Wasserblat et al.
“The 2002 SuperSID Workshop on Speaker Recognition”, The Center for Language and Speech Processing, Oct. 24, 2003. <http://www.clsp.jhu.edu/ws2002/groups/supersid/>.
Aggarwal, et al. “A System Identification Approach for Video-based Face Recognition”, Proceedings of ICPR, 2004. pp. 23-26.
Brunelli, et al. “Person Identification Using Multiple Cues”, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 17, No. 10, Oct. 1995. pp. 955-966.
Buck, et al. “Text-Dependent Speaker Recognition Using Vector Quantization”, Apr. 1985. pp. 391-394.
Buik, et al. “Face Recognition from Multi-Pose Image Sequence”, Proceedings of 2nd International Symposium on Image and Signal Processing, 2001. pp. 319-324.
Campbell. “Speaker Recognition: A Tutorial”, Proceedings of the IEEE, vol. 85, No. 9, Sep. 1997 (26 pages).
Chibelushi, et al. “A Review of Speech Based Bimodal Recognition”, IEEE Transactions on Multimedia, vol. 4, No. 1, Mar. 2002. pp. 23-37.
Das, et al. “Audio-Visual Biometric Recognition by Vector Quantization”, IEEE International Workshop on Spoken Language Technology, Aruba, Dec. 2006, to appear.
Das, et al. “Face Recognition from Images with High Pose Variations by Transform Vector Quantization”, Computer Vision, Graphics and Image Processing. vol. 4338-2006, Lecture Notes in Computer Science, Springer Berlin Heidelberg. Copyright 2006, pp. 674-685.
Das, et al. “Text-dependent speaker recognition: A survey and state of the art”, 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 14-19, 2006, Toulouse, France.
Das. “Audio Visual Person Authentication by Multiple Nearest Neighbor Classifiers”, Advances in Biometrics, Lecture Notes in Computer Science, vol. 4642, 2007. Copyright 2007, pp. 1114-1123.
Das. “Speaker Recognition by Text Dependent Vector Quantization with Password Conditioning”, Microsoft Research—India. Bangalore, India. 2007.
Gersho, Allen, and Gray, Robert M. Vector Quantization and Signal Compression. Springer, 1992. 760 pgs.
Gong, et al. “Tracking and Recognition of Face Sequences”, European Workshop on Combined Real and Synthetic Image Processing for Broadcast and Video Production, 1994.
Hafed, et al. “Face Recognition Using the Discrete Cosine Transform”, International Journal of Computer Vision, vol. 43, Issue 3, Jul./Aug. 2001. pp. 167-188.
Howell, et al. “Towards Unconstrained Face Recognition from Image Sequences”, Proceedings of the International Conference on Automatic Face Recognition, 1996. pp. 224-229.
Kanak, et al. “Joint Audio Video Processing for Biometric Speaker Identification”, Proceedings MMUA-06, May 2006.
Kinnunen, et al. “Real-Time Speaker Identification and Verification”, IEEE Transactions on Audio, Speech and Language Processing, vol. 14, No. 1, Jan. 2006.
Kinnunen, et al. “Speaker Discriminative Weighting Method for VQ-Based Speaker Identification”, Proceedings AVBPA 2001, pp. 150-156, Halmstad, Sweden, 2001.
Kittler, et al. “Combining Evidence in Multimodal Personal Identity Recognition Systems”, Proceedings of the International Conference on Audio- and Video-Based Biometric Person Authentication, Crans Montana, Switzerland, 1997.
Krueger, et al. “Exemplar-based Face Recognition from Video”, Proceedings of ECCV, 2002. pp. 732-746.
Lee, et al. “Video-based face recognition using probabilistic appearance manifolds”, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '03), vol. 1, pp. 313-320.
Li, et al. “Automatic Verbal Information Verification for User Authentication”, IEEE Transactions on Speech and Audio Processing, vol. 8, No. 5, Sep. 2000.
Li, et al. “Video-Based Online Face Recognition Using Identity Surfaces”, Technical Report, Queen Mary, University of London. 2001.
Marcel, et al. “Bi-Modal Face & Speech Authentication: A BioLogin Demonstration System”, Proceedings MMUA-06, May 2006.
Matsui, et al. “A Text-Independent Speaker Recognition Method Robust Against Utterance Variations”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP-91), 1991. pp. 377-380. ISBN: 0-7803-0003-3.
Moghaddam, et al. “Face Recognition using View-Based and Modular Eigenspaces”, Automatic Systems for the Identification and Inspection of Humans, SPIE vol. 2277, Jul. 1994.
Phillips, et al. “Face Recognition Vendor Test 2002: Evaluation Report”, Technical Report NISTIR 6965, Mar. 2003. <http://www.frvt.org>.
Ramasubramanian, et al. “Text-Dependent Speaker-Recognition Using One-Pass Dynamic Programming”, Proceedings ICASSP'06, Toulouse, France, May 2006.
Rao, K.R., Yip, Patrick, Britanak, Vladimir. Discrete Cosine Transform—Algorithms, Advantages, Applications. Academic Press Professional, Inc., 1990. San Diego, CA, USA.
Rosenberg, et al. “Evaluation of a Vector Quantization Talker Recognition System in Text Independent and Text Dependent Modes”, ICASSP 86, Tokyo. 1986 IEEE pp. 873-876.
Soong. “A Vector Quantization Approach to Speaker Recognition”, Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '85. Apr. 1985. vol. 10, pp. 387-390.
Sviridenko. “Speaker Verification and Identification Systems from SPIRIT Corp.”, 2000 (8 pages).
Turk, et al. “Eigenfaces for Recognition”, Journal of Cognitive Neuroscience, 3(1). 1991 pp. 71-86.
Yang, et al. “Detecting Faces in Images: A Survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 24, No. 1, Jan. 2002. pp. 34-58.
Zhao, et al. “Face Recognition: A Literature Survey”, ACM Computing Surveys, pp. 399-458, 2003. (Also appeared as UMD Technical Report, CS-TR4167, <ftp://ftp.cfar.umd.edu/TRS/FaceSurvey.ps.gz> 2000. Revised 2002, CS-TR4167R.)
Zhou, et al. “Probabilistic recognition of human faces from video”, Computer Vision and Image Understanding, vol. 91, 2003. pp. 214-215.
Jackson Jakieda
Microsoft Corporation
Perkins Coie LLP
Speaker recognition via voice sample based on multiple...