Speech detection and enhancement using audio/video fusion

Data processing: speech signal processing – linguistics – language – Speech signal processing – Application

Reexamination Certificate

Details

C704S240000

active

10608988

ABSTRACT:
A system and method facilitating speech detection and/or enhancement utilizing audio/video fusion are provided. The present invention fuses audio and video in a probabilistic generative model that implements cross-modal, self-supervised learning, enabling rapid adaptation to audio-visual data. The system can learn to detect and enhance speech in noise given only a short (e.g., 30-second) sequence of audio-visual data. In addition, it automatically learns to track the lips as they move around in the video.

REFERENCES:
patent: 2003/0110038 (2003-06-01), Sharma et al.
patent: 2004/0088272 (2004-05-01), Jojic et al.
J.W. Fisher III, T. Darrell, W.T. Freeman, and P. Viola. Learning Joint Statistical Models for Audio-Visual Fusion and Segregation. In Advances in Neural Information Processing Systems 13, MIT Press, Dec. 2000.
W.H. Sumby and Irwin Pollack. Visual Contribution to Speech Intelligibility in Noise. The Journal of the Acoustical Society of America. vol. 26, No. 2, pp. 212-215, Mar. 1954.
H. Attias, A. Acero, J.C. Platt, and L. Deng. Speech Denoising and Dereverberation using Probabilistic Models. Microsoft Research, 2002. 7 pages.
M.J. Beal, H. Attias, and N. Jojic. Audio-video Sensor Fusion with Probabilistic Graphical Models. Microsoft Research, 2002. 15 pages.
V.R. De Sa and D. Ballard. Category Learning through Multi-Modality Sensing. In Neural Computation, 10(5), 1998. 24 pages.
Brendan Frey and Nebojsa Jojic. Estimating Mixture Models of Images and Inferring Spatial Transformations using the EM Algorithm. In Computer Vision and Pattern Recognition (CVPR), 1999. 7 pages.
J. Hershey and M. Casey, Audio-visual Sound Separation via Hidden Markov Models. In T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pp. 1173-1180, Cambridge, MA, 2002, MIT Press.
J. Hershey and J.R. Movellan. Audio Vision: Using Audio-visual Synchrony to Locate Sounds. In Advances in Neural Information Processing Systems 12, S.A. Solla, T.K. Leen, and K.R. Muller (eds.), pp. 813-819, MIT Press, 2000.
