Low latency real-time speech transcription

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S256400, C704S231000, C704S236000, C704S251000, C704S257000, C704S240000

active

07930181

ABSTRACT:
Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.
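The abstract's last point, causal front-end normalization, can be illustrated with a minimal sketch. The abstract does not say which normalization is used, so this example assumes mean normalization of feature frames, where each frame is normalized using only the frames received so far. The function name `causal_mean_normalize` and the list-of-lists frame format are hypothetical and are not taken from the patent.

```python
def causal_mean_normalize(frames):
    """Yield each frame minus the running mean of the frames seen so far.

    Illustrative assumption: `frames` is an iterable of equal-length lists
    of floats (one feature vector per frame). Only the current and earlier
    frames are used, so output can be produced as soon as a frame arrives.
    """
    running_sum = None
    for count, frame in enumerate(frames, start=1):
        if running_sum is None:
            running_sum = [0.0] * len(frame)
        # Update the per-dimension sum with the current frame only.
        running_sum = [s + x for s, x in zip(running_sum, frame)]
        # Subtract the mean over frames 1..count (no look-ahead).
        yield [x - s / count for x, s in zip(frame, running_sum)]


if __name__ == "__main__":
    stream = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for normalized in causal_mean_normalize(stream):
        print(normalized)
```

Because the function is a generator, a decoder could consume normalized frames one at a time instead of waiting for the end of an utterance, which is the latency property the abstract describes. A non-causal variant would subtract the mean of the whole utterance and therefore could not emit anything until all frames were available.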

REFERENCES:
patent: 5995561 (1999-11-01), Yamasaki et al.
patent: 6490553 (2002-12-01), Van Thong et al.
patent: 6507815 (2003-01-01), Yamamoto
patent: 6539351 (2003-03-01), Chen et al.
patent: 6609093 (2003-08-01), Gopinath et al.
patent: 6785645 (2004-08-01), Khalil et al.
patent: 6944588 (2005-09-01), Kempe
patent: 6952667 (2005-10-01), Kempe
patent: 6957184 (2005-10-01), Schmid et al.
patent: 6959273 (2005-10-01), Kempe
patent: 6961693 (2005-11-01), Kempe
patent: 7007001 (2006-02-01), Oliver et al.
patent: 7454341 (2008-11-01), Pan et al.
patent: 2003/0009335 (2003-01-01), Schalkwyk et al.
patent: 2003/0061458 (2003-03-01), Wilcox et al.
patent: 2003/0139926 (2003-07-01), Jia et al.
N. Kumar et al., Heteroscedastic Discriminant Analysis and Reduced Rank HMMs for Improved Speech Recognition, Speech Communication, vol. 26, pp. 283-297, 1998.
G. Saon et al., Maximum Likelihood Discriminant Feature Spaces, in Proc. ICASSP, pp. 1129-1132, 2000.
G. Cook et al., Real-time Recognition of Broadcast Radio Speech, in Proc. ICASSP, pp. 141-144, 1996.
J. Glass et al., Realtime Telephone-based Speech Recognition in the Jupiter Domain, in Proc. ICASSP, pp. 61-64, 1999.
M.J.F. Gales, Maximum Likelihood Linear Transformations for HMM-based Speech Recognition, Tech. Rep. CUED/F-INFENG/TR291, Cambridge University, 1997.
M. Siegler et al., Automatic Segmentation and Clustering of Broadcast News Audio, in Proc. DARPA Speech Recognition Workshop, pp. 97-99, 1997. Code available at http://www.nist.gov/speech/tols/CMUseg—05targz.htm.
Mehryar Mohri et al., Integrated Context-Dependent Networks in Very Large Vocabulary Speech Recognition, in Proc. Eurospeech, 1999.
Mehryar Mohri et al., Weighted Finite-State Transducers in Speech Recognition, in Proc. ISCA ITRW ASR 2000, 2000.
Mehryar Mohri et al., A Weight Pushing Algorithm for Large Vocabulary Speech Recognition, in Proc. Eurospeech, 2001.
R. A. Gopinath, Maximum Likelihood Modeling with Gaussian Distributions for Classifications, in Proc. ICASSP, pp. 661-664, 1998.


Profile ID: LFUS-PAI-O-2719705
