Reexamination Certificate
2006-07-27
2011-11-29
Saint Cyr, Leonard (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S007000, C704S008000, C704S010000
active
08069032
ABSTRACT:
Biasing of language model customization due to repetitious data is substantially reduced by introducing novelty screening into the data harvesting process. Novelty-detection-based filtering is added to ensure that an adaptation system gives more weight to representative adaptation data that is not repetitious. The value of the adaptation data is preserved, and the process is prevented from being polluted when the same data is seen multiple times, as with the original posting in an email thread, various versions of the same document, and the like. The screening technique may be built on top of existing data harvesting mechanisms, since already seen data is used to determine the novelty of a particular portion of the data. A window into the new data, of fixed or variable size, is compared against the already collected data to determine the likelihood that the data is novel.
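The windowed comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the class name NoveltyScreen, the use of word n-grams as the window, the exact-match lookup in a set, and the min_novelty threshold are all assumptions made for illustration. The abstract only states that a window into new data is compared against already collected data to estimate the likelihood of novelty.

```python
from typing import Iterator, Set, Tuple

Window = Tuple[str, ...]


class NoveltyScreen:
    """Hypothetical novelty filter placed in front of a data harvester."""

    def __init__(self, window_size: int = 4, min_novelty: float = 0.5) -> None:
        # Both parameters are illustrative assumptions, not values from the patent.
        self.window_size = window_size
        self.min_novelty = min_novelty
        self.seen: Set[Window] = set()  # windows from already collected data

    def _windows(self, text: str) -> Iterator[Window]:
        """Slide a fixed-size window of words over the text."""
        tokens = text.lower().split()
        if not tokens:
            return
        # Texts shorter than the window yield a single, shorter window.
        n = min(self.window_size, len(tokens))
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])

    def novelty(self, text: str) -> float:
        """Fraction of windows in the text that were not collected before."""
        windows = list(self._windows(text))
        if not windows:
            return 0.0
        unseen = sum(1 for w in windows if w not in self.seen)
        return unseen / len(windows)

    def harvest(self, text: str) -> bool:
        """Accept the text only if it is novel enough, then remember its windows."""
        accepted = self.novelty(text) >= self.min_novelty
        if accepted:
            self.seen.update(self._windows(text))
        return accepted
```

With a screen like this, the first posting in an email thread would be accepted, while a reply that quotes the posting in full and adds only a few new words would score a low novelty and be skipped, so the quoted text is not counted twice during adaptation.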
REFERENCES:
patent: 5899973 (1999-05-01), Bandara et al.
patent: 6167398 (2000-12-01), Wyard et al.
patent: 6278968 (2001-08-01), Franz et al.
patent: 6345249 (2002-02-01), Ortega et al.
patent: 6418431 (2002-07-01), Mahajan et al.
patent: 6442519 (2002-08-01), Kanevsky et al.
patent: 6484136 (2002-11-01), Kanevsky et al.
patent: 6778995 (2004-08-01), Gallivan
patent: 6928404 (2005-08-01), Gopalakrishnan et al.
patent: 6947933 (2005-09-01), Smolsky
patent: 6983247 (2006-01-01), Ringger et al.
patent: 6990628 (2006-01-01), Palmer et al.
patent: 2001/0051868 (2001-12-01), Witschel
patent: 2002/0188446 (2002-12-01), Gao et al.
patent: 2003/0088410 (2003-05-01), Geidl et al.
patent: 2003/0144837 (2003-07-01), Basson et al.
patent: 2005/0165598 (2005-07-01), Cote et al.
patent: 2006/0100876 (2006-05-01), Nishizaki et al.
patent: 2007/0150278 (2007-06-01), Bates et al.
Fabio Brugnara, “Techniques for approximating a trigram language model”, paper appeared in “Spoken language”, 1996, http://ieeexplore.ieee.org/xpl/abs_free.jsp?arNumber=607210, (4 pgs).
IEEE Xplore, “Techniques for approximating a trigram language model”—this is documentation showing the date of the document cited. http://ieeexplore.ieee.org/xpl/absprintf.jsp?arnumber=607210&page=FREE, (1 pg).
Ronald Rosenfeld, “A Maximum Entropy Approach to Adaptive Statistical Language Modeling”, paper appeared in “Spoken language”, May 21, 1996, http://www.cs.cmu.edu/afs/cs.cmu.edu/user/roni/WWW/papers/me-csl-revised.pdf, (37 pgs).
Marcello Federico, “Bayesian estimation methods for n-gram language model adaptation”, paper appeared in “Spoken language”, 1996, http://ieeexplore.ieee.org/xpl/abs_free.jsp?arNumber=607087, (4 pgs).
IEEE Xplore, “Bayesian estimation methods for n-gram language model adaptation”—this is documentation showing the date of the document cited. http://www.ieee.org/xpl/absprintf.jsp?arnumber=607087&page=FREE, (1 pg).
Brugnara, Fabio; Federico, Marcello; “Techniques for Approximating A Trigram Language Model,” Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy, pp. 2075-2078.
Rosenfeld, Ronald, “A Maximum Entropy Approach to Adaptive Statistical Language Modeling,” Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, May 21, 1996, pp. 1-37.
Federico, Marcello, “Bayesian Estimation Methods for N-Gram Language Model Adaptation,” Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy, pp. 240-243.
Mukerjee, Kunal
Odell, Julian J.
Saint Cyr, Leonard
Merchant & Gould
Microsoft Corporation
Lightweight windowing method for screening harvested data...