Lightweight windowing method for screening harvested data...

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C704S007000, C704S008000, C704S010000

active

08069032

ABSTRACT:
Biasing of language model customization caused by repetitious data is substantially reduced by introducing novelty screening into the data harvesting process. Novelty-detection-based filtering is added to ensure that an adaptation system gives more weight to representative adaptation data that is not repetitious. The value of the adaptation data is preserved, and the process is prevented from being polluted when the same data is seen multiple times, such as the original posting in an email thread, various versions of the same document, and the like. The screening technique may be built on top of existing data harvesting mechanisms, because already seen data is used to determine the novelty of a particular portion of the data. A window into the new data, of fixed or variable size, is compared against the already collected data to determine the likelihood that the data is novel.
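The windowed comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how windows are formed, how they are compared, or how likelihood is turned into a decision. The sketch assumes fixed-size windows of lowercased, whitespace-split words, exact-match lookup in a set of previously seen windows, and a simple fraction-of-unseen-windows score with a cutoff. The names `NoveltyScreen`, `windows`, `novelty`, `offer`, `size`, and `threshold` are all hypothetical.

```python
from typing import List, Set, Tuple

Window = Tuple[str, ...]


def windows(text: str, size: int) -> List[Window]:
    """Split text into overlapping fixed-size word windows (assumed scheme)."""
    tokens = text.lower().split()
    if not tokens:
        return []
    if len(tokens) < size:
        # Text shorter than one window: treat the whole text as a single window.
        return [tuple(tokens)]
    return [tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


class NoveltyScreen:
    """Hypothetical screen that compares new data against already collected data."""

    def __init__(self, size: int = 5, threshold: float = 0.5) -> None:
        self.size = size            # window length in words (assumption)
        self.threshold = threshold  # minimum novelty score to accept (assumption)
        self.seen: Set[Window] = set()

    def novelty(self, text: str) -> float:
        """Return the fraction of windows in text not found in collected data."""
        wins = windows(text, self.size)
        if not wins:
            return 0.0
        unseen = sum(1 for w in wins if w not in self.seen)
        return unseen / len(wins)

    def offer(self, text: str) -> bool:
        """Accept text if it looks novel; accepted text joins the collected data."""
        is_novel = self.novelty(text) >= self.threshold
        if is_novel:
            self.seen.update(windows(text, self.size))
        return is_novel


if __name__ == "__main__":
    screen = NoveltyScreen(size=3, threshold=0.5)
    original = "please review the attached budget before friday"
    print(screen.offer(original))                    # first sighting
    print(screen.offer("sounds good " + original))   # reply quoting the original
```

In this sketch a reply that mostly quotes an earlier message scores low because most of its windows are already in the collected set, so it would not be handed to the adaptation step a second time. A variable-size window, which the abstract also mentions, would need a different `windows` function.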

REFERENCES:
patent: 5899973 (1999-05-01), Bandara et al.
patent: 6167398 (2000-12-01), Wyard et al.
patent: 6278968 (2001-08-01), Franz et al.
patent: 6345249 (2002-02-01), Ortega et al.
patent: 6418431 (2002-07-01), Mahajan et al.
patent: 6442519 (2002-08-01), Kanevsky et al.
patent: 6484136 (2002-11-01), Kanevsky et al.
patent: 6778995 (2004-08-01), Gallivan
patent: 6928404 (2005-08-01), Gopalakrishnan et al.
patent: 6947933 (2005-09-01), Smolsky
patent: 6983247 (2006-01-01), Ringger et al.
patent: 6990628 (2006-01-01), Palmer et al.
patent: 2001/0051868 (2001-12-01), Witschel
patent: 2002/0188446 (2002-12-01), Gao et al.
patent: 2003/0088410 (2003-05-01), Geidl et al.
patent: 2003/0144837 (2003-07-01), Basson et al.
patent: 2005/0165598 (2005-07-01), Cote et al.
patent: 2006/0100876 (2006-05-01), Nishizaki et al.
patent: 2007/0150278 (2007-06-01), Bates et al.
Fabio Brugnara, “Techniques for approximating a trigram language model”, paper appeared in “Spoken language”, 1996, http://ieeexplore.ieee.org/xpl/abs_free.jsp?arNumber=607210, (4 pgs).
IEEE Xplore, “Techniques for approximating a trigram language model”—this is documentation showing the date of the document cited. http://ieeexplore.ieee.org/xpl/absprintf.jsp?arnumber=607210&page=FREE, (1 pg).
Ronald Rosenfeld, “A Maximum Entropy Approach to Adaptive Statistical Language Modeling”, paper appeared in “Spoken language”, May 21, 1996, http://www.cs.cmu.edu/afs/cs.cmu.edu/user/roni/WWW/papers/me-csl-revised.pdf, (37 pgs).
Marcello Federico, “Bayesian estimation methods for n-gram language model adaptation”, paper appeared in “Spoken language”, 1996, http://ieeexplore.ieee.org/xpl/abs_free.jsp?arNumber=607087, (4 pgs).
IEEE Xplore, “Bayesian estimation methods for n-gram language model adaptation”—this is documentation showing the date of the document cited. http://www.ieee.org/xpl/absprintf.jsp?arnumber=607087&page=FREE, (1 pg).
Brugnara, Fabio; Federico, Marcello; “Techniques for Approximating A Trigram Language Model,” Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy, pp. 2075-2078.
Rosenfeld, Ronald, “A Maximum Entropy Approach to Adaptive Statistical Language Modeling,” Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, May 21, 1996, pp. 1-37.
Federico, Marcello, “Bayesian Estimation Methods for N-Gram Language Model Adaptation,” Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy, pp. 240-243.
