Data cleansing system and method

Data processing: speech signal processing – linguistics – language – Linguistics – Translation machine

Reexamination Certificate

Details

C704S009000

active

07729899

ABSTRACT:
An automated system and method is provided for debugging training data used to train an automated language identifier. The system and method collects texts written in a particular language, generates an occurrence count for words in each text by counting the number of times each of the words is found within the text, and generates an occurrence ratio (OR) of each of the words by dividing the occurrence count by the total number of words in each text. Words are then filtered from the texts in which their occurrence ratios are substantially higher than their occurrence ratios in at least one of the other texts, to generate a clean text.
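
The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the whitespace and lowercase tokenization, and the two thresholds `factor` and `min_ratio` are all assumptions made for illustration. In particular, the abstract does not define "substantially higher"; here it is taken to mean that a word's OR in a text is at least `min_ratio` and more than `factor` times its OR in at least one other text.

```python
from collections import Counter


def occurrence_ratios(words):
    """Map each word to (occurrence count) / (total number of words in the text)."""
    total = len(words)
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}


def clean_texts(texts, factor=3.0, min_ratio=0.15):
    """Drop words whose OR in a text is far above their OR in some other text.

    Assumption (not from the patent): "substantially higher" means the OR is
    at least `min_ratio` and more than `factor` times the OR in at least one
    of the other texts. Tokenization is plain lowercase whitespace splitting.
    """
    tokenized = [[w.lower() for w in text.split()] for text in texts]
    ratios = [occurrence_ratios(words) for words in tokenized]

    cleaned = []
    for i, words in enumerate(tokenized):
        # Occurrence ratios of every text except the current one.
        others = [r for j, r in enumerate(ratios) if j != i]
        kept = []
        for word in words:
            own = ratios[i][word]
            flagged = own >= min_ratio and any(
                own > factor * other.get(word, 0.0) for other in others
            )
            if not flagged:
                kept.append(word)
        cleaned.append(" ".join(kept))
    return cleaned


if __name__ == "__main__":
    samples = [
        "the cat sat on the mat acme acme acme acme",
        "the dog sat on the log and the bird ran",
    ]
    for line in clean_texts(samples):
        print(line)
```

In this sketch the repeated token "acme" makes up 40% of the first sample and never appears in the second, so it is filtered, while a common word such as "the" survives because its ratios in the two samples are comparable. The `min_ratio` floor is a design choice that keeps ordinary rare words from being removed merely because they are absent from another text.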

REFERENCES:
patent: 5062143 (1991-10-01), Schmitt
patent: 5548507 (1996-08-01), Martino et al.
patent: 6167369 (2000-12-01), Schulze
patent: 6606659 (2003-08-01), Hegli et al.
patent: 6704698 (2004-03-01), Paulsen, Jr. et al.
patent: 6925432 (2005-08-01), Lee et al.
patent: 7191116 (2007-03-01), Alpha
patent: 7386438 (2008-06-01), Franz et al.
patent: 2002/0116291 (2002-08-01), Grasso et al.
patent: 2003/0176996 (2003-09-01), Lecarpentier
patent: 2004/0002994 (2004-01-01), Brill et al.
patent: 2005/0120011 (2005-06-01), Dehlinger et al.
patent: 2006/0047617 (2006-03-01), Bacioiu et al.

Profile ID: LFUS-PAI-O-4249320
