Trigram-based method of language identification

Image analysis – Histogram processing – For setting a threshold

Patent

Details

382/39, 382/40, G06K 9/62, G06K 9/72

active

050621432

ABSTRACT:
A mechanism for examining a body of text and identifying its language compares successive trigrams into which the body of text is parsed with a library of sets of trigrams. For a respective language-specific key set of trigrams, if the ratio of the number of trigrams in the text, for which a match in the key set has been found, to the total number of trigrams in the text is at least equal to a prescribed value, then the text is identified as being possibly written in the language associated with that respective key set. Each respective trigram key set is associated with a respectively different language and contains those trigrams that have been predetermined to occur at a frequency that is at least equal to a prescribed frequency of occurrence of trigrams for that respective language. Successive key sets for other languages are processed as above, and the language for which the percentage of matches is greatest, and for which the percentage exceeded the prescribed value as above, is selected as the language in which the body of text is written.
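
The matching procedure described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the lowercasing and whitespace normalization, the toy key sets, and the 0.3 threshold are all hypothetical, since the abstract does not specify how the key sets are built or what the prescribed value is.

```python
from typing import Dict, List, Optional, Set


def trigrams(text: str) -> List[str]:
    """Parse text into successive overlapping three-character sequences.

    Lowercasing and whitespace collapsing are assumptions of this sketch.
    """
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + 3] for i in range(len(normalized) - 2)]


def match_ratio(text_trigrams: List[str], key_set: Set[str]) -> float:
    """Ratio of trigrams found in the key set to all trigrams in the text."""
    if not text_trigrams:
        return 0.0
    hits = sum(1 for gram in text_trigrams if gram in key_set)
    return hits / len(text_trigrams)


def identify_language(
    text: str,
    key_sets: Dict[str, Set[str]],
    threshold: float,
) -> Optional[str]:
    """Return the language with the greatest match ratio that is at least
    equal to the threshold, or None if no language qualifies."""
    grams = trigrams(text)
    if not grams:
        return None
    best_language: Optional[str] = None
    best_ratio = -1.0
    for language, key_set in key_sets.items():
        ratio = match_ratio(grams, key_set)
        # A language is a candidate only if its ratio reaches the threshold;
        # among the candidates, the greatest ratio wins.
        if ratio >= threshold and ratio > best_ratio:
            best_language, best_ratio = language, ratio
    return best_language


# Hypothetical toy key sets. Real key sets would hold the trigrams whose
# frequency in each language meets a prescribed frequency of occurrence.
KEY_SETS: Dict[str, Set[str]] = {
    "english": {"the", "he ", " th", "ing", "and", " an", "nd "},
    "german": {"der", "die", "sch", "ein", "ich", "che", " de", "en "},
}

if __name__ == "__main__":
    print(identify_language("the thing and the other", KEY_SETS, threshold=0.3))
```

With these toy key sets, 11 of the 21 trigrams in the sample sentence match the English set and none match the German set, so the call prints english. Text whose trigrams reach the threshold for no key set yields None, which corresponds to the case where no language is identified.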

REFERENCES:
patent: 3969698 (1976-07-01), Bollinger et al.
patent: 4754489 (1988-06-01), Bokser et al.
patent: 4829580 (1989-05-01), Church
