Reexamination Certificate
2008-04-15
Hudspeth, David (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Multilingual or national language support
C704S009000
active
07359851
ABSTRACT:
A method and system for identifying the language of a textual passage are disclosed. The method and system include parsing the textual passage into n-grams, assigning an initial weight to each n-gram, and adjusting the weight initially assigned to a word or n-gram parsed from the textual passage. The initially assigned weight is adjusted in proportion to the inverse of the number of languages in which the word or n-gram appears. Reducing the weight assigned to words or n-grams that appear in several languages diminishes, without completely eliminating, their importance relative to other words or n-grams parsed from the same textual passage when the language of the passage is determined. The method and system of the present invention thus appropriately weight the short words or n-grams common to multiple languages without affecting the short words or n-grams that are not common to several languages.
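The weighting idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the character trigram length, the initial weight of 1.0, the set-based language profiles, and the function names (`char_ngrams`, `build_profiles`, `adjusted_weight`, `identify_language`) are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (n=3 is an assumption)."""
    cleaned = " ".join(text.lower().split())
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


def build_profiles(samples, n=3):
    """Map each language to the set of n-grams seen in its sample text.

    A real system would use large training corpora; this is a toy profile.
    """
    return {lang: set(char_ngrams(text, n)) for lang, text in samples.items()}


def adjusted_weight(initial_weight, num_languages):
    """Scale the initial weight by the inverse of the number of languages
    in which the n-gram appears, so shared n-grams count less but never zero."""
    return initial_weight / num_languages


def identify_language(passage, profiles, n=3, initial_weight=1.0):
    """Score each language by the adjusted weights of the passage's n-grams."""
    scores = {lang: 0.0 for lang in profiles}
    for gram, count in Counter(char_ngrams(passage, n)).items():
        # Languages whose profile contains this n-gram.
        langs = [lang for lang, grams in profiles.items() if gram in grams]
        if not langs:
            continue  # n-gram unknown to every profile: contributes nothing
        weight = adjusted_weight(initial_weight, len(langs)) * count
        for lang in langs:
            scores[lang] += weight
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy samples, for illustration only.
samples = {
    "english": "the quick brown fox jumps over the lazy dog and the cat",
    "french": "le renard brun rapide saute par dessus le chien paresseux",
    "german": "der schnelle braune fuchs springt über den faulen hund",
}
profiles = build_profiles(samples)
best, scores = identify_language("the lazy dog", profiles)
```

In this sketch an n-gram found in only one language keeps its full initial weight, while one found in k languages contributes 1/k of that weight to each of them, which mirrors the "diminished but not eliminated" behaviour the abstract describes.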
REFERENCES:
patent: 5062143 (1991-10-01), Schmitt
patent: 5418951 (1995-05-01), Damashek
patent: 5548507 (1996-08-01), Martino et al.
patent: 6009382 (1999-12-01), Martino et al.
patent: 6023670 (2000-02-01), Martino et al.
patent: 6076051 (2000-06-01), Messerly et al.
patent: 6167369 (2000-12-01), Schulze
patent: 6216102 (2001-04-01), Martino et al.
patent: 6272456 (2001-08-01), de Campos
Cavnar et al., “N-Gram Based Text Categorization,” Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, 1994.
Grefenstette, Gregory, “Comparing Two Language Identification Schemes,” 3rd International Conference on Statistical Analysis of Textual Data (JADT), Rome, Italy; Dec. 1995, vol. 2, pp. 263-268.
Evans David A.
Grefenstette Gregory T.
Tong Xiang
Clairvoyance Corporation
Harper Blaney
Hudspeth David
Jones Day
Neway Samuel G