Reexamination Certificate
2006-08-15
2006-08-15
Dorvil, Richemond (Department: 2655)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S270000, C382S181000, C382S186000
active
07092870
ABSTRACT:
A system and method for indexing and searching textual archives using semantic units such as syllables and morphemes. In one aspect, a system for indexing a textual archive comprises an AHR (automatic handwriting recognition) system and/or an OCR (optical character recognition) system for transcribing (decoding) textual input data (handwritten or typed text) into a string of semantic units (e.g., syllables or morphemes) using a statistical language model and a vocabulary based on those semantic units. The string of semantic units that results from the decoding process is stored in a semantic unit database and indexed with pointers to the corresponding textual data in the textual archive. In another aspect, a system for searching a textual archive is provided, wherein a word (or words) to be searched is rendered into a string of semantic units (e.g., syllables or morphemes), depending on the application. A search engine then compares the string of semantic units resulting from the input query against the decoded semantic unit database and identifies textual data stored in the textual archive using the indexes generated during the semantic unit-based indexing process.
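The indexing and search flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the AHR/OCR decoding step has already produced syllable strings, and it replaces the statistical language model with a hypothetical toy lexicon (`TOY_SYLLABLES`). The names `to_units` and `SemanticUnitIndex` are likewise invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption, not from the patent). A real
# system would obtain units from an AHR/OCR decoder that uses a syllable-
# or morpheme-based language model and vocabulary.
TOY_SYLLABLES = {
    "indexing": ["in", "dex", "ing"],
    "index": ["in", "dex"],
    "archive": ["ar", "chive"],
    "textual": ["tex", "tu", "al"],
}


def to_units(text):
    """Render text into a flat list of semantic units (here, syllables)."""
    units = []
    for word in text.lower().split():
        # Unknown words are kept whole as a single unit.
        units.extend(TOY_SYLLABLES.get(word, [word]))
    return units


class SemanticUnitIndex:
    """Stores decoded unit strings plus pointers back to the archive."""

    def __init__(self):
        self.units = {}                    # doc_id -> list of units
        self.postings = defaultdict(list)  # unit -> [(doc_id, position)]

    def add(self, doc_id, decoded_units):
        """Store a decoded document and index every unit occurrence."""
        self.units[doc_id] = list(decoded_units)
        for pos, unit in enumerate(decoded_units):
            self.postings[unit].append((doc_id, pos))

    def search(self, query):
        """Return (doc_id, position) pointers where the query units occur."""
        q = to_units(query)
        if not q:
            return []
        hits = []
        # Use the first query unit to find candidates, then verify the rest.
        for doc_id, pos in self.postings.get(q[0], []):
            if self.units[doc_id][pos:pos + len(q)] == q:
                hits.append((doc_id, pos))
        return hits


index = SemanticUnitIndex()
index.add("doc1", to_units("indexing textual archive"))
index.add("doc2", to_units("archive index"))
```

With this toy data, `index.search("index")` returns `[("doc1", 0), ("doc2", 2)]`: the query syllables "in", "dex" also match inside "indexing", which is the kind of sub-word match that unit-based indexing allows.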
REFERENCES:
patent: 4674066 (1987-06-01), Kucera
patent: 5268840 (1993-12-01), Chang et al.
patent: 5319745 (1994-06-01), Vinsonneau et al.
patent: 5577135 (1996-11-01), Grajski et al.
patent: 5778361 (1998-07-01), Nanjo et al.
patent: 5805747 (1998-09-01), Bradford
patent: 5832478 (1998-11-01), George
patent: 5857099 (1999-01-01), Mitchell et al.
patent: 5933525 (1999-08-01), Makhoul et al.
patent: 5953451 (1999-09-01), Syeda-Mahmood
patent: 5960447 (1999-09-01), Holt et al.
patent: 5963893 (1999-10-01), Halstead et al.
patent: 6374210 (2002-04-01), Chu
patent: 6470334 (2002-10-01), Umemoto
patent: 6879951 (2005-04-01), Kuo
patent: 2003/0200211 (2003-10-01), Tada et al.
Nguyen, Y., Vines, P., Wilkinson, R., “A Comparison of Morpheme and Word Based Document Retrieval for Asian Languages”, Database and Expert Systems Applications, 1996. Proceedings., Seventh International Conference on, pp. 291-296.
Hackett et al., “Comparison of word-based and syllable-based retrieval for Tibetan”, Proceedings of the fifth international workshop on Information retrieval with Asian languages, Nov. 2000, pp. 197-198.
Hahn, et al., “A study on utilizing OCR technology in building text database”, Tenth International Workshop on Database and Expert Systems Applications, Sep. 1-3, 1999, pp. 582-586.
Chen Julian C.
Kanevsky Dimitri
Zadrozny Wlodek W.
Albertalli Brian
DeRosa Frank V.
Dorvil Richemond
F. Chau & Associates LLC
Profile ID: LFUS-PAI-O-3630511