System and method for managing a textual archive using...

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C704S270000, C382S181000, C382S186000

active

07092870

ABSTRACT:
A system and method for indexing and searching textual archives using semantic units such as syllables and morphemes. In one aspect, a system for indexing a textual archive comprises an automatic handwriting recognition (AHR) system and/or an optical character recognition (OCR) system for transcribing (decoding) textual input data (handwritten or typed text) into a string of semantic units (e.g., syllables or morphemes) using a statistical language model and vocabulary based on those semantic units. The string of semantic units that results from the decoding process is stored in a semantic unit database and indexed with pointers to the corresponding textual data in the textual archive. In another aspect, a system for searching a textual archive is provided, wherein a word (or words) to be searched is rendered into a string of semantic units (syllables or morphemes, depending on the application). A search engine then compares the string of semantic units resulting from the input query against the decoded semantic unit database and identifies textual data stored in the textual archive using the indexes that were generated during the semantic unit-based indexing process.
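The indexing and search flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `SYLLABLES` lexicon, the function names `to_units`, `build_index`, and `search`, and the sample archive are all hypothetical, and a simple dictionary lookup stands in for the AHR/OCR decoder and its statistical language model.

```python
from collections import defaultdict

# Hypothetical toy lexicon (word -> syllables). In the described system the
# units would come from an AHR/OCR decoder, not from a lookup table.
SYLLABLES = {
    "archive": ["ar", "chive"],
    "textual": ["tex", "tu", "al"],
    "index": ["in", "dex"],
    "search": ["search"],
    "handwriting": ["hand", "writ", "ing"],
}

def to_units(text):
    """Render text into a flat list of semantic units (syllables here)."""
    units = []
    for word in text.lower().split():
        # Words missing from the toy lexicon are kept whole as one unit.
        units.extend(SYLLABLES.get(word, [word]))
    return units

def build_index(archive):
    """Map each unit to the set of (doc_id, position) pointers where it occurs."""
    index = defaultdict(set)
    for doc_id, text in archive.items():
        for pos, unit in enumerate(to_units(text)):
            index[unit].add((doc_id, pos))
    return index

def search(index, query):
    """Return sorted ids of documents containing the query's units contiguously."""
    units = to_units(query)
    if not units:
        return []
    hits = set()
    # Start from every occurrence of the first unit and check that the
    # remaining units follow at consecutive positions in the same document.
    for doc_id, pos in index.get(units[0], set()):
        if all((doc_id, pos + i) in index.get(u, set())
               for i, u in enumerate(units)):
            hits.add(doc_id)
    return sorted(hits)

# Assumed sample archive, for illustration only.
archive = {
    "doc1": "textual archive index",
    "doc2": "handwriting search",
}
index = build_index(archive)
print(search(index, "archive index"))  # prints ['doc1']
```

Storing positions alongside document ids is one possible way to realize the "pointers to the corresponding textual data"; it lets a multi-unit query be matched as a contiguous run rather than as an unordered bag of units.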

REFERENCES:
patent: 4674066 (1987-06-01), Kucera
patent: 5268840 (1993-12-01), Chang et al.
patent: 5319745 (1994-06-01), Vinsonneau et al.
patent: 5577135 (1996-11-01), Grajski et al.
patent: 5778361 (1998-07-01), Nanjo et al.
patent: 5805747 (1998-09-01), Bradford
patent: 5832478 (1998-11-01), George
patent: 5857099 (1999-01-01), Mitchell et al.
patent: 5933525 (1999-08-01), Makhoul et al.
patent: 5953451 (1999-09-01), Syeda-Mahmood
patent: 5960447 (1999-09-01), Holt et al.
patent: 5963893 (1999-10-01), Halstead et al.
patent: 6374210 (2002-04-01), Chu
patent: 6470334 (2002-10-01), Umemoto
patent: 6879951 (2005-04-01), Kuo
patent: 2003/0200211 (2003-10-01), Tada et al.
Nguyen, Y., Vines, P., Wilkinson, R., “A Comparison of Morpheme and Word Based Document Retrieval for Asian Languages”, Proceedings of the Seventh International Conference on Database and Expert Systems Applications, 1996, pp. 291-296.
Hackett et al., “Comparison of word-based and syllable-based retrieval for Tibetan”, Proceedings of the fifth international workshop on Information retrieval with Asian languages, Nov. 2000, pp. 197-198.
Hahn, et al., “A study on utilizing OCR technology in building text database”, Tenth International Workshop on Database and Expert Systems Applications, Sep. 1-3, 1999, pp. 582-586.
