Document retrieval using index of reduced size

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

07072889

ABSTRACT:
A document retrieval apparatus for retrieving a document including a query character string from among a plurality of registered documents includes a text separating unit which separates the registered documents and the query character string into n-grams and words, an n-gram index which stores information about occurrences of n-grams appearing in the registered documents on an n-gram-specific basis, a word-boundary-position index which stores, in a compressed form, information about occurrences of word boundaries appearing in the registered documents, a character-string-based search unit which identifies one or more registered documents including the query character string by looking up one or more n-grams of the query character string in the n-gram index, and a word-based search unit which checks whether the query character string appears as a word in the one or more identified registered documents by looking up one or more words of the query character string in the word-boundary-position index, thereby identifying a registered document including the query character string as a word.
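
The two-stage lookup described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the bigram length, the regex-based word segmentation, the delta (gap) encoding used as a stand-in for the "compressed form", and all names (Retriever, register, string_search, word_search) are assumptions made for illustration only.

```python
import re
from collections import defaultdict
from itertools import accumulate

N = 2  # n-gram length; an assumption, the abstract does not fix n


def ngrams(text, n=N):
    """Return (offset, n-gram) pairs for every overlapping n-gram in text."""
    return [(i, text[i:i + n]) for i in range(len(text) - n + 1)]


def word_boundaries(text):
    """Return sorted character offsets at which a word starts or ends.

    Words are runs of word characters here (an assumption); a real system
    would use a language-specific segmenter.
    """
    bounds = set()
    for m in re.finditer(r"\w+", text):
        bounds.update((m.start(), m.end()))
    return sorted(bounds)


def delta_encode(positions):
    """Store gaps between consecutive offsets (illustrative compression)."""
    return [p - q for p, q in zip(positions, [0] + positions[:-1])]


def delta_decode(gaps):
    """Recover absolute offsets from the stored gaps."""
    return list(accumulate(gaps))


class Retriever:
    def __init__(self):
        # n-gram -> doc_id -> list of offsets where the n-gram occurs
        self.ngram_index = defaultdict(lambda: defaultdict(list))
        # doc_id -> delta-encoded word-boundary offsets
        self.boundary_index = {}

    def register(self, doc_id, text):
        for pos, gram in ngrams(text):
            self.ngram_index[gram][doc_id].append(pos)
        self.boundary_index[doc_id] = delta_encode(word_boundaries(text))

    def string_search(self, query):
        """Stage 1: return {doc_id: [start offsets]} of substring matches."""
        if len(query) < N:
            raise ValueError("query is shorter than the n-gram length")
        grams = ngrams(query)
        hits = {}
        for doc_id, starts in self.ngram_index.get(grams[0][1], {}).items():
            # A start offset matches if every later n-gram of the query
            # occurs in the same document at the corresponding offset.
            matched = [
                s for s in starts
                if all(
                    s + off in self.ngram_index.get(g, {}).get(doc_id, ())
                    for off, g in grams[1:]
                )
            ]
            if matched:
                hits[doc_id] = matched
        return hits

    def word_search(self, query):
        """Stage 2: keep documents where a match starts and ends on word boundaries."""
        result = []
        for doc_id, starts in self.string_search(query).items():
            bounds = set(delta_decode(self.boundary_index[doc_id]))
            if any(s in bounds and s + len(query) in bounds for s in starts):
                result.append(doc_id)
        return sorted(result)


r = Retriever()
r.register("d1", "the cat sat")
r.register("d2", "concatenate strings")
print(r.string_search("cat"))  # substring matches in both documents
print(r.word_search("cat"))    # only the document where "cat" is a whole word
```

In this sketch the boundary index holds only small gap values per document rather than a full word-level posting list, which is one plausible reading of how a boundary-position index could keep the overall index small. Checking only the start and end offsets of a match is a simplification of the word-based lookup the abstract describes.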

REFERENCES:
patent: 5020019 (1991-05-01), Ogawa
patent: 5535382 (1996-07-01), Ogawa
patent: 6006221 (1999-12-01), Liddy et al.
patent: 6246791 (2001-06-01), Kurzweil et al.
patent: 6546383 (2003-04-01), Ogawa
patent: 6546401 (2003-04-01), Iizuka et al.
patent: 6714927 (2004-03-01), Ogawa
patent: 2003/0200211 (2003-10-01), Tada et al.
patent: 7-85033 (1995-03-01), None
patent: 2000-67070 (2000-03-01), None
patent: 2000-231563 (2000-08-01), None
patent: 2000-348059 (2000-12-01), None
Joon Ho Lee and Joong Soo Ahn, “Using n-Grams for Korean Text Retrieval”, 1996, SIGIR Forum (USA), ACM Inc., pp. 216-224.
Joon H. Lee, et al., “Using n-Grams for Korean Text Retrieval”, Korea Research and Development Information Center, pp. 216-224.
Ogawa et al., An Efficient Document Retrieval Method Using N-Gram Indexing, Systems & Computers in Japan, Wiley, Hoboken, New Jersey; vol. 33, No. 2, Feb. 2002, pp. 54-63.
Ogawa et al., Overlapping Statistical Segmentation for Effective Indexing of Japanese Text, Information Processing & Management, Elsevier, Barking, Great Britain, vol. 35, No. 4, Jul. 1999, pp. 463-480.
Teahan et al., A Compression-Based Algorithm for Chinese Word Segmentation, Computational Linguistics, Online, vol. 26, No. 3, pp. 375-393.
Zobel et al., Efficient Retrieval of Partial Documents, Information Processing & Management, Elsevier, Barking, Great Britain, vol. 31, No. 3, May 1995, pp. 361-377.
