Reexamination Certificate
2006-07-04
2006-07-04
Robinson, Greta (Department: 2167)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000
active
07072889
ABSTRACT:
A document retrieval apparatus for retrieving a document including a query character string among a plurality of registered documents includes a text separating unit which separates the registered documents and a query character string into n-grams and words, an n-gram index which stores therein information about occurrences of n-grams appearing in the registered documents on an n-gram-specific basis, a word-boundary-position index which stores therein information about occurrences of word boundaries appearing in the registered documents in a compressed form, a character-string-based search unit which identifies one or more registered documents including the query character string by looking up one or more n-grams of the query character string in the n-gram index, and a word-based search unit which checks whether the query character string appears as a word in the one or more identified registered documents by looking up one or more words of the query character string in the word-boundary-position index, thereby identifying a registered document including the query character string as a word.
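The two-stage lookup described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the bigram length, the whitespace-based word splitting, the gap (delta) encoding used as the "compressed form", and all names (`Index`, `string_search`, `word_search`, and so on) are assumptions made for illustration. As a further simplification, the word check only tests the two ends of each match, and queries shorter than the n-gram length return no hits.

```python
from itertools import accumulate

N = 2  # n-gram length; an assumption, the abstract does not fix a value


def ngrams(text, n=N):
    """Return (offset, n-gram) pairs for every position in text."""
    return [(i, text[i:i + n]) for i in range(len(text) - n + 1)]


def word_boundaries(text):
    """Sorted character positions where a word starts or ends.
    Splitting on single spaces is a stand-in for the text separating unit."""
    bounds, pos = set(), 0
    for word in text.split(" "):
        if word:
            bounds.update((pos, pos + len(word)))
        pos += len(word) + 1
    return sorted(bounds)


def delta_encode(positions):
    """Store gaps between sorted positions (one simple compression scheme)."""
    prev, gaps = 0, []
    for p in positions:
        gaps.append(p - prev)
        prev = p
    return gaps


def delta_decode(gaps):
    """Rebuild absolute positions from gaps."""
    return list(accumulate(gaps))


class Index:
    def __init__(self):
        self.ngram_index = {}     # n-gram -> {doc_id: [positions]}
        self.boundary_index = {}  # doc_id -> gap-encoded boundary positions

    def add(self, doc_id, text):
        for pos, gram in ngrams(text):
            self.ngram_index.setdefault(gram, {}).setdefault(doc_id, []).append(pos)
        self.boundary_index[doc_id] = delta_encode(word_boundaries(text))

    def string_search(self, query):
        """Stage 1: {doc_id: [start positions]} where query occurs as a substring."""
        grams = ngrams(query)
        if not grams:
            return {}
        hits = {}
        for doc_id, starts in self.ngram_index.get(grams[0][1], {}).items():
            # A start matches when every later n-gram occurs at the right offset.
            matched = [
                s for s in starts
                if all(s + off in self.ngram_index.get(g, {}).get(doc_id, ())
                       for off, g in grams[1:])
            ]
            if matched:
                hits[doc_id] = matched
        return hits

    def word_search(self, query):
        """Stage 2: keep documents where a match starts and ends on word boundaries."""
        result = []
        for doc_id, starts in self.string_search(query).items():
            bounds = set(delta_decode(self.boundary_index[doc_id]))
            if any(s in bounds and s + len(query) in bounds for s in starts):
                result.append(doc_id)
        return sorted(result)


idx = Index()
idx.add(1, "the cat sat")
idx.add(2, "concatenate strings")
print(idx.string_search("cat"))  # {1: [4], 2: [3]}
print(idx.word_search("cat"))    # [1]
```

In this sketch the n-gram index alone cannot tell "cat" apart from the "cat" inside "concatenate"; the boundary positions supply that distinction without storing a separate word index.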
REFERENCES:
patent: 5020019 (1991-05-01), Ogawa
patent: 5535382 (1996-07-01), Ogawa
patent: 6006221 (1999-12-01), Liddy et al.
patent: 6246791 (2001-06-01), Kurzweil et al.
patent: 6546383 (2003-04-01), Ogawa
patent: 6546401 (2003-04-01), Iizuka et al.
patent: 6714927 (2004-03-01), Ogawa
patent: 2003/0200211 (2003-10-01), Tada et al.
patent: 7-85033 (1995-03-01), None
patent: 2000-67070 (2000-03-01), None
patent: 2000-231563 (2000-08-01), None
patent: 2000-348059 (2000-12-01), None
Joon Ho Lee and Joong Soo Ahn, “Using n-Grams for Korean Text Retrieval”, 1996, SIGIR Forum (USA), ACM Inc., pp. 216-224.
Joon H. Lee, et al., “Using n-Grams for Korean Text Retrieval”, Korea Research and Development Information Center, pp. 216-224.
Ogawa et al., An Efficient Document Retrieval Method Using N-Gram Indexing, Systems & Computers in Japan, Wiley, Hoboken, New Jersey; vol. 33, No. 2, Feb. 2002, pp. 54-63.
Ogawa et al., Overlapping Statistical Segmentation for Effective Indexing of Japanese Text, Information Processing & Management, Elsevier, Barking, Great Britain, vol. 35, No. 4, Jul. 1999, pp. 463-480.
Teahan et al., A Compression-Based Algorithm for Chinese Word Segmentation, Computational Linguistics, Online, vol. 26, No. 3, pp. 375-393.
Zobel et al., Efficient Retrieval of Partial Documents, Information Processing & Management, Elsevier, Barking, Great Britain, vol. 31, No. 3, May 1995, pp. 361-377.
Dickstein, Shapiro, Morin & Oshinsky, LLP
Dodds, Jr. Harold E.
Ricoh & Company, Ltd.
Robinson Greta
Document retrieval using index of reduced size