Selection of a set of optimal n-grams for indexing string...

Data processing: database and file management or data structures – Database and file access – Preparing data for information retrieval

Reexamination Certificate

Details

C707S758000

active

08001128

ABSTRACT:
The present invention provides a computer-readable medium and system for selecting a set of n-grams for indexing string data in a DBMS. Aspects of the invention include providing a set of candidate n-grams, each n-gram comprising a sequence of characters; identifying sample queries having character strings containing the candidate n-grams; and, based on the set of candidate n-grams, the sample queries, database records, and an n-gram space constraint, automatically selecting a minimal set of n-grams from the set of candidate n-grams that, within the space constraint, minimizes the number of false hits that would have occurred had the sample queries been executed against the database records.
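The selection step described in the abstract can be illustrated with a small sketch. The abstract does not state which algorithm is used, so the greedy strategy below is an assumption, as is modelling the space constraint as a cap on the total number of posting-list entries. The names `false_hits`, `posting_cost`, and `select_ngrams` are hypothetical and chosen only for illustration. A "false hit" here is a record that passes the n-gram filter for a query but does not actually contain the query string.

```python
from typing import List, Set


def false_hits(selected: Set[str], queries: List[str], records: List[str]) -> int:
    """Count (query, record) pairs that pass the n-gram filter but do not match."""
    total = 0
    for q in queries:
        # Only indexed n-grams that occur in the query can be used to filter.
        grams = [g for g in selected if g in q]
        for r in records:
            # With no usable n-gram, every record passes (a full scan).
            passes = all(g in r for g in grams)
            if passes and q not in r:
                total += 1
    return total


def posting_cost(gram: str, records: List[str]) -> int:
    """Assumed space cost: one posting entry per record containing the n-gram."""
    return sum(1 for r in records if gram in r)


def select_ngrams(candidates: Set[str], queries: List[str],
                  records: List[str], budget: int) -> Set[str]:
    """Greedy sketch: repeatedly add the n-gram that removes the most false hits."""
    selected: Set[str] = set()
    used = 0
    current = false_hits(selected, queries, records)
    while True:
        best, best_gain, best_cost = None, 0, 0
        for g in sorted(candidates - selected):  # sorted for deterministic ties
            cost = posting_cost(g, records)
            if used + cost > budget:
                continue  # would exceed the assumed space constraint
            gain = current - false_hits(selected | {g}, queries, records)
            # Prefer the larger gain; on equal gain, prefer the cheaper n-gram.
            if gain > best_gain or (gain == best_gain and gain > 0 and cost < best_cost):
                best, best_gain, best_cost = g, gain, cost
        if best is None:
            break  # no affordable n-gram reduces false hits any further
        selected.add(best)
        used += best_cost
        current -= best_gain
    return selected


# Toy data, invented for this example.
records = ["database", "datatype", "metadata", "basement"]
queries = ["datab", "base"]
candidates = {"da", "ta", "ab", "ba", "se"}
chosen = select_ngrams(candidates, queries, records, budget=3)
print(sorted(chosen), false_hits(chosen, queries, records))
```

On this toy data the frequent n-grams "da" and "ta" are never chosen: each would consume the whole budget while removing only one false hit, whereas the rarer "ab" and "ba" together fit the budget and eliminate all of them. A greedy pass like this is not guaranteed to find the optimal set in general.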

REFERENCES:
patent: 5418951 (1995-05-01), Damashek
patent: 5640487 (1997-06-01), Lau et al.
patent: 5706365 (1998-01-01), Rangarajan et al.
patent: 5752051 (1998-05-01), Cohen
patent: 5991714 (1999-11-01), Shaner
patent: 6005495 (1999-12-01), Connolly et al.
patent: 6204848 (2001-03-01), Nowlan et al.
patent: 6473754 (2002-10-01), Matsubayashi et al.
patent: 6618697 (2003-09-01), Kantrowitz et al.
patent: 6654734 (2003-11-01), Mani et al.
patent: 6775666 (2004-08-01), Stumpf et al.
patent: 7010522 (2006-03-01), Jagadish et al.
patent: 7149735 (2006-12-01), Chaudhuri et al.
patent: 7478081 (2009-01-01), Hacigumus et al.
patent: 2002/0099536 (2002-07-01), Bordner et al.
patent: 2002/0165873 (2002-11-01), Kwok et al.
patent: 2004/0042667 (2004-03-01), Lee et al.
patent: 2004/0044952 (2004-03-01), Jiang et al.
patent: 2004/0260543 (2004-12-01), Horowitz et al.
patent: 2005/0209844 (2005-09-01), Wu et al.
patent: 2005/0210383 (2005-09-01), Cucerzan et al.
patent: 2005/0226512 (2005-10-01), Napper
patent: 2006/0101000 (2006-05-01), Hacigumus et al.
patent: 2008/0281857 (2008-11-01), Dymetman
“Useful English Language Statistics,” http://www-math.cudenver.edu/~wcherowi/courses/m5410/engstat.html, 3 pages.
“Computer Science Bibliography,” http://dblp.uni-trier.de/, 3 pages.
Chen, Zhiyuan et al., “Query Optimization in Compressed Database Systems,” ACM SIGMOD May 21-24, 2001, Santa Barbara, California, pp. 271-280.
Cho, Junghoo et al., “A Fast Regular Expression Indexing Engine,” 2002, pp. 1-12.
Gravano, Luis et al., “Approximate String Joins in a Database (Almost) for Free,” Proceedings of the 27th VLDB Conference, Roma, Italy, 2001, 10 pages.
Gravano, Luis et al., “Using q-grams in a DBMS for Approximate String Processing,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2001, pp. 1-7.
Hochbaum, Dorit S., “Analysis of the Greedy Approach in Problems of Maximum k-Coverage,” Department of Industrial Engineering and Operations . . . Mar. 17, 1997, pp. 1-14.
Navarro, Gonzalo, “A Guided Tour to Approximate String Matching,” ACM Computing Surveys, vol. 33, No. 1, Mar. 2001, pp. 31-88.
