Example-driven design of efficient record matching queries

Data processing: database and file management or data structures – Data integrity – Data cleansing – data scrubbing – and deleting duplicates

Reexamination Certificate

Details

C707S797000

active

08046339

ABSTRACT:
Example-driven creation of record matching queries. The disclosed architecture exploits the availability of positive (matching) and negative (non-matching) examples to search the space of candidate queries and suggest an initial record matching query. The record matching task is modeled as designing an operator tree composed from a few primitive operators, which ensures that record matching programs can be executed efficiently and scalably over large input relations. The architecture joins records across multiple (e.g., two) relations (e.g., R and S). It exploits the monotonicity property of similarity functions for record matching in the relations: any pair of matching records has a higher similarity value than non-matching record pairs on at least one similarity function.
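
The monotonicity idea in the abstract can be illustrated with a small sketch. It is not the patented method: the threshold-selection rule, the token-based Jaccard similarity, and every name below (jaccard_tokens, name_sim, city_sim, learn_thresholds, is_match, match_join) are assumptions made for illustration only. The sketch picks, for each similarity function, the lowest threshold that still excludes every negative example, and then treats a pair as a match if it clears the threshold of at least one function. The join over R and S is a naive nested loop, not the efficient operator tree the abstract refers to.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Record = Tuple[str, ...]
Pair = Tuple[Record, Record]
SimFn = Callable[[Record, Record], float]


def jaccard_tokens(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical similarity functions: field 0 is a name, field 1 is a city.
def name_sim(r: Record, s: Record) -> float:
    return jaccard_tokens(r[0], s[0])


def city_sim(r: Record, s: Record) -> float:
    return jaccard_tokens(r[1], s[1])


def learn_thresholds(
    sims: Sequence[SimFn],
    positives: Sequence[Pair],
    negatives: Sequence[Pair],
) -> List[Optional[float]]:
    """Assumed heuristic: per function, the smallest positive-example score
    that is strictly above every negative-example score, or None if no
    positive example scores above all negatives on that function."""
    thresholds: List[Optional[float]] = []
    for sim in sims:
        max_neg = max((sim(r, s) for r, s in negatives), default=float("-inf"))
        above = [v for v in (sim(r, s) for r, s in positives) if v > max_neg]
        thresholds.append(min(above) if above else None)
    return thresholds


def is_match(r: Record, s: Record, sims: Sequence[SimFn],
             thresholds: Sequence[Optional[float]]) -> bool:
    """A pair matches if it clears the threshold of at least one function."""
    return any(t is not None and sim(r, s) >= t
               for sim, t in zip(sims, thresholds))


def match_join(R: Sequence[Record], S: Sequence[Record], sims: Sequence[SimFn],
               thresholds: Sequence[Optional[float]]) -> List[Pair]:
    """Naive nested-loop join of R and S, for illustration only."""
    return [(r, s) for r in R for s in S if is_match(r, s, sims, thresholds)]


# Made-up example data (not taken from the patent).
SIMS = [name_sim, city_sim]
POSITIVES = [
    (("John Smith", "Seattle"), ("John Smith", "Seattle WA")),
    (("Acme Corp", "Redmond"), ("Acme Corporation", "Redmond")),
]
NEGATIVES = [
    (("John Smith", "Seattle"), ("Jane Doe", "Seattle")),
    (("Acme Corp", "Redmond"), ("Bolt Inc", "Portland")),
]
THRESHOLDS = learn_thresholds(SIMS, POSITIVES, NEGATIVES)
```

With this made-up data, the city similarity gets no usable threshold because one negative pair shares the same city, so only the name similarity contributes to the learned query. Calling match_join(R, S, SIMS, THRESHOLDS) on two small relations then returns the pairs whose name similarity clears the learned threshold.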

REFERENCES:
patent: 5960430 (1999-09-01), Haimowitz et al.
patent: 6067552 (2000-05-01), Yu
patent: 6240408 (2001-05-01), Kaufman
patent: 6243713 (2001-06-01), Nelson et al.
patent: 6295533 (2001-09-01), Cohen
patent: 6804667 (2004-10-01), Martin
patent: 6839701 (2005-01-01), Baer et al.
patent: 6968332 (2005-11-01), Milic-Frayling et al.
patent: 6975766 (2005-12-01), Fukushima
patent: 7076485 (2006-07-01), Bloedorn
patent: 7152060 (2006-12-01), Borthwick et al.
patent: 7287019 (2007-10-01), Kapoor et al.
patent: 7730060 (2010-06-01), Chakrabarti et al.
patent: 2002/0147710 (2002-10-01), Hu
patent: 2003/0033288 (2003-02-01), Shanahan et al.
patent: 2004/0139072 (2004-07-01), Broder et al.
patent: 2004/0249789 (2004-12-01), Kapoor et al.
patent: 2005/0027717 (2005-02-01), Koudas et al.
patent: 2005/0216443 (2005-09-01), Morton et al.
patent: 2005/0222977 (2005-10-01), Zhou et al.
patent: 2005/0234881 (2005-10-01), Burago et al.
patent: 2006/0047691 (2006-03-01), Humphreys et al.
patent: 2006/0117003 (2006-06-01), Ortega et al.
patent: 2006/0122978 (2006-06-01), Brill et al.
patent: 2006/0161522 (2006-07-01), Dettinger et al.
patent: 2006/0282414 (2006-12-01), Sugihara et al.
Vance et al, “Rapid Bushy Join-Order Optimization with Cartesian Products”, Jun. 1996, Proceedings of ACM SIGMOD Conference on Management of Data, pp. 35-46.
Answers.com, Dictionary: normalize, Answers.com, http://www.answers.com/topic/normalize.
Absolute Astronomy.com, Hyperrectangle, http://www.absoluteastronomy.com/topics/Hyperrectangle.
Shekhar et al, Encyclopedia of GIS, Springer, 2008 ISBN, http://books.google.com/books?id=6q2IOfLnwkAC&pg=PA1060&Ipg=PA1060&dp=%22skyline%22+and+hyperrectangle&source=bI&outs=0W9n5170n&sig=tAbEKGE-OTd3ImYuqYTx0BXN730&hI=en&ei=aP8nSrnQK4mEtwfX0My1Bg&sa=X&oi=book_result&ct=result&resnum=1#PPA1056,M1.
Silva et al, The Similarity Join Database Operator; Department of Computer Science, Purdue University, Indiana, USA; Microsoft Corporation, Washington, USA; Nov. 19, 2009; p. 12.
Webopedia, What is a query?, http://www.webopedia.com/TERM/q/query.html, p. 1.
Webopedia, What is a query?, http://www.webopedia.com/TERM/Q/query.html, p. 1, Aug. 8, 2002.
Answers.com, Dictionary: normalize, Answers.com, http://www.answers.com/topic/normalize, May 23, 2005.
Absolute Astronomy.com, Hyperrectangle, http://www.absoluteastronomy.com/topics/Hyperrectangle, 2009.
Chaudhuri, et al., “Robust and Efficient Fuzzy Match for Online Data Cleaning”, SIGMOD, Jun. 9-12, 2003, ACM, 2003, pp. 313-324.
Cohen, “Integration of Heterogeneous Databases without Common Domains Using Queries Based on Textual Similarity”, SIGMOD, 1998, ACM, pp. 12.
Chaudhuri, et al., “A Primitive Operator for Similarity Joins in Data Cleaning”, IEEE, 2006, pp. 12.
Agichtein et al.; “Querying Text Databases for Efficient Information Extraction”; Columbia University, ICDE, IEEE, 2003; pp. 1-12.
Cheng et al.; “Entity Search Engine: Towards Agile Best-Effort Information Integration Over the Web”; CIDR; 2007; pp. 1-6.
Pasca; “Acquisition of Categorized Named Entities for Web Search”; Google Inc.; CIKM, ACM; Nov. 8-13, 2004; pp. 137-145.
Popov et al.; “KIM—Semantic Annotation Platform”; vol. 2870; Springer Berlin; 2003; pp. 1-16.

Profile ID: LFUS-PAI-O-4274690
