Method, system and computer program product for duplicate...

Data processing: database and file management or data structures – Data integrity – Data cleansing, data scrubbing, and deleting duplicates

Reexamination Certificate

Details

active

08055633

ABSTRACT:
A method of duplicate detection for data items in a stream of data items, the method comprising the steps of: receiving a data item from the stream of data items; applying at least two different hashing algorithms to the data item to generate hash keys that identify elements in a first Bloom filter data structure having a plurality of elements; checking a state of each of the identified elements to determine if the data item is a potential duplicate, the determination depending on whether the identified elements are indicated as having been also identified for a previous data item received from the stream; and in response to the determination that the data item is a potential duplicate, checking an index of hash keys to determine if at least one of the generated hash keys exists in the index to identify the data item as an actual duplicate.
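
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name DuplicateDetector, the choice of SHA-1 and SHA-256 as the two hashing algorithms, the modulo mapping from hash key to filter element, the filter size, and the use of an in-memory set as the index of hash keys are all assumptions made for illustration.

```python
import hashlib


class DuplicateDetector:
    """Illustrative two-stage duplicate check: Bloom filter, then key index."""

    def __init__(self, num_elements=1024):
        # Hypothetical sizing; a real filter would be sized for the stream.
        self.num_elements = num_elements
        # One byte per element for readability (a bit array would be smaller).
        self.elements = bytearray(num_elements)
        # Assumed index structure: a set of full-width hash keys.
        self.index = set()

    def _hash_keys(self, item):
        # Two different hashing algorithms applied to the same data item.
        k1 = int.from_bytes(hashlib.sha1(item).digest(), "big")
        k2 = int.from_bytes(hashlib.sha256(item).digest(), "big")
        return (k1, k2)

    def is_duplicate(self, item):
        keys = self._hash_keys(item)
        # Assumption: each key identifies a filter element by modulo reduction.
        positions = [k % self.num_elements for k in keys]

        # Potential duplicate only if every identified element was already
        # set by some previously received item.
        potential = all(self.elements[p] for p in positions)

        # The index is consulted only for potential duplicates; a hit on at
        # least one generated key marks the item as an actual duplicate.
        actual = potential and any(k in self.index for k in keys)

        # Record this item so later items can be compared against it.
        for p in positions:
            self.elements[p] = 1
        self.index.update(keys)
        return actual


detector = DuplicateDetector()
results = [detector.is_duplicate(x) for x in (b"alpha", b"beta", b"alpha")]
print(results)
```

In this sketch the Bloom filter acts as a cheap pre-check: items that fail it are known to be new, so the index lookup is skipped for them, and only potential duplicates pay for the index check.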

REFERENCES:
patent: 6804667 (2004-10-01), Martin
patent: 6988124 (2006-01-01), Douceur et al.
patent: 2003/0037022 (2003-02-01), Adya et al.
patent: 2008/0154852 (2008-06-01), Beyer et al.
Bloom, Space/Time Trade-offs in Hash Coding with Allowable Errors, Jul. 1970, ACM, vol. 13 No. 7, pp. 422-427.
Deng, Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters, Jun. 29, 2006, SIGMOD 2006, pp. 25-36.

Profile ID: LFUS-PAI-O-4262807
