Method of merging large databases in parallel

Patent

Details

395/607, 395/795, 395/761, G06F 7/06, G06F 7/20, G06F 7/14

Patent

active

057179158

ABSTRACT:
The semantic integration problem of merging multiple very large databases, known as the merge/purge problem, can be solved by multiple runs of the sorted neighborhood method or the clustering method with small windows, followed by computation of the transitive closure over the results of each run. The sorted neighborhood method works well under this scheme but is computationally expensive because of its sorting phase. An alternative method based on data clustering reduces the complexity to linear time, making multiple runs followed by transitive closure feasible and efficient.

A method is provided for identifying duplicate records in a database, each record having at least one field and a plurality of keys. The method includes the steps of: sorting the records according to a criterion applied to a first key; comparing a number of consecutive sorted records to each other, wherein the number is less than the number of records in said database, and identifying a first group of duplicate records; storing the identity of the first group; sorting the records according to a criterion applied to a second key; comparing a number of consecutive sorted records to each other, wherein the number is less than the number of records in said database, and identifying a second group of duplicate records; storing the identity of the second group; and subjecting the union of the first and second groups to transitive closure.
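The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record fields (`last`, `first`, `zip`), the matching rule (two of three fields equal), the window size, and all function names are assumptions chosen for illustration, and only the sorted neighborhood variant is shown, not the clustering variant.

```python
def sorted_neighborhood_pass(records, key, is_match, window):
    """One pass: sort by `key`, then compare each record only with the
    `window - 1` records that precede it in sorted order."""
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[max(0, pos - window + 1):pos]:
            if is_match(records[i], records[j]):
                pairs.add((min(i, j), max(i, j)))
    return pairs


def transitive_closure(n, pairs):
    """Group record indices connected by matched pairs (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Report only groups that contain at least two records.
    return sorted(g for g in groups.values() if len(g) > 1)


def merge_purge(records, keys, is_match, window):
    """Run one pass per key, union the matched pairs, then close them."""
    pairs = set()
    for key in keys:
        pairs |= sorted_neighborhood_pass(records, key, is_match, window)
    return transitive_closure(len(records), pairs)


# Hypothetical data: records 0, 1 and 2 describe the same person.
records = [
    {"last": "Smith",  "first": "John", "zip": "10027"},
    {"last": "Smith",  "first": "John", "zip": "90210"},
    {"last": "Zmith",  "first": "John", "zip": "90210"},
    {"last": "Jones",  "first": "Mary", "zip": "10027"},
    {"last": "Adams",  "first": "Ann",  "zip": "55555"},
    {"last": "Taylor", "first": "Bob",  "zip": "30301"},
]


def is_match(a, b):
    # Assumed rule: duplicates agree on at least two of the three fields.
    return sum(a[f] == b[f] for f in ("last", "first", "zip")) >= 2


keys = [lambda r: r["last"], lambda r: r["zip"]]
groups = merge_purge(records, keys, is_match, window=2)
print(groups)
```

With a window of 2, the pass sorted on last name matches only records 0 and 1, and the pass sorted on ZIP code matches only records 1 and 2. Neither pass places records 0 and 2 next to each other, so the transitive closure over the union of both results is what brings all three into one group.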

REFERENCES:
patent: 4209845 (1980-06-01), Berger et al.
patent: 4930072 (1990-05-01), Agrawal et al.
patent: 5111395 (1992-05-01), Smith et al.
patent: 5142687 (1992-08-01), Lary
patent: 5146590 (1992-09-01), Lorie et al.
patent: 5193207 (1993-03-01), Vander Vegt et al.
patent: 5303149 (1994-04-01), Janigian
patent: 5307485 (1994-04-01), Bordonaro et al.
patent: 5319739 (1994-06-01), Yoshiura et al.
patent: 5349684 (1994-09-01), Edem et al.
patent: 5355481 (1994-10-01), Sluijter
patent: 5497486 (1996-03-01), Stolfo et al.
patent: 5537604 (1996-07-01), Baum et al.
patent: 5537622 (1996-07-01), Baum et al.
patent: 5542087 (1996-07-01), Neimat et al.
patent: 5548769 (1996-08-01), Baum et al.
