Filter for checking for duplicate entries in database

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C709S200000

active

06804667

ABSTRACT:

The invention concerns a method and apparatus for checking whether a new record to be added to a database is a duplicate of an existing record.
BACKGROUND OF THE INVENTION
FIG. 1 illustrates a simple table which exists in a hypothetical database of a bank. The table lists four types of information, arranged in columns: (1) the cities in which the bank branches are located, (2) the total assets, or deposits, of each branch, (3) the customers who maintain accounts at each branch, and (4) the balance of each account.
During operation of the bank, entries within a row will change. (A row is also sometimes called a “record.”) For example, if fifty dollars is deposited to the WILSON account, the ACCOUNT BALANCE will be changed to $150.
An entire row may change, as when it is deleted. For example, the row “ANTIOCH, 1000, WILSON, 100” may be deleted when the Wilson account closes. Conversely, a row may be added when a new customer opens an account.
Some types of databases do not allow a new row to be added if the new row contains information which is identical to that contained in an existing row. For example, if a new customer named UNSER wishes to open an account at the ANTIOCH branch by depositing 75 dollars, a duplicate row would be created. However, such a situation is illegal, as indicated in FIG. 2.
The duplicate row can create several problems. For example, if the first UNSER wishes to close the account, the question arises: which row should be deleted? As another example, an uninformed observer may view the duplicate row as a mistake, and presume it to be a duplicate of the first UNSER's data, when, in reality, it represents the account of a second UNSER.
Several approaches are available to prevent this duplication. In one approach, when a new row is to be added, all rows of the database are examined, and compared with the new row. If the examination finds that the new row matches no existing row, the new row is added.
However, this approach is time-consuming. For example, assume that a fresh database is created, and contains a single row. When a second row is added later, a single comparison is required, between the second and first row. Addition of the third row requires two comparisons. In general, the number of comparisons is proportional to the number of existing rows, as indicated in FIG. 3.
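The row-by-row examination described above can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function name `add_row_if_new`, the representation of a row as a tuple, and the UNSER row values are assumptions made for the example.

```python
def add_row_if_new(table, new_row):
    """Append new_row unless an identical row already exists.

    Returns (added, comparisons), where comparisons is the number of
    existing rows that were examined.
    """
    comparisons = 0
    for existing in table:
        comparisons += 1
        if existing == new_row:
            # An identical row was found, so the new row is rejected.
            return False, comparisons
    # No existing row matched, so the new row is added.
    table.append(new_row)
    return True, comparisons


# Hypothetical table holding the single row quoted in the text.
table = [("ANTIOCH", 1000, "WILSON", 100)]
first = add_row_if_new(table, ("ANTIOCH", 1000, "UNSER", 75))   # (True, 1)
second = add_row_if_new(table, ("ANTIOCH", 1000, "UNSER", 75))  # (False, 2)
```

The comparison count for a successful addition equals the number of rows already present, which is the linear relationship the text attributes to FIG. 3.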
However, the total number of comparisons performed since creation of the database is a square-law function of the number of rows, as indicated in FIG. 4. Viewed graphically, the total number of comparisons, past and present, equals the area of the hatched triangle. The area of the triangle equals (½) × (no. of rows)². If one million rows are present today, then a total of approximately 5 × 10¹¹ comparisons have been made so far, up to and including the addition of a new row today.
These comparisons are time-consuming.
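The square-law figure can be checked with a short calculation. The sketch below assumes, as in the example given earlier, that adding the k-th row requires k − 1 comparisons; the function names are hypothetical.

```python
def total_comparisons(num_rows):
    """Comparisons performed while building a table of num_rows rows."""
    # Adding the k-th row means examining the k - 1 rows already present.
    return sum(k - 1 for k in range(1, num_rows + 1))


def triangle_area(num_rows):
    """The approximation used in the text: (1/2) x (no. of rows) squared."""
    return 0.5 * num_rows ** 2


n = 1_000_000
exact = total_comparisons(n)   # 499_999_500_000
approx = triangle_area(n)      # 500_000_000_000.0, i.e. 5 x 10**11
```

The exact sum, n(n − 1)/2, differs from the triangle area by only n/2, so for one million rows the two agree to about one part in a million.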
OBJECTS OF THE INVENTION
An object of the invention is to provide an improved database management system.
A further object of the invention is to provide an improved system for preventing duplication of rows in a database.
SUMMARY OF THE INVENTION
A Bloom Filter is generated, based on the database. When a new row is to be added, the Bloom Filter is consulted to determine whether the new row duplicates an existing row. If duplication is not found, the new row is added.
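A minimal sketch of how a Bloom Filter might be consulted before adding a row is given here. It is not taken from the patent: the class and function names, the filter size, the number of hash functions, and the use of SHA-256 to derive bit positions are all assumptions for illustration. Because a Bloom Filter can report a false positive (but never a false negative), the sketch also assumes that a "possibly present" answer is confirmed by examining the table itself.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom Filter; sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, row):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        data = repr(row).encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, row):
        for pos in self._positions(row):
            self.bits[pos] = 1

    def might_contain(self, row):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(row))


def add_row(table, bloom, new_row):
    """Add new_row unless it duplicates an existing row; return True if added."""
    # Only a "possibly present" answer triggers the full examination (assumed fallback).
    if bloom.might_contain(new_row) and new_row in table:
        return False
    table.append(new_row)
    bloom.add(new_row)
    return True
```

In this sketch, a row that the filter reports as definitely absent is added after examining only `num_hashes` bit positions, regardless of how many rows the table holds.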

