Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-03-31
2002-04-16
Corrielus, Jean M. (Department: 2172)
C707S793000
active
06374241
ABSTRACT:
BACKGROUND OF THE INVENTION
This application generally relates to data processing techniques as performed in computer systems, and more specifically to data processing techniques for data integration and updates of databases in computer systems.
Generally, in a computer system, databases and other storehouses of information may need to be updated. At the database record level, such updates may be translated into one of three operations: inserting a new record, deleting an existing record, or updating an existing record. A general problem arises as to how to determine which records are subject to which operations. To decide which operations to perform, a determination generally must be made as to which records are considered matching or equivalent. One technique considers two entries as “matching” only if a record in the update list is an exact character match of a record in the database. For example, an exact match of a name, address and phone number may indicate a matching entry. A problem with this technique is that two records may in fact represent the same information or logical entity, and so should be considered “matching”, yet differ because of typographical errors or semantically equivalent variations in the stored information, which cause a character-by-character comparison, as just described, to fail. For example, a middle initial may be omitted from a person's name in one entry and included in another update entry. Although the two entries identify the same person, a character-by-character comparison would fail to identify them as matching records.
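The middle-initial case can be illustrated with a minimal Python sketch. The record layout, the sample values, and the helper `drop_initials` are hypothetical and chosen only for illustration; dropping single-letter name tokens is one assumed normalization, not the technique of the invention.

```python
import re

FIELDS = ("name", "address", "phone")  # hypothetical record layout


def exact_match(a: dict, b: dict) -> bool:
    """Character-by-character comparison of every field."""
    return all(a[f] == b[f] for f in FIELDS)


def drop_initials(name: str) -> str:
    """Lower-case the name and drop single-letter tokens such as a middle initial."""
    tokens = re.findall(r"[A-Za-z]+", name.lower())
    return " ".join(t for t in tokens if len(t) > 1)


def loose_match(a: dict, b: dict) -> bool:
    """Compare names after normalization; other fields must still match exactly."""
    same_name = drop_initials(a["name"]) == drop_initials(b["name"])
    return same_name and a["address"] == b["address"] and a["phone"] == b["phone"]


existing = {"name": "John Q. Smith", "address": "12 Elm St", "phone": "555-0100"}
update = {"name": "John Smith", "address": "12 Elm St", "phone": "555-0100"}

print(exact_match(existing, update))  # False: the middle initial breaks the match
print(loose_match(existing, update))  # True: both names normalize to "john smith"
```

A normalization this simple would also merge distinct people who differ only by a middle initial, which is why additional evidence such as the address and phone number is still compared.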
Another problem in deciding which records are equivalent is that update data may come from different sources. For example, if an existing record and the update records have the same source, a common set of unique identifiers may distinguish each record and be used to detect matching entries. However, when the source of the existing database and the source of the update records differ, no such shared identifiers are available, and special matching techniques are required to determine equivalent records between the existing database and the update records.
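For the same-source case, where both sides share unique identifiers, the three record-level operations can be derived from key comparisons alone. The following Python sketch assumes each data set is a dictionary keyed by that shared identifier, and that the update set is a complete snapshot, so an identifier missing from it implies a deletion. Both assumptions, along with the name `plan_operations` and the sample data, are illustrative only.

```python
def plan_operations(existing: dict, updates: dict):
    """Classify records by shared unique identifier into insert/delete/update lists."""
    inserts = sorted(updates.keys() - existing.keys())   # only in the update set
    deletes = sorted(existing.keys() - updates.keys())   # only in the database
    changed = sorted(
        key for key in existing.keys() & updates.keys()
        if existing[key] != updates[key]                 # in both, content differs
    )
    return inserts, deletes, changed


existing = {1: "Acme Plumbing", 2: "Bob's Diner", 3: "City Cabs"}
updates = {2: "Bob's Diner & Grill", 3: "City Cabs", 4: "Delta Florist"}

inserts, deletes, changed = plan_operations(existing, updates)
print(inserts, deletes, changed)  # [4] [1] [2]
```

When the two sources differ, the keys no longer line up and this key-based classification is unavailable, which is the situation the matching techniques address.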
Thus, a technique is required that efficiently updates an existing database by determining semantic equivalents of record entries that should be considered as matching. Further, data processing techniques are needed to “clean up” data to be integrated into an existing database by eliminating duplicates and incorporating semantic equivalents as appropriate.
SUMMARY OF THE INVENTION
In accordance with principles of the invention, a method executed in a computer system determines whether a data update entry has a matching entry in an existing database. The method determines whether the update entry includes a phone number that is toll-free. If the update entry includes a phone number that is toll-free, the method further includes: determining a subset of one or more existing entries in the existing database with a matching phone number; for each existing entry in the subset, calculating an associated score in accordance with the strength of the name match between the update entry and that existing entry; for each existing entry in the subset, updating the associated score if a zip code match between that existing entry and the update entry is determined; determining if there is at least one associated score greater than a predetermined threshold; and, if there is only one existing entry in the subset with an associated score greater than the predetermined threshold, determining that this existing entry matches the update entry.
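The steps of this method can be sketched in Python as follows. Everything specific in the sketch is an assumption made for illustration: the list of toll-free prefixes (North American codes), the use of `difflib.SequenceMatcher` as the measure of name-match strength, the size of the zip code bonus, the threshold value, and the record field names. The summary does not specify any of these.

```python
from difflib import SequenceMatcher

# Assumed values; the source only speaks of a "predetermined threshold".
TOLL_FREE_PREFIXES = {"800", "888", "877", "866", "855", "844", "833"}
ZIP_BONUS = 0.2
THRESHOLD = 0.8


def digits(phone: str) -> str:
    """Keep the last ten digits so formatting and a leading '1' are ignored."""
    return "".join(ch for ch in phone if ch.isdigit())[-10:]


def is_toll_free(phone: str) -> bool:
    return digits(phone)[:3] in TOLL_FREE_PREFIXES


def name_score(a: str, b: str) -> float:
    """Stand-in for name-match strength: similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_toll_free(update: dict, database: list):
    """Return the single matching existing entry, or None if there is no unique match."""
    if not is_toll_free(update["phone"]):
        return None  # non-toll-free entries are outside this sketch
    # Subset of existing entries with a matching phone number.
    subset = [e for e in database if digits(e["phone"]) == digits(update["phone"])]
    above = []
    for entry in subset:
        score = name_score(update["name"], entry["name"])
        if entry["zip"] == update["zip"]:
            score += ZIP_BONUS  # zip code match raises the score
        if score > THRESHOLD:
            above.append(entry)
    # Only a unique entry above the threshold counts as the match.
    return above[0] if len(above) == 1 else None


database = [
    {"id": 1, "name": "Acme Travel", "phone": "(800) 555-0199", "zip": "02451"},
    {"id": 2, "name": "Zenith Insurance", "phone": "800-555-0199", "zip": "10001"},
    {"id": 3, "name": "Acme Travel", "phone": "617-555-0123", "zip": "02451"},
]
update = {"name": "Acme Travel Inc", "phone": "1-800-555-0199", "zip": "02451"}

match = match_toll_free(update, database)
print(match["id"] if match else None)
```

Requiring exactly one entry above the threshold means that an ambiguous result, where two existing entries both score highly, is treated as no match rather than resolved by picking the higher score.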
Thus, there is provided a technique which efficiently updates an existing database by using various techniques to determine semantic equivalents of various record entries which should be considered as matching.
REFERENCES:
patent: 5274802 (1993-12-01), Altine
patent: 5398335 (1995-03-01), Lewis
patent: 5412566 (1995-05-01), Sawa
patent: 5717924 (1998-02-01), Kawai
patent: 5819092 (1998-10-01), Ferguson et al.
patent: 5819291 (1998-10-01), Haimowitz et al.
patent: 5950198 (1999-09-01), Falls et al.
patent: 5960430 (1999-09-01), Haimowitz et al.
patent: 6182083 (2001-01-01), Scheifler et al.
Koyfman Lazar
Lamburt Leonid
Ponte Jay
Corrielus Jean M.
Suchyta Leonard Charles
Verizon Laboratories Inc.
Weixel James K.