Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-03-31
2002-05-28
Shah, Sanjiv (Department: 2172)
C200S201000
active
06397228
ABSTRACT:
BACKGROUND OF THE INVENTION
This application generally relates to data processing techniques as performed in computer systems, and more specifically to data processing techniques for data integration and updates of databases in computer systems.
Generally, in a computer system, databases and other storehouses of information may need to be updated. At the database record level, such updates may be translated into one of three operations: inserting a new record, deleting an existing record, or updating an existing record. A general problem arises as to techniques for determining which records are subject to which operations. In determining which operations to perform, a determination generally must be made as to which records are considered as matching or equivalent. One technique may consider two entries as “matching” if a record in the update list is an exact character match of a record in the database. For example, an exact match of a name, address and phone number may indicate a matching entry. A problem with this technique is that two records may in fact represent the same information or logical entity, and so should be considered as “matching”, yet contain typographical errors or semantically equivalent variations that cause the match to fail when a character-by-character comparison, as just described, is performed. For example, a middle initial may be omitted from a person's name in one entry and included in another update entry. Although the two entries identify the same person, a character-by-character comparison would fail to identify them as matching records.
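The middle-initial example can be illustrated with a short Python sketch. It is not the patented technique: the field names and the normalization rules in `normalize` (lowercasing, dropping punctuation and single-letter name tokens, keeping only phone digits) are assumptions chosen only to show why an exact comparison fails where a looser one succeeds.

```python
import re

def exact_match(a: dict, b: dict) -> bool:
    """Character-by-character comparison of name, address and phone."""
    return all(a[k] == b[k] for k in ("name", "address", "phone"))

def normalize(record: dict) -> tuple:
    """Hypothetical normalization; these rules are illustrative assumptions."""
    # Lowercase the name, drop punctuation, and discard single-letter tokens
    # such as middle initials.
    tokens = re.sub(r"[^a-z\s]", "", record["name"].lower()).split()
    name = " ".join(t for t in tokens if len(t) > 1)
    # Collapse case and repeated whitespace in the address.
    address = " ".join(record["address"].lower().split())
    # Keep only the digits of the phone number.
    phone = re.sub(r"\D", "", record["phone"])
    return (name, address, phone)

def normalized_match(a: dict, b: dict) -> bool:
    """Compare records after normalization instead of raw characters."""
    return normalize(a) == normalize(b)

# Made-up records: the update entry carries a middle initial.
existing = {"name": "John Smith", "address": "12 Main St", "phone": "555-1234"}
update = {"name": "John Q. Smith", "address": "12 Main St", "phone": "555-1234"}

print(exact_match(existing, update))       # False
print(normalized_match(existing, update))  # True
```

Dropping single-letter tokens is deliberately crude and could merge distinct people who share a first and last name; a real system would weigh that risk against the cost of missed matches.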
Another problem when considering which records are equivalent relates to the fact that update data may come from different sources. For example, if an existing record and the update records have the same source, a common set of unique identifiers may distinguish each record and be used to detect matching entries. However, when the source of the existing database and the update records differ, special matching techniques are required to determine equivalent records between an existing database and update records.
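A minimal Python sketch of this distinction follows. The `id`, `name` and `phone` fields and the content-based fallback key are hypothetical; the source does not say which fields or matching rules are used when the sources differ.

```python
from typing import Optional

def match_key(record: dict, same_source: bool) -> tuple:
    """Build the key used to compare an update record with a database record."""
    if same_source:
        # Same source: the shared unique identifier is enough.
        return ("id", record["id"])
    # Different sources: identifiers are unrelated, so fall back to content.
    # Using name plus phone here is an illustrative assumption.
    return ("content", record["name"].strip().lower(), record["phone"].strip())

def find_match(update: dict, database: list, same_source: bool) -> Optional[dict]:
    """Return the first database record equivalent to the update, if any."""
    wanted = match_key(update, same_source)
    for rec in database:
        if match_key(rec, same_source) == wanted:
            return rec
    return None

# Made-up data: the update comes from another source with its own id scheme.
database = [{"id": "A-1", "name": "Jane Doe", "phone": "555-0100"}]
update = {"id": "B-77", "name": "jane doe ", "phone": "555-0100"}

by_id = find_match(update, database, same_source=True)        # no match
by_content = find_match(update, database, same_source=False)  # matches A-1
```

Matching on identifiers alone would treat this update as a new record, which is the duplicate-creation problem the paragraph describes.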
Thus there is required a technique which efficiently updates an existing database by using various techniques to determine semantic equivalents of various record entries which should be considered as matching. Further, various data processing techniques are needed to “clean up” data to be integrated into an existing database by eliminating duplicates and incorporating semantic equivalents as appropriate.
SUMMARY OF THE INVENTION
In accordance with principles of the invention, a method executed in a computer system performs data integration. For each update record, a determination is made regarding a transaction classification with regard to a working database. Transactions are applied to an unfiltered version of the working database, in which the unfiltered database includes one or more records having unfiltered data. For each transaction that is an update or an insert, data enhancements are performed on the corresponding update record, producing a filtered record. One or more filtered records are integrated into the working database. Post-processing is performed upon portions of the working database.
Thus, there is provided a technique which efficiently updates an existing database by using various techniques to determine semantic equivalents of various record entries which should be considered as matching, and performing data enhancements to the contents of the working database.
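The steps summarized above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the `remove` flag used to classify deletes, the `enhance` rule (whitespace cleanup and title-casing) and the name index built by `post_process` are all hypothetical, since the summary does not specify how classification, enhancement or post-processing are carried out.

```python
def classify(update: dict, working: dict) -> str:
    """Classify an update record against the working database (assumed rules)."""
    if update.get("remove"):
        return "delete"
    return "update" if update["key"] in working else "insert"

def enhance(record: dict) -> dict:
    """Hypothetical data enhancement: tidy whitespace and capitalization."""
    filtered = dict(record)
    filtered["name"] = " ".join(record["name"].split()).title()
    return filtered

def post_process(working: dict) -> dict:
    """Hypothetical post-processing: build a name-to-key index."""
    return {rec["name"]: key for key, rec in working.items()}

def integrate(updates: list, unfiltered: dict, working: dict) -> dict:
    for upd in updates:
        kind = classify(upd, working)
        key = upd["key"]
        if kind == "delete":
            # Deletes need no enhancement; remove from both versions.
            unfiltered.pop(key, None)
            working.pop(key, None)
            continue
        # Apply the raw transaction to the unfiltered version.
        unfiltered[key] = upd
        # Enhance inserts and updates, then integrate the filtered record.
        working[key] = enhance(upd)
    return post_process(working)

# Made-up data for illustration.
unfiltered = {"1": {"key": "1", "name": "john smith"},
              "2": {"key": "2", "name": "ann lee"}}
working = {"1": {"key": "1", "name": "John Smith"},
           "2": {"key": "2", "name": "Ann Lee"}}
updates = [{"key": "1", "name": " john  q smith "},
           {"key": "2", "remove": True},
           {"key": "3", "name": "jane DOE"}]

index = integrate(updates, unfiltered, working)
```

Keeping the unfiltered and working versions separate, as the summary describes, means the raw input stays available even after enhancement rewrites the record.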
REFERENCES:
patent: 4003024 (1977-01-01), Riganati et al.
patent: 4365304 (1982-12-01), Ruhman et al.
patent: 5187747 (1993-02-01), Capello et al.
patent: 5802527 (1998-09-01), Brechtel et al.
patent: 6073140 (2000-06-01), Morgan et al.
Koyfman Lazar
Lamburt Leonid
Shah Sanjiv
Suchyta Leonard Charles
Verizon Laboratories Inc.
Weixel James K.