System and method of synchronizing replicated data

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06668260

ABSTRACT:

FIELD
The present invention has emerged from the field of synchronizing databases.
BACKGROUND
The present invention relates generally to computing systems in which there are kept a number of replicated databases, and in particular to a method for comparing the databases quickly and efficiently.
In a clustered computing environment, as well as other environments, it is required to provide each node with information concerning the cluster (e.g., the location of processor units, peripheral units, etc.), its use, its users, and the like. Often kept in a database of one sort or another, the amount of this information can be quite large. This leads to problems when the databases of each node need to be checked, such as when a periodic check needs to be made to ensure the integrity of the database and the information it contains, or to ensure that changes to the database were made correctly. Such checks, however, can be very time consuming, and tend to impose a significant burden on system resources, particularly if such checks are frequently required. If the checks require communication between two nodes across a communication path, the amount of communication can be significant and create a bottleneck.
Under certain conditions, it is desirable to store copies of a particular body of data, such as a relational database table, at multiple sites in a distributed computing network. If users are allowed to update the body of data at one site, the updates must be propagated to the copies at the other sites in order for the copies to remain consistent. The process of propagating the changes is generally referred to as replication. Various mechanisms have been developed for performing replication.
The table at which a change is initially made to a set of replicated data is referred to herein as the master table. A table to which the change must be propagated is referred to herein as a satellite table. Replication does not require an entire transaction executed at a master table to be re-executed at each of the satellite tables. Only the net changes made by the transaction need to be propagated. Other types of operations, such as read and sort operations, that may have been executed in the original transaction do not have to be re-executed at the satellite tables.
There are two basic approaches to replication: synchronous replication and asynchronous replication. In synchronous replication, each update or modification of a body of data is immediately replicated to all other replicas or copies of the body of data within the distributed network, typically by techniques such as a two-phase commit. The transaction that modifies the body of data is not allowed to complete until all other replicas have been similarly updated. Although synchronous replication provides a straightforward methodology for maintaining data consistency in a network, this method is susceptible to network latencies and intermittent network failures. Furthermore, synchronous replication cannot prioritize updates; accordingly, low priority updates can unnecessarily produce significant system delays.
On the other hand, in asynchronous replication, local replicas of a particular data structure are allowed to be slightly different for a time until an asynchronous update is performed. During asynchronous replication, a master table can be modified without forcing a network access as in the synchronous replication methodology. At some later point in time, the modification is propagated to the satellite tables. Various techniques for asynchronous propagation have been developed, for example, remote procedure calls (RPCs) and deferred transaction queues.
In asynchronous replication, conflicts in updating a body of data might occur if two sites concurrently modify the same data item before the data modification can be propagated to other sites. If update conflicts are not first detected and then handled in some convergent manner, the replicated copies will begin to diverge and data integrity will be lost.
Database systems often locally replicate remote tables that are frequently queried by local users. By having local copies of heavily accessed data on several nodes, the database does not need to send information across the network every time a transaction on any of the several nodes requires access to the data. Thus, the use of local copies of data improves the performance of the requesting node and reduces the amount of inter-node traffic.
The copies of data stored at replicated sites may diverge from the data at the original or “base” site for any number of reasons. For example, software problems or conflict resolution issues may cause a database to replicate data incorrectly. To determine whether discrepancies exist between different copies of the same data, it would be beneficial to have a mechanism for comparing the replicated data to the corresponding data in the base site. Once the discrepancies are identified, they can be rectified.
The prior art has solved many of the problems apparent from the above discussion. However, a problem remains: replication of database information is performance intensive and time consuming, drawing computing and communication resources away from other computing and networking tasks.
In view of the foregoing, it would be highly desirable to make available a method of replicating data that is both quick and relatively undemanding of computing, and especially networking, resources.
SUMMARY
Disclosed is a replication method. The method includes generating an identifier column for a master table, the master table including a key column and an identifier column. Also included is copying the master table to a satellite table, so that the satellite table is a replica of the master table. Further included is associating an insert trigger with the master table. Another inclusion relates to assigning a first identifier value to the identifier column of an inserted row, the assigning caused by the insert trigger and occurring responsively to inserting a row into the master table. Additionally included is allowing inserts to be made to the master table. Included is synchronizing the satellite table to the master table. Synchronizing includes comparing the master table key and identifier columns with the satellite table key and identifier columns. Synchronizing also includes producing a row set of rows based on the initial comparing of synchronizing, the rows being those rows present in the master table but not in the satellite table. Another synchronizing inclusion is comparing the master table key and identifier columns with the satellite table key and identifier columns. Additionally, synchronizing includes deleting the rows that are present in the satellite table but not in the master table, as determined based on the second synchronizing comparing. Synchronizing further includes inserting the row set of rows into the satellite table.
Disclosed is the replication method, further including associating an update trigger with the master table, assigning a second identifier value to the identifier column of the updated row, the assigning caused by the update trigger and occurring responsively to updating a row of the master table, and allowing updates to be made to the master table.
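An update trigger of this kind might look as follows. This is a sketch under the same assumptions as before (SQLite through Python's sqlite3 module, hypothetical table, column, and trigger names). For brevity the first identifier value comes from a column default rather than an insert trigger, and the second identifier value is assumed to be the old value plus one.

```python
import sqlite3


def demo_update_trigger():
    conn = sqlite3.connect(":memory:")
    # Simplification: DEFAULT 1 stands in for the insert trigger here.
    conn.execute(
        "CREATE TABLE master "
        "(k INTEGER PRIMARY KEY, payload TEXT, ident INTEGER DEFAULT 1)")
    # Update trigger: whenever the payload of a row changes, assign a new
    # identifier value to that row (assumed here: previous value + 1).
    conn.execute("""
        CREATE TRIGGER master_update AFTER UPDATE OF payload ON master
        BEGIN
            UPDATE master SET ident = OLD.ident + 1 WHERE k = NEW.k;
        END
    """)
    conn.execute("INSERT INTO master (k, payload) VALUES (1, 'a')")
    before = conn.execute("SELECT ident FROM master WHERE k = 1").fetchone()[0]
    conn.execute("UPDATE master SET payload = 'a2' WHERE k = 1")
    after = conn.execute("SELECT ident FROM master WHERE k = 1").fetchone()[0]
    return before, after
```

Because the identifier changes on every update, a later comparison of key and identifier columns would treat the updated master row as absent from the satellite table, and the stale satellite row as absent from the master table.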
Further disclosed is the possibility of the identifier column including a checksum column or a row version number column. The checksum would be calculated based on the contents of the row. The row version number would simply be incremented. Likewise disclosed is the possibility of the identifier column including a row version number column. The identifier value is a row version number. The row version number is updated by incrementation.
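The two identifier styles can be contrasted with a short sketch. The function names, the choice of SHA-256 as the checksum, and the field separator are assumptions made for illustration; the text does not specify a checksum algorithm.

```python
import hashlib


def row_checksum(values):
    """Checksum identifier computed from the contents of a row.

    Assumption: fields are rendered as text and joined with the ASCII
    unit separator so that adjacent fields cannot run together.
    """
    encoded = "\x1f".join("" if v is None else str(v) for v in values)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def next_version(current):
    """Row version identifier: start at 1, then increment on each change."""
    return 1 if current is None else current + 1
```

A checksum depends only on the current row contents, whereas a version number records that a change occurred regardless of what the new contents are.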
Also disclosed is a configuration wherein the master table is in a first database, the first database residing on a first computing device and wherein the satellite table is in a second database, the second database residing on a second computing device. The first computing device is communicably coupled to the second computing device. Comparing may be a distributed query.
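A comparison spanning two databases can be imitated on one machine. The sketch below is only a stand-in for a distributed query: it assumes SQLite's ATTACH DATABASE to place the satellite table in a second in-memory database, whereas the configuration described above involves two separate computing devices. All names and sample values are hypothetical.

```python
import sqlite3


def rows_missing_on_satellite():
    conn = sqlite3.connect(":memory:")
    # A second, independent in-memory database stands in for the database
    # on the second computing device.
    conn.execute("ATTACH DATABASE ':memory:' AS remote")
    conn.execute(
        "CREATE TABLE main.master (k INTEGER PRIMARY KEY, ident INTEGER)")
    conn.execute(
        "CREATE TABLE remote.satellite (k INTEGER PRIMARY KEY, ident INTEGER)")
    conn.executemany("INSERT INTO main.master VALUES (?, ?)",
                     [(1, 1), (2, 3)])
    conn.executemany("INSERT INTO remote.satellite VALUES (?, ?)",
                     [(1, 1), (2, 2)])
    # One query compares key and identifier columns across both databases.
    return conn.execute(
        "SELECT k, ident FROM main.master "
        "EXCEPT SELECT k, ident FROM remote.satellite").fetchall()
```

Only the narrow key and identifier columns take part in the cross-database comparison, which is the part that would travel over the communication path in a real deployment.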
Moreover, if the ident
