Method for providing database recovery across multiple nodes

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06247023

ABSTRACT:

FIELD OF THE INVENTION
This invention relates to a parallel data processing environment and more particularly to an efficient mechanism for database recovery across multiple nodes after processor and/or system failure.
BACKGROUND OF THE INVENTION
In a stand-alone system, system failure recovery, also known as crash recovery, usually consists of two standard processing phases: a forward redo phase and a backward undo phase, shown representatively at FIG. 1. Log files, in which are recorded all operations which result in changes to the database state, are replayed to recreate the event sequence for each transaction. The log files for stand-alone systems are stored on local disks, while multi-node systems may have log files stored locally (i.e., where generated) or at a central location. If a commit log record associated with a transaction is found, then the transaction is committed. If no record is found, the transaction is aborted.
The two-step recovery process phases are commonly referred to as forward recovery and backward recovery. In the forward recovery phase (steps 101-103) of FIG. 1, the node scans the log files forward from a point determined by the checkpoint records, at 101, and redoes all operations stored in the local log files (also referred to as the “repeat history”) to establish the state of the database right before the system crashed. To redo the operations, the node reapplies the log to the database and refreshes the transaction table, at 102. Once a check, at 103, determines that there are no more logs to process, the backward recovery phase (steps 104-108) is conducted. In the backward recovery phase, all interrupted transactions (a.k.a., “in-flight” transactions) are rolled back (i.e., aborted). A list of all interrupted transactions is obtained at step 104. If the list is empty, as determined at step 105, crash recovery is complete. If the list is not empty, the node scans the logs backward and undoes (i.e., aborts) the interrupted transactions at 106, and then updates the list at 107. The procedure is repeated until the list is empty and the crash recovery is done, as indicated at 108.
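The forward and backward phases described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `LogRecord` layout, the `begin`/`update`/`commit` record kinds, the in-memory `db` dictionary, and the `recover` function are all assumptions made for illustration, and the log is assumed to start at the last checkpoint with no transactions active at that point.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class LogRecord:
    txn: str          # transaction identifier (hypothetical format)
    kind: str         # "begin", "update", or "commit" (assumed record kinds)
    key: str = ""     # item changed by an "update" record
    old: Any = None   # value before the update; None means the key was absent
    new: Any = None   # value after the update

def recover(log, db):
    """Replay `log` against `db` and return the rolled-back transactions."""
    # Forward (redo) phase: repeat history and rebuild the transaction table.
    active = set()
    for rec in log:
        if rec.kind == "begin":
            active.add(rec.txn)
        elif rec.kind == "update":
            db[rec.key] = rec.new
        elif rec.kind == "commit":
            active.discard(rec.txn)

    # Backward (undo) phase: roll back every transaction with no commit record.
    rolled_back = sorted(active)
    for rec in reversed(log):
        if not active:              # list of interrupted transactions is empty
            break
        if rec.txn not in active:
            continue
        if rec.kind == "update":
            if rec.old is None:
                db.pop(rec.key, None)
            else:
                db[rec.key] = rec.old
        elif rec.kind == "begin":   # transaction fully undone; update the list
            active.discard(rec.txn)
    return rolled_back

log = [
    LogRecord("T1", "begin"),
    LogRecord("T1", "update", "a", None, 1),
    LogRecord("T1", "commit"),
    LogRecord("T2", "begin"),
    LogRecord("T2", "update", "a", 1, 2),
    LogRecord("T2", "update", "b", None, 5),
]
db = {}
undone = recover(log, db)
```

In this example T1 has a commit record and survives, while T2 was in flight at the crash and is undone, leaving only T1's change in `db`.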
In a stand-alone system, the database will become consistent after these two phases of recovery. In a parallel system, however, node failures or other types of severe errors which may occur during commit processing will cause transactions to be out-of-sync across multiple nodes. Recovery across the multiple nodes is not as straightforward as it is in a stand-alone system. Although the standard recovery process for multi-node systems does involve each node independently executing the two-step process, database consistency cannot be guaranteed across nodes, due to the nature of the commit protocol.
In what is referred to herein as the “standard two-phase commit protocol,” a coordinating node, at which a transaction is executing, first issues a “prepare to commit” message to all participating, or subordinate, nodes. After receipt of responses from all participating nodes, the coordinating node then issues an outcome message in the second phase of the protocol, either a “commit” message if all nodes have sent affirmative responses, or an “abort” message. All participating nodes and the coordinating node must vote “yes” for the coordinating node to commit/complete the transaction. Any “no” response received will result in the aborting of the transaction. In response to the outcome message (“commit” or “abort”) generated by the coordinating node, all participating nodes perform local commit procedures or the transaction is aborted. Before issuing a “yes” reply to the coordinating node, each participating node writes a “prepare” log to its local disk. Similarly, before sending the “commit” message to all participating nodes, the coordinating node writes a “commit” log to its local disk. Finally, after local commit processing has been completed, a participating node writes a “commit” log to its local disk and acknowledges the commit transaction completion to the coordinating node. In addition, a transaction table entry for the corresponding transaction is updated at each local node after voting or performing a commit procedure. When the coordinating node receives an acknowledgement from all participating nodes, it removes the corresponding entry from the transaction table, writes a “forget” log record to disk, and “forgets” about the transaction.
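The ordering of log writes and messages in the commit protocol described above can be sketched as follows. This is a single-process simulation under stated assumptions, not the patent's implementation: the `Participant` class, the `two_phase_commit` function, the tuple-based log format, and the writing of an "abort" log record are all hypothetical and chosen only to make the ordering visible.

```python
class Participant:
    """A participating node with a local log (a list stands in for disk)."""

    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes
        self.log = []

    def prepare(self, gtid):
        # The "prepare" log is written before the "yes" reply is sent.
        if self.vote_yes:
            self.log.append(("prepare", gtid))
        return self.vote_yes

    def finish(self, gtid, outcome):
        # After local commit (or abort) processing, log the outcome and
        # acknowledge. Logging "abort" here is an assumption of this sketch.
        self.log.append((outcome, gtid))
        return True


def two_phase_commit(gtid, coordinator_log, participants, coordinator_vote=True):
    # Phase 1: "prepare to commit" goes to every participant; collect votes.
    votes = [p.prepare(gtid) for p in participants]

    # Phase 2: commit only if the coordinator and all participants voted yes.
    if coordinator_vote and all(votes):
        # The coordinator's "commit" log precedes the "commit" message.
        coordinator_log.append(("commit", gtid))
        acks = [p.finish(gtid, "commit") for p in participants]
        if all(acks):
            # All acknowledgements received: forget the transaction.
            coordinator_log.append(("forget", gtid))
        return "commit"

    coordinator_log.append(("abort", gtid))  # assumption: abort is logged too
    for p in participants:
        p.finish(gtid, "abort")              # no acknowledgement required
    return "abort"


coord_log = []
nodes = [Participant("n1"), Participant("n2")]
outcome = two_phase_commit("G1", coord_log, nodes)
```

A crash between the coordinator's "commit" record and its "forget" record, or between a participant's "prepare" record and its outcome record, leaves exactly the unresolved states that the following paragraph discusses.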
For aborted transactions, typically, the protocol will not require that each participating node generate an acknowledgement message to the coordinating node, although such can readily be implemented. Before a forget log is written at the coordinating node, a transaction can be in the committed state, but not yet in the forgotten state. Similarly, a participating node can have prepared to commit, and yet not received the outcome message from the coordinating node. If a crash occurs before the transactions are resolved, the interrupted transactions cannot readily be traced and replayed under the prior art two-phase recovery procedure. Moreover, the transaction may have been committed at one node, and not at another, resulting in database inconsistency across the nodes. What is needed is a process by which a given transaction can be traced to the point of interruption, and also may be “resurrected” for completion.
It is therefore an objective of the invention to provide an improved crash recovery mechanism for database recovery across multiple nodes.
It is another objective of the invention to provide crash recovery which can effectively identify and resolve interrupted transactions.
Yet another objective of the invention is to provide a mechanism by which the database can be accessed before completion of the crash recovery process.
SUMMARY OF THE INVENTION
These and other objectives are realized by the present invention wherein a three phase multi-node crash recovery process is implemented including a forward phase, a backward phase, and a third (so-called “sideward”) phase for recovery of transactions which were interrupted at the time of the crash. The novel method uses Global Transaction IDs to track the status of transactions at the coordinating node and/or at one or more of the participating nodes. Depending upon the status of the transaction at the time of the crash, a crash recovery agent at a designated node generates either a query message to the coordinating node or a vote message to at least one participating node. The node which receives the message acts as the sideward recovery agent to process the message and respond to the crash recovery agent, thereby allowing most interrupted transactions to be completed. Additional implementations are provided for crash recovery, without duplication of efforts, across multiple nodes in a parallel database environment, for cascaded transactions wherein the database recovery at a local node is triggered by database recovery at a remote node in the parallel system, and for concurrent recovery, wherein database recovery is started concurrently at both transaction coordinator and participant nodes.
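One way to picture the message selection in the sideward phase is the sketch below. This section does not state which transaction status triggers which message, so the mapping used here is an assumption: a participant left in the prepared state sends a query to the coordinating node, and a coordinator left in the committed-but-not-forgotten state sends a vote message to each participating node. The `TxnEntry` structure, the `plan_sideward` function, and the Global Transaction ID strings are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TxnEntry:
    gtid: str                 # Global Transaction ID (hypothetical format)
    role: str                 # "coordinator" or "participant" at this node
    state: str                # "prepared", "committed", or "forgotten"
    coordinator: str = ""     # coordinating node, known to a participant
    participants: list = field(default_factory=list)

def plan_sideward(table):
    """Return (message_type, destination_node, gtid) tuples to send."""
    messages = []
    for entry in table:
        if entry.role == "participant" and entry.state == "prepared":
            # Assumed case: outcome unknown locally, so ask the coordinator.
            messages.append(("query", entry.coordinator, entry.gtid))
        elif entry.role == "coordinator" and entry.state == "committed":
            # Assumed case: committed but not forgotten, so contact each
            # participating node to drive the transaction to completion.
            for node in entry.participants:
                messages.append(("vote", node, entry.gtid))
        # Any other entry needs no sideward message in this sketch.
    return messages

table = [
    TxnEntry("G7", "participant", "prepared", coordinator="node0"),
    TxnEntry("G8", "coordinator", "committed", participants=["node1", "node2"]),
    TxnEntry("G9", "coordinator", "forgotten", participants=["node1"]),
]
plan = plan_sideward(table)
```

The node that receives each message would then act as the sideward recovery agent and reply to the crash recovery agent; that reply handling is left out of the sketch.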

