Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-05-27
2001-02-13
Black, Thomas G. (Department: 2771)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
Reexamination Certificate
active
06189017
ABSTRACT:
TECHNICAL FIELD
The present invention relates to a method of producing back-up replicas or safety replicas of data in nodes belonging to a distributed data base, and of structuring a system of such nodes, to provide a reliable system with a theoretically very small and practically non existent risk of a total failure or crash of the system.
A data base may be distributed in a system comprising several computers which computers in mutual coaction form nodes on which information belonging to the distributed data base can be stored.
A fragment comprises a part of the data base, and includes a primary replica of the part and a secondary replica of the part. The primary replica is stored within a first node and the secondary replica is stored within a second node that is separated from the first node.
Respective nodes in the system includes one or more primary and/or secondary replicas of different fragments.
Some transactions within the data base result in a change in one or more fragments. These transactions are so called changing transactions. Such changes are performed in the primary replica and the secondary replica is updated according to the change as a part of the changing transaction.
Both data information and log information is stored within both the primary replica and the secondary replica.
If the first node crashes then the primary replica is re-created from information available in the secondary replica.
PRIOR ART RELATED TO THE INVENTION
It is previously known to use back-up copying or safety copying in various computer systems in order to get parallel redundant systems, such as a primary system and a secondary system. It is also know that that the loss of data in relation to a crash of the primary system depends of how often the secondary system is updated.
Everything that has been performed during the time between a crash of the primary system and the latest update of the secondary system is lost when the primary system crashes.
In data base applications it is known that several operators or users can make use of the content of the data base through so called “transactions”, where some transaction generates changes in the content of the data base, or changes in the structure of the data base, so called “schema changes”.
In this context it is known store to two different kinds of information in both the primary and the secondary system.
A first kind of information is the actual content of the data base, which in this description is called “data information”. A change in the content results in a change in both the primary and the secondary data base.
Further information that is stored relates to the transactions and to the schema changes that have been performed. This information is called “log information” and is stored within a so called “log”. The log information of transactions and schema changes is also stored in a primary and secondary fashion.
A distributed data base comprises several nodes which together constitute a mutual system with a mutual data base. The information within a distributed data base is distributed over the various nodes belonging to the data base.
One node can hold a primary replica of one or several parts of the data base and a secondary replica of one or several parts of the data base. A primary replica and an associated secondary replica is called here a fragment.
As examples of publications describing back-up copying in systems concerning distributed data bases the following publications can be mentioned: U.S. Pat. No. 5,404,508 and U.S. Pat. No. 5,555,404.
The present invention can be regarded as being based on a data base described in publication U.S. Pat. No. 5,423,037.
This publication describes a data base built upon a so called “shared nothing” system, meaning that every node within the system is completely independent from the other nodes and shares nothing that has anything to do with managing data, such as processors, memories or other data structures.
The publication teaches specifically that no memories can be shared between the different nodes.
The nodes are divided into at least two groups, one first and one second group, where the nodes in the different groups does not share any parts, such as power supply units and cooling fans.
The data base is divided into fragments and each fragment comprises one primary replica and at least one stand-by replica, which is essentially a copy of the primary replica, meaning that it comprises essentially the same information as the primary replica. The primary replica and the stand-by replica are stored in nodes that belong to mutually different groups of nodes.
Several stand-by replicas can be used to obtain a greater system reliability and if that is done then each stand-by replica comprises essentially the same information as the primary replica and they are all stored in nodes that belong to mutually different groups of nodes.
The groups of nodes are preferably symmetrical sets of nodes on different sides of the system's “mirror dimension”.
The records in the stand-by replicas are kept up to date by sending all log records produced by transactions from the node with the primary replica to the node with the corresponding stand-by replica. The serialized log records are read and the corresponding table records updated at the node of the stand-by replica as an ongoing activity.
It shall also be mentioned that it is previously known to use various transaction protocols as information is transferred from one node to another.
Usually so called one-safe transmissions or two-safe transmissions are used.
In a simplified manner it can be said that in a one-safe transmission the log information related to the transaction is transferred from a first node to a second node. This transfer is performed at a moment when capacity to do so is available at the first node, which might be up to a few seconds after the transaction has been requested.
The second node uses the received log information from the transaction to update both data end log information within itself.
A one-safe transmission provides short response times to the application that has requested the transaction. On the other hand it also means that that second node is not always updated and that some transactions might be lost should the first node crash.
Put simply both the first and the second node take a part in the actual transaction in a two-safe transmission.
A request to prepare is sent at the start of a transaction, which includes a query of asking whether a certain transaction can be performed. Affected nodes reply to the request with “yes” or “no”. A yes means that the nodes commit themselves to carry out the transaction if a decision is made to do so. If all affected nodes reply with yes, then the transaction is performed whereas if any node replies no then the transaction is aborted.
The second node partakes as one of the affected nodes in a two-safe transmission, meaning that it has to reply to whether it can commit to perform the transaction or not. The second node is thus updated as a part of the actual transaction, if the transaction is performed, meaning that the second node is always updated.
A two-safe transmission provides greater reliability than a one-safe transmission, since the second node almost always contains the same information as the first node. On the other hand, in the case of two-safe transmission the time to respond to the application that has requested the transaction may be longer than in the case of a one-safe transmission. A two-safe transmission also requires the transmission of more messages between affected nodes, and thus a higher transmission capacity than what is required in a one-safe transmission.
There is nothing that prevents the use of both one-safe and two-safe transmissions within the same system for different transactions and/or applications.
When a node crashes, the information stored in replicas within that node will be lost. These replicas might be primary replicas of some fragments and secondary replicas of other fragments. The loss of a primary replica means that a new primary
Malik Shahid Mahmood
Ronstrom Ulf Mikael
Black Thomas G.
Burns Doane , Swecker, Mathis LLP
Coby Frantz
Telefonaktiebolaget LM Ericsson
LandOfFree
Method to be used with a distributed data base, and a system... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method to be used with a distributed data base, and a system..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method to be used with a distributed data base, and a system... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2613774