Electrical computers and digital processing systems: support – Multiple computer communication using cryptography – Protection at a particular protocol layer
Reexamination Certificate
1999-02-23
2002-10-08
Wiley, David (Department: 2158)
Electrical computers and digital processing systems: support
Multiple computer communication using cryptography
Protection at a particular protocol layer
C709S224000
Reexamination Certificate
active
06463532
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates, in general, to the field of systems and methods for dynamic information storage or retrieval. More generally, the present invention relates to a system and method for effectuating distributed consensus utilizing shared storage resources and state coordination among members of a processor set in a multiprocessor environment.
A computer system generally includes at least one processor to perform computations and control components coupled to the computer system. Some computer systems include multiple processors that access shared resources within the computer system. For example, multiple processors may individually access a shared hard disk, a shared input/output (“I/O”) channel, a shared peripheral or a shared memory space to perform a particular function. Furthermore, such multiprocessor systems may allow a processor to communicate with other processors within the computer system through access to shared resources. For example, it is common for a processor to store data intended for another processor in a shared memory location. Thereafter, the other processor can read the data from the shared memory location.
It is also common for multiple processors in a computer system to share a storage location, for example, in a database stored on a hard disk. Preferably, access to the shared storage location is coordinated to provide exclusive access to the shared storage location by any single processor. Otherwise, one processor may independently modify the contents of the shared storage location without notice to another processor accessing the shared storage location at approximately the same time. Such processors are termed “competing processors”, in that they are competing for access to a shared storage location. A possible result of non-exclusive access to a shared storage location by competing processors is that corrupted or unintended data may be read or stored in the shared storage location by one of the processors.
The aforementioned co-pending patent application discloses a particularly efficacious system and method for providing exclusive access to shared storage that does not rely on advance knowledge of the set of processors potentially accessing the shared storage. Furthermore, it advantageously affords an exclusive access solution that accommodates competing processors without deadlock, accommodates the unpredictable timing properties of a shared storage subsystem and does not rely on the particular properties of any particular shared storage subsystem.
One technique for synchronizing distributed state among a set of processors is known as “distributed consensus”. Functionally, each processor is viewed as a state machine and all processors initially start in the same state. An input to the state machine (i.e. a command) produces an output and a new state. If all the processors agree on the inputs to their state machines (i.e. a consensus), then all of the processors will have the same state. Certain distributed consensus techniques also allow for processors to fail and then catch up with the current state when they restart. One such published algorithm (a.k.a. the “Paxos” algorithm) suitable for a variety of distributed systems is described by Leslie Lamport in “The Part-Time Parliament”, ACM Transactions in Computer Systems, Vol. 16, No. 2, May 1998, pages 133-169, the disclosure of which is herein specifically incorporated by this reference.
To date however, all such distributed consensus processes have utilized communication among the processors in the set in order to obtain consensus. An inherent deficiency of such techniques, is that they then require a majority of a known set of processors to participate in the consensus. If a majority of the processors are not available, the process fails to make forward progress. This is not desirable in those instances where processors are relatively expensive in terms of overall system cost and it is required that but a single surviving processor be able to continue to provide service.
SUMMARY OF THE INVENTION
The system and method of the present invention achieves distributed consensus among members of a processor set even when only a single processor is operating. This is achieved by having a collection of processors jointly implement a virtual state machine and wherein the state machine utilizes a sequence of numbered input commands. System synchronization is achieved by having all of the processors agree on the sequence of input commands so that they execute the same virtual state machine. Input commands are numbered consecutively and the processors use a set of shared stores (i.e. disk drives) to communicate amongst themselves requests (i.e. ballots) for new state machine inputs (or commands) and state machine inputs that have already been chosen (i.e. committed commands). A consensus process is used to decide upon (or commit) each command. Furthermore, this consensus is achieved using a majority of known stores rather than a majority of known processors. Therefore, when consensus is achieved, it then exists on the system stores (e. g. the disk drives) and not in the processors.
In a particular embodiment of the present invention disclosed herein, the process is implemented utilizing a known set of “consensus disks” comprising the shared stores. Each processor participating in distributed consensus has one disk block reserved to that processor on each consensus disk. An exemplary disk block may contain the following information: a) a list of the most recently committed commands; b) a ballot number; c) the command a processor is trying to commit; d) the processor's unique identification (“ID”); and e) any additional information needed to determine the current state of the virtual machine. Each processor also maintains a copy of its current state, and this state may be in the same form as that of the disk blocks.
The procedure for reserving one disk block for each processor on each consensus disk necessitates some means for reserving exclusive access to the disk long enough for a processor to reserve a block. This reservation is recorded in a “directory block” that assigns processor identification (“IDs”) to disk blocks. To this end, known mutual exclusion algorithms may be utilized and the system and method for exclusive access to shared storage disclosed and claimed in the aforementioned patent application incorporated by reference herein, is one particularly efficacious technique.
As disclosed in greater detail herein, an exemplary distributed consensus process in accordance with the present invention requires each processor participating in the consensus algorithm to have a unique ID not shared by any other participating processor. This ID may, in some instances, be conveniently considered to be the low-order digit of all of its ballot numbers in order that ballot numbers issued by different processors are unique and totally ordered. Furthermore, since a processor must be able to read and write a majority of the known set of “consensus disks” in order to make forward progress with this process, each processor that desires to submit a numbered state machine input for consensus agreement (i.e. commit a numbered command) will implement the process.
A representative process for distributed consensus utilizing shared storage resources and state coordination among members of a processor set in a multiprocessor environment as disclosed herein may conveniently operate in two separate rounds. In a first round, a processor is allowed to set its ballot number to a value greater than or equal to its current ballot number. (Generally, the ballot number is chosen to be greater than the numbers of any other ballots in progress). At this point, the processor reads its own disk block on each consensus disk in order to obtain current knowledge of the virtual state machine execution and the ballots it has already issued. If the processor already has knowledge of this information, this step can be omitted.
The processor then writes its current information to its own disk bloc
Gafni Eliezer
Lamport Leslie
Reuter James M.
Compaq Computer Corporation
Kubida William J.
Lembke Kent A.
Wiley David
LandOfFree
System and method for effectuating distributed consensus... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with System and method for effectuating distributed consensus..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and method for effectuating distributed consensus... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2973039