Reexamination Certificate
1997-11-03
2001-08-21
Harrell, Robert B. (Department: 2152)
Electrical computers and digital processing systems: multicomput
Master/slave computer controlling
Master/slave mode selecting
C709S208000, C709S215000, C709S241000, C714S006130
active
06279032
ABSTRACT:
FIELD OF THE INVENTION
The invention relates generally to computer network servers, and more particularly to computer servers arranged in a server cluster.
BACKGROUND OF THE INVENTION
A server cluster is a group of at least two independent servers connected by a network and managed as a single system. The clustering of servers provides a number of benefits over independent servers. One important benefit is that cluster software, which is run on each of the servers in a cluster, automatically detects application failures or the failure of another server in the cluster. Upon detection of such failures, failed applications and the like can be quickly restarted on a surviving server, with no substantial reduction in service. Indeed, clients of a Windows NT cluster believe they are connecting with a physical system, but are actually connecting to a service which may be provided by one of several systems. To this end, clients create a TCP/IP session with a service in the cluster using a known IP address. This address appears to the cluster software as a resource in the same group (i.e., a collection of resources managed as a single unit) as the application providing the service. In the event of a failure the cluster service “moves” the entire group to another system.
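The group-level failover described above can be illustrated with a minimal sketch. It is not taken from the patent or from any Windows NT API: the `ResourceGroup` type, the `fail_over` function, the host names, and the IP address are all hypothetical, chosen only to show that an IP address and its application move together as one unit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResourceGroup:
    """Hypothetical model of a group: resources that are managed as a single unit."""
    name: str
    resources: List[str]  # e.g. the known IP address plus the application behind it
    host: str             # system currently hosting the whole group


def fail_over(groups: List[ResourceGroup], failed_host: str,
              survivors: List[str]) -> List[str]:
    """Move every group hosted on failed_host to a surviving system.

    Groups are spread over the survivors in round-robin order (an assumption
    made for this sketch). Returns the names of the groups that were moved.
    """
    if not survivors:
        raise ValueError("no surviving system to host the groups")
    moved: List[str] = []
    for group in groups:
        if group.host == failed_host:
            # The group moves as a whole, so its IP address and application
            # stay together and clients keep using the same address.
            group.host = survivors[len(moved) % len(survivors)]
            moved.append(group.name)
    return moved
```

In this sketch the resources list of a group is never altered by a move; only the hosting system changes, which is why a client holding the known IP address does not need to know which system is serving it.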
Other benefits include the ability for administrators to inspect the status of cluster resources, and accordingly balance workloads among different servers in the cluster to improve performance. Dynamic load balancing is also available. Such manageability also provides administrators with the ability to update one server in a cluster without taking important data and applications offline. As can be appreciated, server clusters are used in critical database management, file and intranet data sharing, messaging, general business applications and the like.
While clustering is thus desirable for many applications, problems arise when the systems in a cluster stop communicating with one another, a condition known as a partition. This typically occurs, for example, when there is a break in the communications link between systems or when one of the systems crashes. When partitioned, the systems may separate into two or more distinct member sets, with systems in each member set communicating among themselves, but with no member of any set communicating with members of any other set. Thus, a first problem is determining how to handle the split. One proposed solution is to allow each member set to continue as its own, independent cluster. However, one main difficulty with this approach is that the configuration data (i.e., state of the cluster) that is shared by all cluster members and which is critical to cluster operation may become different in each of the multiple clusters. Subsequently reuniting the sets into a common cluster presumes that the data can later be reconciled; however, such reconciliation has been found to be an extremely complex and undesirable undertaking.
A simpler solution is to allow only one set to survive and continue as the cluster; however, this requires that some determination be made as to which set to select. The known way to make this determination is based on determining which set, if any, has a simple majority of the total systems possible therein, since there can be only one such set.
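The simple-majority determination can be sketched as follows. This is an illustrative sketch only; the function name `majority_set` and the system names are assumptions, not part of the patent.

```python
from typing import FrozenSet, Iterable, Optional


def majority_set(member_sets: Iterable[FrozenSet[str]],
                 total_systems: int) -> Optional[FrozenSet[str]]:
    """Return the member set holding more than half of all possible systems.

    total_systems is the number of systems the cluster could contain, not the
    number currently running. At most one set can satisfy the test, so the
    first match is the only match. Returns None if no set has a majority.
    """
    for members in member_sets:
        if 2 * len(members) > total_systems:
            return members
    return None


# Five possible systems split 3/2: the three-system set survives.
survivor = majority_set([frozenset("ABC"), frozenset("DE")], total_systems=5)

# Four possible systems split 2/2: neither set has a majority.
nobody = majority_set([frozenset("AB"), frozenset("CD")], total_systems=4)
```

The second call shows a limitation of this rule: in an even split no set qualifies, so no cluster survives.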
However, if a cluster shuts down and a new cluster is later formed with no members common to the previous cluster, known as a temporal partition, a problem exists because no new member possesses the state information of the previous cluster. Thus, in addition to deciding representation by which cluster has the most systems, the majority solution further requires that more than half of the total possible systems in a cluster (i.e., a quorum) are communicating within a single member set. This ensures that at least one system is common to any permutation of systems that forms a cluster, thereby guaranteeing that the state of the cluster is persisted across the temporal partition as new clusters having different permutations of systems form from time to time.
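The claim that a quorum guarantees a common system can be checked by brute force for a small cluster. The sketch below is only an illustration with hypothetical helper names; it enumerates every set containing more than half of the possible systems and confirms that any two of them share at least one member.

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence


def all_quorums(systems: Sequence[str]) -> List[FrozenSet[str]]:
    """Every subset containing more than half of the possible systems."""
    smallest = len(systems) // 2 + 1
    return [frozenset(combo)
            for size in range(smallest, len(systems) + 1)
            for combo in combinations(systems, size)]


def quorums_always_overlap(systems: Sequence[str]) -> bool:
    """True if every pair of quorums has at least one system in common."""
    return all(a & b for a, b in combinations(all_quorums(systems), 2))


five = ["A", "B", "C", "D", "E"]
overlap_holds = quorums_always_overlap(five)

# Sets of exactly half the systems carry no such guarantee: in a four-system
# cluster, {A, B} and {C, D} are disjoint.
halves_disjoint = not (frozenset("AB") & frozenset("CD"))
```

The overlap follows from counting: two sets that each hold more than half of N systems have more than N members between them, so at least one system must belong to both.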
A problem with the simple majority/quorum solution is that there is no surviving cluster unless more than half of the systems are operational in a single member set. As a result, a minority member set that otherwise would be capable of operating as a cluster to adequately service clients is not allowed to do so. A related problem arises when forming a cluster for the first time after a total system outage. Upon restart, no one system can form a cluster and allow other systems to join it over time because by itself, that system cannot constitute a quorum. Consequently, intervention by an administrator or a special programmatic process is required to restart the cluster.
SUMMARY OF THE INVENTION
Accordingly, the present invention provides an improved method and system for determining which member set of a partitioned cluster should survive to represent the cluster. The system and method of the present invention allow a minority of a partitioned cluster's systems to survive and operate as the cluster. An arbitration method and system is provided that enables partitioned systems, including those in minority member sets, to challenge for representation of the cluster, and enables the automatic switching of cluster representation from a failed system to an operational system. Temporal partitions are handled, and a single system may form a quorum upon restart from a total cluster outage. The method and system are flexible and extensible, and lend themselves to straightforward implementation in server clusters.
Briefly, the present invention provides a method and system for selecting one set of systems for a cluster from at least two partitioned sets of systems. A persistent storage device with cluster configuration information therein is provided as a quorum resource. Using an arbitration process, one system exclusively reserves the quorum resource. The set with the system therein having the exclusive reservation of the quorum device is selected as the cluster. The arbitration process provides a challenge-defense protocol whereby a system can obtain the reservation of the quorum device when the system that has the reservation fails.
The arbitration process, executed by a partitioned system (the first system), first requests exclusive ownership of the quorum device. If the request is successful, that system's set is selected as the cluster. If the request is not successful, the arbitration process breaks the other system's exclusive ownership of the quorum resource, delays for a predetermined period of time, and then makes a second request for exclusive ownership on behalf of the first system. If the second request is successful, the process selects as the cluster the set with the first system therein. During the time delay, the other system, if operational, persists its reservation of the quorum resource, so that the first system's second request will fail.
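The challenge-defense sequence described above can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: `QuorumDevice`, its methods, and `arbitrate` are hypothetical names, and the predetermined delay is modeled as a `wait` callback so that the defender's behavior during the delay can be shown without real timing.

```python
from typing import Callable, Optional


class QuorumDevice:
    """Toy stand-in for the persistent quorum resource (hypothetical API)."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None

    def reserve(self, system: str) -> bool:
        """Grant exclusive ownership if the device is free or already held by system."""
        if self.owner in (None, system):
            self.owner = system
            return True
        return False

    def break_reservation(self) -> None:
        """Forcibly clear whatever reservation is currently held."""
        self.owner = None


def arbitrate(device: QuorumDevice, challenger: str,
              wait: Callable[[], None]) -> bool:
    """Return True if the challenger's set should be selected as the cluster."""
    if device.reserve(challenger):   # first request succeeds: challenger wins
        return True
    device.break_reservation()       # challenge the current owner
    wait()                           # predetermined delay; a live owner defends here
    return device.reserve(challenger)  # second request


# Case 1: the owner "A" is operational and re-reserves during the delay.
device = QuorumDevice()
device.reserve("A")
challenge_vs_live_owner = arbitrate(device, "B", wait=lambda: device.reserve("A"))

# Case 2: the owner "A" has failed, so nothing happens during the delay.
dead = QuorumDevice()
dead.reserve("A")
challenge_vs_failed_owner = arbitrate(dead, "B", wait=lambda: None)
```

In the first case the defense restores the reservation for "A", so the challenger's second request fails; in the second case the reservation passes to "B". The sketch assumes the delay is long enough for an operational owner to notice the break and reserve again.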
Other benefits and advantages will become apparent from the following detailed description when taken in conjunction with the drawings.
REFERENCES:
patent: 5280627 (1994-01-01), Flaherty et al.
patent: 5553239 (1996-09-01), Heath et al.
patent: 5659748 (1997-08-01), Kennedy
patent: 5673384 (1997-09-01), Hepner et al.
patent: 5727206 (1998-03-01), Fish et al.
patent: 5754821 (1998-05-01), Cripe et al.
patent: 5781910 (1998-07-01), Gostanian et al.
patent: 5828876 (1998-10-01), Fish et al.
patent: 5828889 (1998-10-01), Moiin et al.
patent: 5892913 (1999-04-01), Adiga et al.
patent: 5893086 (1999-04-01), Schmuck et al.
patent: 5909540 (1999-06-01), Carter et al.
patent: 5917998 (1999-06-01), Cabrera et al.
patent: 5918229 (1999-06-01), Davis et al.
patent: 5940838 (1999-08-01), Schmuck et al.
patent: 5946686 (1999-08-01), Schmuck et al.
patent: 5948109 (1999-09-01), Moiin et al.
patent: 5996075 (1999-11-01), Matena
patent: 5999712 (1999-12-01), Moiin et al.
patent: 6014669 (2000-01-01), Slaughter et al.
patent: 0 760 503 (1997-03-01), None
patent: 0 887 731 (1998-12-01), None
Carr, Richard, “The Tandem Global Update
Gamache Rod
Massa Michael T.
Short Robert T.
Vert John D.
Harrell Robert B.
Michalik & Wylie PLLC
Microsoft Corporation
Vaughn, Jr. William C.