Electrical computers and digital data processing systems: input/ – Access locking
Reexamination Certificate
1999-11-23
2003-02-18
Thai, Xuan M. (Department: 2181)
active
06523078
ABSTRACT:
The present invention relates to a locking system and method for use in a multi-node distributed clustering product.
BACKGROUND OF THE INVENTION
Multi-processing systems are commonly configured in a cluster of related nodes to ensure high availability. A clustered system is a collection of processing elements that is capable of executing a parallel, cooperating application. Each processing element in a cluster is an independent functional unit, such as a symmetric multiprocessor server, which is coupled with the other cluster elements through one or more networks. One type of cluster system is described in U.S. Pat. No. 5,117,352 entitled “MECHANISM FOR FAIL-OVER NOTIFICATION” issued to Louis Falek on May 26, 1992 and assigned to Digital Equipment Corporation.
In a clustered environment, there is often a need for one node to provide backup upon failure of another node. For example, in a three-node cluster, an application may be in service on node A, with node B configured as the highest priority backup node. If node A crashes, then node B begins to bring the application in service automatically. If a system administrator simultaneously attempts to bring the application in service on node C, then there is the possibility of the application being brought into service on nodes B and C simultaneously.
To prevent an application from being brought into service on two nodes simultaneously, many multi-processing systems possess either a quorum device or some other mechanism to create a single, global cluster configuration database. For these systems, it is sufficient for each node to obtain a single lock on the central cluster configuration database itself. All updates to the cluster configuration are serialized, so all nodes in the cluster have the same view of the cluster configuration, ensuring that only one node will attempt to bring an application into service.
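The serialization described above can be illustrated with a minimal sketch. It is not taken from the patent or from any product: the class `CentralClusterConfig`, its methods, and the function `simulate_failover` are hypothetical names, and threads within one process stand in for cluster nodes. The sketch only shows how a single lock around the shared configuration lets exactly one of nodes B and C bring the application into service.

```python
import threading


class CentralClusterConfig:
    """Hypothetical stand-in for a single, global cluster configuration database."""

    def __init__(self):
        self._lock = threading.Lock()   # the one lock every node must obtain
        self._in_service_on = {}        # application name -> node name

    def try_bring_in_service(self, app, node):
        """Record that `node` runs `app`, unless another node already does."""
        with self._lock:                # all updates are serialized here
            if app in self._in_service_on:
                return False
            self._in_service_on[app] = node
            return True

    def owner(self, app):
        with self._lock:
            return self._in_service_on.get(app)


def simulate_failover():
    """Node B (automatic backup) and node C (administrator) race for the app."""
    config = CentralClusterConfig()
    results = {}

    def attempt(node):
        results[node] = config.try_bring_in_service("app", node)

    threads = [threading.Thread(target=attempt, args=(n,)) for n in ("B", "C")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return config, results
```

Whichever thread obtains the lock first becomes the owner; the other sees the recorded owner and backs off. Which of B or C wins is not determined by the sketch.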
Other types of clustered systems, such as systems running LifeKeeper (trademark of NCR Corp., Dayton, Ohio), possess a distributed system for storing cluster configuration information. Accordingly, each node keeps its own view of the cluster configuration (e.g. which nodes are currently servicing an application, which nodes or communication paths are alive, etc.). Clustered systems possessing such a distributed system for storing cluster configuration information may use a distributed locking system to prevent two or more nodes from making changes to the cluster configuration simultaneously. U.S. Pat. No. 5,828,876, Fish et al., issued on Oct. 27, 1998, assigned to NCR Corporation and entitled “File System For A Clustered Processing System” describes a distributed system and is hereby incorporated by reference.
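As a rough illustration of distributed locking in general, and not of the method of this patent or of LifeKeeper, the sketch below assumes each node keeps its own lock table and a requester must obtain the lock on every node, releasing whatever it already holds if any node refuses. `NodeLockTable`, `acquire_on_all`, and `release_on_all` are hypothetical names, and in-memory objects stand in for remote nodes.

```python
class NodeLockTable:
    """Hypothetical per-node lock state; each node knows only its own view."""

    def __init__(self, name):
        self.name = name
        self.holder = None              # requester currently holding this node's lock

    def try_lock(self, requester):
        if self.holder is None:
            self.holder = requester
            return True
        return False

    def unlock(self, requester):
        if self.holder == requester:
            self.holder = None


def acquire_on_all(nodes, requester):
    """Take the lock on every node, or on none of them."""
    acquired = []
    for node in nodes:
        if node.try_lock(requester):
            acquired.append(node)
        else:
            # Another requester holds this node: undo the partial acquisition.
            for held in reversed(acquired):
                held.unlock(requester)
            return False
    return True


def release_on_all(nodes, requester):
    for node in nodes:
        node.unlock(requester)
```

A caller would change the cluster configuration only after `acquire_on_all` returns `True`, and would call `release_on_all` afterwards. The rollback on failure keeps a requester from holding a partial set of locks, but it also means a failed attempt must be retried.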
However, current distributed locking systems may suffer from starvation, a problem typical of distributed software, in which a thread is prevented indefinitely from acquiring a cluster-wide lock. The chance of starvation occurring increases with the number of nodes in the cluster. Additionally, current distributed locking systems may fail to handle time values in units smaller than a millisecond and may fail to take into account many configuration features of the clustered system.
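One way starvation can arise is sketched below as a deterministic simulation. It is an assumption-laden illustration, not a description of any existing system: requesters are assumed to retry in lockstep, to contact the nodes in a fixed per-requester order, and to release everything on the first refusal. The function `rounds_until_lock` and the requester names `P` and `Q` are hypothetical.

```python
def rounds_until_lock(orders, max_rounds):
    """Simulate requesters that retry in lockstep.

    `orders` maps a requester name to the order in which it contacts nodes.
    Returns (round_number, winner), or None if nobody locks every node
    within `max_rounds` rounds.
    """
    node_count = len(next(iter(orders.values())))
    for round_no in range(1, max_rounds + 1):
        holders = {}            # node -> requester holding its lock this round
        active = set(orders)    # requesters that have not yet been refused
        for step in range(node_count):
            losers = set()
            for requester in sorted(active):
                node = orders[requester][step]
                if node in holders:
                    losers.add(requester)       # refused: node already locked
                else:
                    holders[node] = requester
            # Refused requesters release their partial locks at the end of the step.
            for requester in losers:
                for node in [n for n, h in holders.items() if h == requester]:
                    del holders[node]
            active -= losers
        if active:
            return round_no, sorted(active)[0]
    return None


# Opposite contact orders: each requester blocks the other in every round.
stuck = rounds_until_lock({"P": ["A", "B"], "Q": ["B", "A"]}, 1000)

# Same contact order: one requester gets through in the first round.
progress = rounds_until_lock({"P": ["A", "B"], "Q": ["A", "B"]}, 1000)
```

With opposite orders, `stuck` is `None`: neither requester ever holds both nodes, however many rounds are allowed. With a common order, `progress` is `(1, "P")`, although Q could still starve if it lost every tie. Adding nodes adds more points at which two requesters can collide, which is consistent with the observation that the chance of starvation grows with cluster size.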
Accordingly, there is a need for an improved distributed locking system and method which avoids the problems discussed above.
SUMMARY OF THE INVENTION
In accordance with the teachings of the present invention, an improved distributed locking system and method for a clustered system having a distributed system for storing cluster configuration information is provided. One aspect of the present invention allows a process or thread in a high availability solution to obtain a distributed lock on all relevant nodes in a clustered system. Another aspect of the present invention allows more than one thread to obtain a lock and perform a critical operation on different nodes concurrently.
REFERENCES:
patent: 5117352 (1992-05-01), Falek
patent: 5222217 (1993-06-01), Blount et al.
patent: 5414839 (1995-05-01), Joshi
patent: 5596754 (1997-01-01), Lomet
patent: 5612865 (1997-03-01), Dasgupta
patent: 5613139 (1997-03-01), Brady
patent: 5673384 (1997-09-01), Hepner et al.
patent: 5699500 (1997-12-01), Dasgupta
patent: 5727206 (1998-03-01), Fish et al.
patent: 5828876 (1998-10-01), Fish et al.
patent: 5832222 (1998-11-01), Dziadosz et al.
patent: 5862312 (1999-01-01), Mann et al.
patent: 5920872 (1999-07-01), Grewell et al.
patent: 5924122 (1999-07-01), Cardoza et al.
patent: 5987477 (1999-11-01), Schmuck et al.
patent: 6108654 (2000-08-01), Chan et al.
patent: 6151659 (2000-11-01), Solomon et al.
patent: 6389420 (2002-05-01), Vahalia et al.
Nelson Mullins Riley & Scarborough
Steeleye Technology, Inc.
Thai Xuan M.