Reexamination Certificate
1998-07-29
2001-04-03
Bragdon, Reginald G. (Department: 2186)
Electrical computers and digital processing systems: memory
Addressing combined with specific memory configuration or...
C709S220000, C711S154000, C711S156000, C711S159000
active
06212595
ABSTRACT:
The present invention is related to fencing of nodes in a distributed processing environment, and is more particularly related to fencing of nodes in a shared disk subsystem.
BACKGROUND OF THE INVENTION
U.S. Pat. No. 4,919,545, issued Apr. 24, 1990 to Yu for DISTRIBUTED SECURITY PROCEDURE FOR INTELLIGENT NETWORKS, discloses a security technique for use in an intelligent network. The technique includes steps of granting permission to an invocation node to access an object by transmitting a capability and a signature from an execution node to the invocation node, thereby providing a method for authorizing a node to gain access to a network resource by using a form of signature encryption at the node.
U.S. Pat. No. 5,301,283, issued Apr. 5, 1994 to Thacker et al. for DYNAMIC ARBITRATION FOR SYSTEM BUS CONTROL IN MULTIPROCESSOR DATA PROCESSING SYSTEM, discloses a data processing system having a plurality of commander nodes and at least one resource node interconnected by a system bus, and a bus arbitration technique for determining which commander node is to gain control of the system bus to access the resource node, thereby providing a node lockout which prevents nodes from gaining access to the system bus.
U.S. Pat. No. 5,386,551 issued Jan. 31, 1995 to Chikira et al. for DEFERRED RESOURCES RECOVERY discloses a resources management system for fencing all autonomous resources, and a protocol is followed to allow all activities in a work stream to be completed before all fencing is removed.
U.S. Pat. No. 5,416,921, issued May 16, 1995 to Frey et al. for APPARATUS AND ACCOMPANYING METHOD FOR USE IN A SYSPLEX ENVIRONMENT FOR PERFORMING ESCALATED ISOLATION OF A SYSPLEX COMPONENT IN THE EVENT OF A FAILURE, discloses an apparatus for use in a multi-system shared data environment which, following a pre-defined hierarchical order, fences failed components from accessing shared data in order to protect data integrity.
U.S. Pat. No. 5,423,044 issued Jun. 6, 1995 to Sutton et al. for SHARED, DISTRIBUTED LOCK MANAGER FOR LOOSELY COUPLED PROCESSING SYSTEMS discloses apparatus for managing shared, distributed locks in a multiprocessing complex for synchronizing data access to identifiable subunits of direct access storage devices.
The Virtual Shared Disk (VSD) product, which is a component of the Parallel System Support Programs for AIX (PSSP) from the International Business Machines Corp. of Armonk, N.Y., provides raw disk access to all nodes on an RS/6000 Scalable POWERparallel (SP) system. The disk itself, however, is physically connected to only two nodes. One of these nodes is a VSD primary server, and the other is a backup server. If a disk is not locally attached, the VSD kernel extension will use the Internet Protocol to route the requests to the server node. If the primary node is unavailable for any reason, access is switched to the secondary node, and the data on the disk drive may still be accessed through the secondary node.
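The routing and failover behavior described above can be illustrated with a minimal sketch. It is not the VSD kernel extension; the names `VirtualDisk` and `route_request`, the use of integer node numbers, and the rule that the primary server is always preferred when it is available are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualDisk:
    """Hypothetical record of the two nodes physically attached to a disk."""
    name: str
    primary: int  # node number of the primary server
    backup: int   # node number of the backup (secondary) server


def route_request(disk, local_node, available):
    """Return (server_node, transport) for an I/O request issued on local_node.

    Assumption: the primary server is preferred whenever it is available,
    and the backup is used only when the primary is not.
    """
    for server in (disk.primary, disk.backup):
        if server in available:
            # Requests are served locally only when the chosen server is
            # the issuing node; otherwise they are routed over IP.
            transport = "local" if server == local_node else "ip"
            return server, transport
    raise RuntimeError(f"no server available for {disk.name}")
```

For example, with a disk attached to nodes 1 and 2, a request issued on node 3 is routed over IP to node 1, and to node 2 once node 1 is no longer in the set of available nodes.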
The Group Services product of PSSP keeps a record of member nodes in a group of nodes. It is desirable to add a fencing function to the VSD subsystem.
If a process instance using VSDs on node X is unresponsive, a distributed subsystem may wish to ensure that X's access to a set of virtual disks (VSDs) is severed, and that all outstanding I/O initiated by X to these disks is flushed, before recovery can proceed. Fencing X from a set of VSDs denotes that X will not be able to access these VSDs until it is unfenced. Fence attributes must survive node Initial Program Loads (IPLs).
SUMMARY OF THE INVENTION
The present invention provides, in a distributed computer system having a plurality of nodes, one of the nodes being a request processing node (A node) and one or more nodes being peripheral device server nodes (S nodes), an apparatus for fencing or unfencing, in a fence/unfence operation, one or more nodes (X nodes) from said S nodes. The apparatus includes a common memory for storing a fence map having entries therein, each entry storing an indication of an S node to be fenced, a commit bit indicating whether the entry is proposed or committed, and a bit map indicating which X nodes are to be fenced from the S node of the entry. Each of the plurality of nodes includes a local memory for storing a local copy of said fence map. The A node processes a request specifying X nodes to be fenced or unfenced from specified S nodes during said fence/unfence operation, and computes the nodes to participate (F nodes) in the fence/unfence operation. The participating nodes include the A node, the X nodes to be either fenced or unfenced from said S nodes, and the S nodes thus fenced or unfenced. The A node sends messages to the F nodes instructing each F node to begin the fence/unfence operation for that node.
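A fence map entry and the computation of the participating nodes might be modeled as in the following sketch. The field names, the use of a Python integer as the bit map (bit i set meaning node i is fenced), and the function names are illustrative assumptions; the text above does not specify a memory layout.

```python
from dataclasses import dataclass


@dataclass
class FenceEntry:
    """Hypothetical in-memory form of one fence map entry."""
    s_node: int         # the S node this entry applies to
    committed: bool     # commit bit: False = proposed, True = committed
    fenced_bitmap: int  # assumption: bit i set means node i is fenced


def nodes_in_bitmap(bitmap):
    """Decode a bit map into the set of node numbers whose bits are set."""
    return {i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1}


def participating_nodes(a_node, x_nodes, s_nodes):
    """F nodes: the A node, the X nodes named in the request, and the S nodes."""
    return {a_node} | set(x_nodes) | set(s_nodes)
```

With A node 0, a request to fence nodes 1 and 3 from S node 4 yields the F nodes {0, 1, 3, 4}, and the corresponding entry would carry the bit map 0b1010.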
The fence/unfence operation includes a first phase for proposing changes in the fence map reflecting the fencing or unfencing of said X nodes; a second phase for refreshing the local map of each of the F nodes from the proposed changes in the fence map in said common memory, for eliminating access to specified S nodes from specified X nodes to be fenced, if any, and for restoring access to specified S nodes for specified X nodes to be unfenced, if any; and a third phase for flushing I/O operations from specified X nodes to be fenced from specified S nodes, if any, and for a selected one of the F nodes to erase all entries in the fence map of the common memory whose commit bit indicates the entry is committed, and to change all entries whose commit bit indicates the entry is proposed to committed entries.
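The three phases can be sketched as operations on a shared list of entries. This is a simplified, single-process illustration under stated assumptions: each entry is a tuple (s_node, committed, bitmap), phase one writes proposed entries describing the complete new state (so that erasing the old committed entries in phase three loses nothing), and the I/O flush of phase three is omitted. The function names are hypothetical.

```python
def phase1_propose(common_map, x_nodes, s_nodes, fence):
    """Phase 1: append proposed entries that describe the full new state."""
    # Start from the committed state, keyed by S node.
    state = {s: bits for (s, committed, bits) in common_map if committed}
    mask = 0
    for x in x_nodes:
        mask |= 1 << x
    for s in s_nodes:
        old = state.get(s, 0)
        # Set the X-node bits to fence, clear them to unfence.
        state[s] = (old | mask) if fence else (old & ~mask)
    common_map.extend((s, False, bits) for s, bits in sorted(state.items()))


def phase2_refresh(common_map):
    """Phase 2: an F node rebuilds its local map from the proposed entries."""
    return {s: bits for (s, committed, bits) in common_map if not committed}


def phase3_commit(common_map):
    """Phase 3 (one selected F node): erase committed entries, then mark
    the proposed entries as committed. The I/O flush is not modeled."""
    common_map[:] = [
        (s, True, bits) for (s, committed, bits) in common_map if not committed
    ]
```

Starting from a map in which node 1 is fenced from S node 4, proposing a fence of node 3 leaves both the old committed entry and the new proposed entry in the common map until phase three replaces the former with the latter.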
Thus a primary object of the present invention is to provide a computer program product for fencing selected ones of the X nodes from access to selected ones of the S nodes, and for unfencing selected ones of said X nodes such that they have access to selected ones of said S nodes.
It is also an object of the present invention to provide a computer program product wherein the lowest numbered node of the F nodes changes the proposed entries of the fence map stored in the common memory to committed entries at the end of the fence/unfence operation.
It is another object of the present invention to provide a computer program product for allowing any node of the plurality of nodes to send a request to the A node to start a fence/unfence operation.
It is another object of the present invention to provide a computer program product wherein a protocol undoes the proposed changes to the fence map in the event that a node fails during the fence/unfence operation.
It is another object of the present invention to provide a computer program product wherein a protocol removes the request from the request queue for processing by the A node in the event that a node fails during the fence/unfence operation.
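The two failure-handling objects above can be illustrated together. The sketch below assumes that fence map entries are tuples (s_node, committed, bitmap), that undoing the proposed changes means discarding every entry whose commit bit still reads proposed, and that the request queue is a simple double-ended queue of request identifiers. The function name `abort_operation` is hypothetical.

```python
from collections import deque


def abort_operation(common_map, request_queue, request_id):
    """Roll back an in-flight fence/unfence operation after a node failure."""
    # Undo: discard every proposed entry; committed entries are untouched,
    # so the fence map returns to its state before the operation began.
    common_map[:] = [entry for entry in common_map if entry[1]]
    # Remove the failed request so that the A node does not process it.
    remaining = [r for r in request_queue if r != request_id]
    request_queue.clear()
    request_queue.extend(remaining)
```

Because committed entries are only erased in the final phase, a failure at any earlier point leaves the previous committed state intact and recoverable in this way.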
REFERENCES:
patent: 4683563 (1987-07-01), Rouse et al.
patent: 4919545 (1990-04-01), Yu
patent: 5301283 (1994-04-01), Thacker et al.
patent: 5313585 (1994-05-01), Jeffries et al.
patent: 5386551 (1995-01-01), Chikira et al.
patent: 5416921 (1995-05-01), Frey et al.
patent: 5423044 (1995-06-01), Sutton et al.
patent: 5568491 (1996-10-01), Beal et al.
patent: 5675724 (1997-10-01), Beal et al.
patent: 5963963 (1999-10-01), Schmuck et al.
patent: 5991264 (1999-11-01), Croslin
patent: 5996075 (1999-11-01), Matena
patent: 5999712 (1999-12-01), Moiin et al.
patent: 6038604 (2000-03-01), Bender et al.
Chung-Sheng Li et al., Automatic Fault Detection, Isolation, and Recovery in Transparent All-Optical Networks, Journal of Lightwave Technology, pp. 1784-1793, Oct. 1997.*
Y. Ofek et al., Generating a Fault-Tolerant Global Clock Using High-Speed Control Signals for the MetaNet Architecture, IEEE Transactions on Communications, pp. 2179-2188, May 1994.*
G. Alari et al., Fault-Tolerant Hierarchical Routing, IEEE International Conference on Performance, Computing, and Communications, pp. 159-165, 1997.*
M. Aldred, A Distributed Lock Manager on Fault Tolerant MPP, System Sciences, pp. 134-136, 1995.*
Sankar, R., et al., An Automatic Failure Isolation and Reconfiguration Metho
Bataille Pierre-Michel
Bragdon Reginald G.
Cutter Lawrence D.
Gonzalez Floyd A.
International Business Machines - Corporation