Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-07-07
2001-06-05
Beausoleil, Robert (Department: 2184)
C713S002000
active
06243828
ABSTRACT:
TECHNICAL FIELD
This invention relates to systems for administering operating systems on a distributed data processing system, and more particularly, the invention relates to remote administration of one or more nodes of the data processing system to provide, for example, mirroring of operating system images and/or designating of alternate volume groups for the one or more nodes.
BACKGROUND OF THE INVENTION
Many computer system customers require systems to be available on a seven-day, twenty-four hour basis. One way to provide this high availability is through redundancy so that no component is a single point of failure. In the case of an AIX operating system, i.e., the International Business Machines Corporation's version of the UNIX operating system, redundancy of the operating system image itself is provided via “mirroring” the operating system to separate physical volumes. However, “mirroring” of the operating system on AIX does not lend itself to mirroring on a distributed computer system such as a RISC System/6000 (RS/6000) Scalable POWERparallel Systems (SP) distributed computer system available from International Business Machines Corporation of Armonk, N.Y.
One particular problem in mirroring operating system images on the SP is that the SP has no central point of control for mirroring. No “central point of control” means that there is no way to collect and display customer directives regarding mirroring, and no way to apply mirroring short of logging on to every SP node. Once mirroring is initiated, no data is available on which nodes are using mirrored volume groups, or on whether any nodes are in a failover condition.
Conventionally, if a customer wishes to mirror volume groups, the customer would have to use, for example, IBM Parallel System Support Program (PSSP version 2.1) to install the nodes without mirroring initiated. After installation, the customer would log into the node and enter the set of commands to initiate mirroring. The customer would then write an additional short script that sets the bootlist of the node each time the node is booted, so that the list of bootable devices reflects the presence of the mirrored volume group. The customer would then have to repeat this procedure for each node on which mirroring is to be initiated. Once mirroring is initiated, the customer would have to log on to each node to determine which nodes are using mirrored volume groups, and whether any node has failed over to a mirrored volume group.
As a related problem, alternate volume groups may need to be created on one or more nodes of the system. A customer may require an alternate volume group when the customer needs to run multiple different copies of the operating system at different times, without forcing a re-install of the node. Different copies of the operating system might be required for different levels of device driver support, or to have “secure” versus “unsecure” levels of data at highly secure installations. Alternate volume groups may present many of the same problems on the SP as mirroring does, again because there is no central point of control for alternate physical volume administration on the SP.
Conventionally, if a customer wishes to use alternate volume groups, for example, to boot a node from different versions of the AIX operating system, the customer would need to enter information via a command or System Management Interface Tool (SMIT) interface to designate the new volume as the volume to install. The customer would then install the alternate volume using, for example, PSSP software. If the customer wishes to change the node to boot from the other alternate device, the customer would have to manually log into the node to modify the bootlist of the node and then reboot the node. As in mirroring, there is no method of determining which nodes are using alternate volume groups short of logging on to every node.
In view of the above, the present invention comprises a method/system of centrally administering alternate and mirrored volume groups of the nodes in a distributed processing system.
DISCLOSURE OF THE INVENTION
Briefly summarized, the present invention comprises in one aspect a distributed processing system which includes multiple processors. One processor of the system is designated a control node and one or more other processors are each designated a target node. The system also has a system data repository (SDR) coupled to the control node, and means for administering at least one of a mirrored volume group or an alternate volume group of the at least one target node from the control node. The means for administering includes: means for storing information in a Node object and a Volume_Group object in the SDR, the Node object and the Volume_Group object providing information on each volume group of the at least one target node in the distributed processing system; and means for performing at least one of mirroring of a volume group or designating an alternate volume group of the at least one target node of the distributed processing system, the means for performing being initiated at the control node remote from the at least one target node.
In another aspect, a distributed processing system is provided which includes multiple processors. One processor is designated a control node and one or more other processors are each designated a target node. A system data repository (SDR) is coupled to the control node and means for administering a volume group on at least one target node is provided. The means for administering includes: means for storing information in a Node object and Volume_Group object in the SDR, the Node object and the Volume_Group object providing information on each volume group of the at least one target node in the distributed processing system; and means for performing at least one of adding, deleting, modifying, or displaying information about at least one volume group of the at least one target node of the distributed processing system using at least one of the Node object and the Volume_Group object in the SDR.
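The adding, deleting, modifying, and displaying of volume group information described above can be illustrated with a minimal sketch. It assumes an in-memory stand-in for the SDR; the class names, field names (`node_number`, `physical_volumes`, `copies`, `selected_vg`), and method names are hypothetical and are not taken from the PSSP software or from this disclosure.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VolumeGroup:
    """Hypothetical Volume_Group record; field names are illustrative only."""
    node_number: int
    name: str
    physical_volumes: tuple
    copies: int = 1  # assumed convention: 1 = unmirrored, 2 or more = mirrored


@dataclass
class Node:
    """Hypothetical Node record naming the volume group the node boots from."""
    node_number: int
    selected_vg: str = "rootvg"


class Repository:
    """In-memory stand-in for the SDR held at the control node."""

    def __init__(self):
        self.nodes = {}
        self.volume_groups = {}

    def add_node(self, node_number):
        self.nodes[node_number] = Node(node_number)

    def add_vg(self, vg):
        if vg.node_number not in self.nodes:
            raise KeyError(f"unknown node {vg.node_number}")
        key = (vg.node_number, vg.name)
        if key in self.volume_groups:
            raise ValueError(f"volume group {vg.name} already exists")
        self.volume_groups[key] = vg

    def modify_vg(self, node_number, name, **changes):
        # Records are immutable, so a modification stores an updated copy.
        key = (node_number, name)
        self.volume_groups[key] = replace(self.volume_groups[key], **changes)
        return self.volume_groups[key]

    def delete_vg(self, node_number, name):
        if self.nodes[node_number].selected_vg == name:
            raise ValueError("cannot delete the volume group the node boots from")
        del self.volume_groups[(node_number, name)]

    def display(self, node_number=None):
        # List every record, or only those of one node, in a stable order.
        rows = sorted(self.volume_groups.values(),
                      key=lambda v: (v.node_number, v.name))
        return [v for v in rows
                if node_number is None or v.node_number == node_number]
```

In this sketch, a query such as `display()` answers from the control node the question that conventionally required logging on to every node: which nodes have mirrored or alternate volume groups.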
In still another aspect, a distributed processing system is provided which includes multiple processors. One processor is designated a control node and one or more other processors are each designated a target node. A system data repository (SDR) is coupled to the control node. The control node is adapted to administer at least one of a mirrored volume group or an alternate volume group of the at least one target node. The control node includes a storage controller adapted to store information in a Node object and a Volume_Group object in the SDR. The Node object and the Volume_Group object provide information on each volume group of the at least one target node in the distributed processing system. The storage controller is further adapted to perform at least one of mirroring of a volume group or designating an alternate volume group of the at least one target node of the distributed processing system.
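As a rough illustration of an operation initiated at the control node and applied to several target nodes in parallel, the following sketch dispatches a per-node action concurrently and collects one result per node. The function names and the `fake_mirror` stand-in are hypothetical; how the action actually reaches a target node (remote shell or otherwise) is left as an assumption and is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def administer_in_parallel(node_numbers, action, max_workers=8):
    """Run action(node_number) for every node concurrently.

    Returns {node_number: (ok, detail)} so that one failing node does not
    hide the results of the others.
    """
    def attempt(node_number):
        try:
            return node_number, (True, action(node_number))
        except Exception as exc:
            return node_number, (False, str(exc))

    node_numbers = list(node_numbers)
    if not node_numbers:
        return {}
    workers = min(max_workers, len(node_numbers))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(attempt, node_numbers))


def fake_mirror(node_number):
    """Hypothetical stand-in for the per-node mirroring step."""
    if node_number == 3:
        raise RuntimeError("no free physical volume")
    return "mirrored rootvg"
```

Calling `administer_in_parallel([1, 2, 3], fake_mirror)` yields a success entry for nodes 1 and 2 and a failure entry for node 3. Collecting a per-node outcome at the control node is a design choice of this sketch, made so that the operator does not need to log on to each node to see what happened.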
To restate, this invention provides for central administration of one or more remote nodes of a distributed data processing system. The invention allows for mirroring of operating system images and/or designating of alternate volume groups for the one or more remote nodes. The invention described herein provides a new data class in the system data repository (SDR) to retain information about volume groups at the nodes of the system. This central repository is coupled to the control node for central administration of the physical volumes of the nodes. An ability to create, modify, and delete this new data class (or volume group information) is also provided, as is the ability to form new volume groups for each node based upon information in the repository. A set of commands is provided to initiate and/or discontinue mirroring on multiple nodal volume groups of the system in parallel from the control node. Further, the invention provides for administering the bootlist on a node or set of nodes remotely and in parallel from the control node, again allowing for mirroring and alternate volume groups. Also, the
Chase-Salerno Michael S.
Ferri Richard
Beausoleil Robert
Gonzalez, Esq. Floyd A.
Heslin & Rothenberg, P.C.
International Business Machines Corp.
Ziemer Rita