Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-07-07
2001-06-12
Beausoleil, Robert (Department: 2184)
C714S006130
active
06247140
ABSTRACT:
TECHNICAL FIELD
This invention relates to administering operating systems on a distributed data processing system, and more particularly, the invention relates to remote administration of one or more nodes of the data processing system to provide, for example, mirroring of operating system images and/or designating of alternate volume groups for the one or more nodes.
BACKGROUND OF THE INVENTION
Many computer system customers require systems to be available on a seven-day, twenty-four hour basis. One way to provide this high availability is through redundancy so that no component is a single point of failure. In the case of an AIX operating system, i.e., the International Business Machines Corporation's version of the UNIX operating system, redundancy of the operating system image itself is provided via “mirroring” the operating system to separate physical volumes. However, “mirroring” of the operating system on AIX does not lend itself to mirroring on a distributed computer system such as a RISC System/6000 (RS/6000) Scalable POWERparallel Systems (SP) distributed computer system available from International Business Machines Corporation of Armonk, N.Y.
One particular problem in mirroring operating system images on the SP is that the SP has no central point of control for mirroring. No “central point of control” means that there is no way to collect and display customer directives regarding mirroring, and no way to apply mirroring short of logging on to every SP node. Once mirroring is initiated, there is no data available on which nodes are using mirrored volume groups, or on whether any nodes are in a failover condition.
Conventionally, if a customer wishes to mirror volume groups, the customer would have to use, for example, IBM Parallel System Support Program (PSSP version 2.1) to install the nodes without mirroring initiated. Post-installation, the customer would log into the node to enter the set of commands to initiate mirroring. The customer would then write an additional short script that would set the bootlist of the node each time the node is booted to reflect the mirrored volume group presence in the list of bootable devices. The customer would then have to repeat this procedure for each node that mirroring is to be initiated on. Once mirroring is initiated, the customer would have to log on to each node to determine which nodes are using mirrored volume groups, and if any node has failed over to a mirrored volume group.
As a related problem, alternate volume groups may need to be created on one or more nodes of the system. A customer may require an alternate volume group when the customer needs to run multiple different copies of the operating system at different times, without forcing a re-install of the node. Different copies of the operating system might be required for different levels of device driver support, or to have “secure” versus “unsecure” levels of data at highly secure installations. Alternate volume groups may present many of the same problems on the SP as does mirroring. Again, this is because there is no central point of control for alternate physical volume administration on the SP.
Conventionally, if a customer wishes to use alternate volume groups, for example, to boot a node from different versions of the AIX operating system, the customer would need to enter information via a command or System Management Interface Tool (SMIT) interface to designate the new volume as the volume to install. The customer would then install the alternate volume using, for example, PSSP software. If the customer wishes to change the node to boot from the other alternate device, the customer would have to manually log into the node to modify the bootlist of the node and then reboot the node. As in mirroring, there is no method of determining which nodes are using alternate volume groups short of logging on to every node.
In view of the above, the present invention comprises a method/system of centrally administering alternate and mirrored volume groups of the nodes in a distributed processing system.
DISCLOSURE OF THE INVENTION
Briefly summarized, the present invention comprises in one aspect at least one computer readable medium for storing data usable by a storage controller coupled to a storage device of a distributed processing system. The medium includes a data structure stored within the at least one computer readable medium. The data structure comprises: Node object information usable by the storage controller in identifying at least one target node within the distributed processing system; and a Volume_Group object comprising information on at least one volume group of said at least one target node in the distributed processing system. The Volume_Group object is usable by the storage controller to remotely administer volume groups of the at least one target node of the distributed processing system.
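The two objects named above can be pictured as a pair of linked records. The following is a minimal sketch only: the text names a Node object and a Volume_Group object but does not list their attributes here, so every field name below (node_number, hostname, physical_volumes, copies) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Identifies one target node (field names are hypothetical)."""
    node_number: int
    hostname: str


@dataclass
class VolumeGroup:
    """Describes one volume group on a target node (field names are hypothetical)."""
    node_number: int              # links the record back to its Node
    name: str                     # for example "rootvg"
    physical_volumes: List[str] = field(default_factory=list)
    copies: int = 1               # assumption: 1 = unmirrored, 2 or more = mirrored


def volume_groups_for(node: Node, groups: List[VolumeGroup]) -> List[VolumeGroup]:
    """Return the volume group records that belong to the given node."""
    return [g for g in groups if g.node_number == node.node_number]


def is_mirrored(group: VolumeGroup) -> bool:
    """Treat a group with more than one copy as mirrored (an assumption of this sketch)."""
    return group.copies > 1
```

Keeping the volume group records separate from the node records, and joining them on a node identifier, lets a single node carry several volume groups, which is what alternate volume groups require.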
In a further aspect, this invention provides at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform a method for administering at least one of a mirrored volume group or an alternate volume group of at least one target node of a computer system. The computer system includes a plurality of processors coupled to a storage controller, the storage controller being coupled to a storage unit. The method includes: storing information in a Node object and a Volume_Group object in the storage unit, the Node object and the Volume_Group object providing information on each volume group of the at least one target node in the computer system; and performing at least one of mirroring of a volume group or designating an alternate volume group of the at least one target node of the computer system. The performing is initiated by the storage controller remote from the at least one target node.
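The central initiation described in this aspect might be sketched as a dispatcher that reads stored requests and starts the work for each target node from one place. This is an illustration under stated assumptions, not the claimed implementation: the request tuple layout, the action names "mirror" and "designate_alternate", the use of a thread pool, and the apply_on_node stand-in are all hypothetical, and no real remote call is made.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical request record: (node_number, volume_group, action).
Request = Tuple[int, str, str]


def apply_on_node(request: Request) -> Tuple[int, str]:
    """Stand-in for the remote step; a real system would contact the node here."""
    node_number, volume_group, action = request
    if action not in ("mirror", "designate_alternate"):
        return node_number, f"rejected unknown action {action!r}"
    return node_number, f"{action} requested for {volume_group}"


def administer(requests: List[Request], max_workers: int = 4) -> Dict[int, str]:
    """Dispatch every request from one central place and collect per-node status."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One status per node; a later request for the same node overwrites an earlier one.
        return dict(pool.map(apply_on_node, requests))
```

The returned mapping gives the operator a single view of what was started on which node, which is the information the background section says is unavailable without logging on to each node.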
In a still further aspect, an article of manufacture is provided which includes at least one computer usable medium having computer readable program code means embodied therein for administering at least one of a mirrored volume group or an alternate volume group on at least one target node in a distributed processing system having a control node coupled to a system data repository (SDR). The computer readable program code means in the article of manufacture includes: computer readable program code means for causing a computer to store information in a Node object and a Volume_Group object in the SDR, the Node object and the Volume_Group object providing information on each volume group of the at least one target node in the distributed processing system; and computer readable program code means for causing a computer to perform at least one of mirroring of a volume group or designating an alternate volume group of the at least one target node of the distributed processing system.
In a yet further aspect of the present invention, an article of manufacture is provided which includes at least one computer usable medium having computer readable program code means embodied therein for administering a volume group on at least one target node of a distributed processing system having multiple processors, one processor being designated a control node and one or more other processors being designated a target node. The control node is coupled to a system data repository (SDR). The computer readable program code means in the article of manufacture includes: computer readable program code means for causing a computer to store information in a Node object and a Volume_Group object in the SDR, the Node object and the Volume_Group object providing information on each volume group of the at least one target node in the distributed processing system; and computer readable program code means for causing a computer to perform at least one of adding, deleting, modifying, or displaying information about at least one volume group of the at least one target node of the distributed processing system using at least one of the Node object and the Volume_Group object in the SDR.
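The four operations listed in this aspect (adding, deleting, modifying, and displaying volume group information) can be illustrated with a small in-memory repository. This is a sketch under assumptions: the class name VolumeGroupRepository, its method names, and the attribute names are hypothetical, and it does not reproduce the SDR interface, which the text does not describe here.

```python
from typing import Dict, Tuple

Key = Tuple[int, str]  # (node_number, volume_group_name)


class VolumeGroupRepository:
    """In-memory stand-in for a central repository (hypothetical, not the SDR interface)."""

    def __init__(self) -> None:
        self._records: Dict[Key, Dict[str, object]] = {}

    def add(self, node_number: int, name: str, **attrs: object) -> None:
        """Create a new volume group record; refuse to overwrite an existing one."""
        key = (node_number, name)
        if key in self._records:
            raise KeyError(f"volume group {name} already defined for node {node_number}")
        self._records[key] = dict(attrs)

    def modify(self, node_number: int, name: str, **attrs: object) -> None:
        """Update attributes of an existing record; raises KeyError if it is absent."""
        self._records[(node_number, name)].update(attrs)

    def delete(self, node_number: int, name: str) -> None:
        """Remove a record; raises KeyError if it is absent."""
        del self._records[(node_number, name)]

    def display(self) -> str:
        """Return one line per record, ordered by node number and then by name."""
        lines = []
        for key in sorted(self._records):
            node_number, name = key
            attrs = self._records[key]
            details = " ".join(f"{k}={v}" for k, v in sorted(attrs.items()))
            lines.append(f"node {node_number} {name} {details}".rstrip())
        return "\n".join(lines)
```

Keying each record by node number and volume group name keeps all nodes in one store, so a single display call answers which nodes have which volume groups.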
To restate, this invention provides fo
Chase-Salerno Michael S.
Ferri Richard
Beausoleil Robert
Cutter Lawrence D.
Gonzalez, Esq. Floyd A.
Heslin & Rothenberg, P.C.
International Business Machines Corporation