Delivery of configuration change in a group

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06493715

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to distributed computing systems, and specifically to treatment of configuration changes in clusters used in distributed computing applications.
BACKGROUND OF THE INVENTION
Computer clusters are widely used to enable high availability of computing resources, coupled with the possibility of horizontal growth, at reduced cost by comparison with collections of independent systems. Clustering is also useful in disaster recovery. A wide range of clustering solutions are currently available, including 390 Sysplex, RS/6000 SP, HACMP, PC Netfinity and AS/400 Cluster, all offered by IBM Corporation, as well as Tandem Himalaya, Hewlett-Packard Mission Critical Server, Compaq TruCluster, Microsoft MSCS, NCR LifeKeeper and Sun Microsystems Project Cascade. An AS/400 Cluster, for example, supports up to 128 computing nodes, connected via any Internet Protocol (IP) network. A developer of a software application can define and use groups of physical computing entities (such as computing nodes or other devices) or logical computing entities (such as files or processes) to run the application within the cluster environment. In the context of the present patent application and in the claims, such entities are also referred to as group members, and the term “entity” is used to refer interchangeably to physical and logical computing entities.
Distributed group communication systems (GCSs) enable applications to exchange messages within groups of cluster entities in a reliable, ordered manner. For example, the OS/400 operating system kernel for the above-mentioned AS/400 Cluster includes a GCS in the form of middleware for use by cluster applications. This GCS is described in an article by Goft et al., entitled “The AS/400 Cluster Engine: A Case Study,” presented at the International Group Communications Conference IGCC 99 (Aizu, Japan, 1999), which is incorporated herein by reference. The GCS ensures that if a message addressed to the entire group is delivered to one of the group members, the message will also be delivered to all other live and connected members of the group, so that group members can act upon received messages and remain consistent with one another. A group member is considered to be “alive” if it is functioning and able to perform a part in a distributed software application. Typically, “liveness” testing procedures are defined and applied by the GCS to determine which members are alive and which are not.
Another well-known GCS is “Ensemble,” which was developed at Cornell University, as were its predecessors, “ISIS” and “Horus.” Ensemble is described in the “Ensemble Reference Manual,” by Hayden (Cornell University, 1997), which is incorporated herein by reference.
A key function of the GCS is to inform software applications running on the computing group of the identities of the connected set of members in the group. Whenever the group configuration changes, due to one or more members leaving the group or new members joining, the GCS sends out a membership change message with a current, updated membership list. For example, the Ensemble system uses a class called Maestro_GroupMember, described at www.cs.cornell.edu/Info/Projects/Ensemble/Maestro/groud.htm to manage and distribute membership change messages. In this Ensemble class and in other systems known in the art, the form of the membership change message is the same whether the departing members have left the group voluntarily or due to a fault, such as a node crash or network failure. Similarly, such membership change messages contain no information as to the state of new group members and whether or not the new members have been members of this group in the past.
SUMMARY OF THE INVENTION
It is an object of some aspects of the present invention to provide improved methods and systems for enabling computer applications running on a cluster of participating entities to deal with membership changes in the cluster.
In preferred embodiments of the present invention, a group communication system (GCS), for use within a group of clustered computing entities, provides membership change messages to software applications running in the group. These messages not only identify which members have joined or left the group, but also indicate the reasons for the membership change. The reasons are typically gleaned by the GCS from various sources, such as network communication and topology layers, information provided by the members who join or leave the group, and diagnostics and control components of the GCS itself. Knowing the reasons for membership changes can be of crucial importance to many distributed applications, and particularly to cluster applications, such as database and cluster management applications, which must maintain a common state or require consistency among the group members.
Although preferred embodiments described herein are based on a GCS, it will be appreciated that the principles of the present invention may similarly be implemented in substantially any distributed computing environment in which there is a mechanism for keeping track of membership of entities in a computing group or cluster. As noted above, such entities may comprise either physical or logical entities.
There is therefore provided, in accordance with a preferred embodiment of the present invention, a method for controlling operation of a computer software application running on a plurality of computing entities, which are members of a group of mutually-linked computing entities running the application within a distributed computing system, the method including:
receiving an indication of a change in membership of the group together with a reason for the change; and
delivering a membership change message to the members, so as to inform the members of the change and of the reason for the change.
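By way of illustration only, the two steps above might be sketched as follows. The names ChangeReason, MembershipChange and Group, and the particular reason codes, are assumptions made for this example and do not appear in the original text; the sketch delivers in-process rather than over a network.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, FrozenSet, Iterable


class ChangeReason(Enum):
    """Hypothetical reason codes; the text names these causes but no encoding."""
    JOINED = auto()
    LEFT_VOLUNTARILY = auto()
    NODE_FAILURE = auto()
    NETWORK_FAILURE = auto()
    REMERGED = auto()


@dataclass(frozen=True)
class MembershipChange:
    members: FrozenSet[str]   # updated membership list after the change
    changed: FrozenSet[str]   # members that joined or left
    reason: ChangeReason      # why the membership changed


class Group:
    """Toy stand-in for the middleware that tracks and announces membership."""

    def __init__(self, members: Iterable[str]) -> None:
        self.members = set(members)
        self.handlers: Dict[str, Callable[[MembershipChange], None]] = {}

    def subscribe(self, member: str, handler: Callable[[MembershipChange], None]) -> None:
        self.handlers[member] = handler

    def on_change(self, joined: Iterable[str], left: Iterable[str],
                  reason: ChangeReason) -> MembershipChange:
        """Receive an indication of a change with its reason, then deliver it."""
        joined, left = frozenset(joined), frozenset(left)
        self.members |= joined
        self.members -= left
        message = MembershipChange(frozenset(self.members), joined | left, reason)
        # Every current member receives the identical message object.
        for member in sorted(self.members):
            handler = self.handlers.get(member)
            if handler is not None:
                handler(message)
        return message
```

In this sketch a call such as on_change(joined=[], left=["c"], reason=ChangeReason.NODE_FAILURE) hands each remaining subscriber one message carrying both the new membership list and the cause.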
Preferably, the indication is received by group communication system middleware, which delivers the membership change message to the members. Further preferably, receiving the indication of the change includes detecting a failure of the group communication system at a node in the distributed computing system.
Additionally or alternatively, receiving the indication of the change includes discovering a topology change in the distributed computing system, wherein discovering the topology change includes detecting a node in the system that has become available to run the application in the group. Preferably, detecting the node that has become available includes determining whether or not the node was previously separated from the group, and delivering the message includes informing the members as to whether or not the node previously belonged to the group.
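One possible way to determine whether a newly available node was previously separated from the group is to remember which nodes were lost. The following is a minimal sketch under that assumption; the class JoinClassifier and its method and field names are hypothetical and are not taken from the original text.

```python
class JoinClassifier:
    """Tracks nodes lost from the group so a later return can be recognized."""

    def __init__(self, initial_members):
        self.current = set(initial_members)
        self.separated = set()  # nodes that were in the group and were lost

    def node_lost(self, node):
        """Record that a current member has become separated from the group."""
        if node in self.current:
            self.current.remove(node)
            self.separated.add(node)

    def node_available(self, node):
        """Admit a node and report whether it previously belonged to the group."""
        previously_in_group = node in self.separated
        self.separated.discard(node)
        self.current.add(node)
        return {
            "node": node,
            "previously_in_group": previously_in_group,
            "members": sorted(self.current),
        }
```

The returned dictionary stands in for the membership change message: its previously_in_group field is the piece of information that tells the members whether the node is new or is returning.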
Further additionally or alternatively, receiving the indication includes receiving notice of a communication failure in a network linking the computing entities or receiving notice of a failure of a node in the distributed computing system. Preferably, receiving the notice of the failure of the node includes receiving a report of a failure in a liveness check of the node.
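The text does not specify how a liveness check is performed. As one common possibility, offered here purely as an assumption, a heartbeat timeout could produce the failure report. LivenessMonitor, its timeout parameter and the reason string are hypothetical names chosen for this sketch, and time is passed in explicitly so that the behavior is deterministic.

```python
class LivenessMonitor:
    """Heartbeat-based liveness check (an assumed mechanism, not the source's)."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated before failure
        self.last_seen = {}     # node name -> time of most recent heartbeat

    def heartbeat(self, node, now):
        """Record that a heartbeat from the node arrived at time `now`."""
        self.last_seen[node] = now

    def check(self, now):
        """Return (node, reason) reports for nodes silent longer than timeout."""
        failed = sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
        for node in failed:
            del self.last_seen[node]  # report each failure only once
        return [(node, "liveness_check_failed") for node in failed]
```

Each report pairs the failed node with a reason, which is the kind of input that would accompany the indication of a membership change.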
Still further additionally or alternatively, receiving the indication includes receiving notice that a new member has joined the group or that one of the members has left the group voluntarily. Preferably, delivering the membership change message includes notifying the other members that the one of the members has left the group voluntarily.
Yet further additionally or alternatively, delivering the membership change message includes notifying the members that one or more members have left the group due to a specified failure in the system or that one or more members, previously separated from the group, have re-merged with the group.
Preferably, delivering the membership change message includes delivering substantially the same message to all of the members of the group, wherein substantially all of the members respond to the message in a mutually-consistent fashion.
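A mutually-consistent response can be illustrated by making each member's reaction a pure function of the delivered message. The sketch below assumes a dictionary-shaped message and a simple policy (no recovery after a voluntary departure; otherwise the lexicographically first surviving member coordinates recovery). Both the message layout and the policy are assumptions for illustration, not part of the original text.

```python
def respond(message):
    """Decide how to react to a membership change.

    The decision depends only on the message contents, so every member
    that receives the same message reaches the same decision.
    """
    survivors = sorted(message["members"])
    if message["reason"] == "left_voluntarily" or not survivors:
        # A clean departure leaves no state to repair in this toy policy.
        return {"action": "none", "coordinator": None}
    # After a failure, all members agree on the same coordinator.
    return {"action": "recover", "coordinator": survivors[0]}
```

Because no member consults local state, delivering substantially the same message to all members is sufficient for them to agree, and the reason field is what lets them distinguish a departure that needs recovery from one that does not.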
There is also provided, in accordance with a preferred embodiment of the present invention, distributed comput
