Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-04-17
2001-06-05
Lim, Krisna (Department: 2153)
C709S249000, C709S239000, C709S228000
active
06243825
ABSTRACT:
FIELD OF THE INVENTION
The invention relates generally to computer network servers, and more particularly to computer servers arranged in a server cluster.
BACKGROUND OF THE INVENTION
A server cluster is a group of at least two independent servers connected by a network and managed as a single system. The clustering of servers provides a number of benefits over independent servers. One important benefit is that cluster software, which is run on each of the servers in a cluster, automatically detects application failures or the failure of another server in the cluster. Upon detection of such failures, failed applications and the like can be terminated and restarted on a surviving server.
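As a rough illustration of the failover behavior just described, the following Python sketch models a cluster that restarts a failed server's applications on a surviving server. The `Cluster` class, its methods, and the least-loaded placement rule are hypothetical and are not taken from the patent or from any cluster product.

```python
class Cluster:
    """Toy model that tracks which applications run on which server."""

    def __init__(self, servers):
        # Map each server name to the set of applications it hosts.
        self.apps_on = {name: set() for name in servers}
        self.alive = set(servers)

    def start(self, app, server):
        self.apps_on[server].add(app)

    def fail_node(self, failed):
        """Mark a server as failed and restart its applications elsewhere."""
        self.alive.discard(failed)
        if not self.alive:
            raise RuntimeError("no surviving server to fail over to")
        # Assumed policy: pick the least-loaded survivor, ties broken by name.
        target = min(sorted(self.alive), key=lambda s: len(self.apps_on[s]))
        moved = self.apps_on[failed]
        self.apps_on[target] |= moved
        self.apps_on[failed] = set()
        return target, sorted(moved)


cluster = Cluster(["node1", "node2"])
cluster.start("mail", "node1")
target, moved = cluster.fail_node("node1")
```

In this sketch, `target` names the surviving server and `moved` lists the applications restarted there.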
Other benefits include the ability for administrators to inspect the status of cluster resources, and accordingly balance workloads among different servers in the cluster to improve performance. Dynamic load balancing is also available. Such manageability also provides administrators with the ability to update one server in a cluster without taking important data and applications offline. As can be appreciated, server clusters are used in critical database management, file and intranet data sharing, messaging, general business applications and the like.
Thus, the failover of an application from one server (i.e., machine) to another in the cluster may be automatic in response to a software or hardware failure on the first machine, or alternatively may be manually initiated by an administrator. However, unless an application is “cluster-aware” (i.e., designed with the knowledge that it may be run in a clustering environment), problems arise during failover.
One problem with existing applications which are not cluster-aware, i.e., legacy applications, is that such applications assume that the current machine name is the only computer name. Consequently, if the application exposes the machine name to clients, or writes the machine name into its persistent configuration information, the system will not function correctly when the application fails over and runs on a different machine having a different machine name. By way of example, an electronic mail application program provides its machine name to other machines connected thereto in a network. If the application is running in a cluster and is failed over to another machine, this other machine's name will not be the name that was provided to the other network machines, and the application will not function correctly.
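The failure mode can be shown with a small Python sketch. Everything in it is hypothetical: `LegacyMailApp` stands in for an application that is not cluster-aware, and the machine names are invented for the example.

```python
class LegacyMailApp:
    """Stand-in for a legacy application that assumes one fixed machine name."""

    def __init__(self, machine_name):
        # The name is captured once, then advertised to clients and saved
        # in persistent configuration.
        self.advertised_name = machine_name

    def restart_on(self, machine_name):
        # After failover the application sees the new machine's name, but
        # clients and saved configuration still hold the old one.
        current_name = machine_name
        return current_name == self.advertised_name


app = LegacyMailApp("NODE-A")            # first start on NODE-A
names_match = app.restart_on("NODE-B")   # failover to NODE-B
print(names_match)  # False: clients still address NODE-A
```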
A cluster-aware application avoids this problem when it is running in a cluster by allowing multiple machine names and calling a cluster-specific application programming interface (API) that returns a virtual computer name regardless of the actual cluster machine on which the application is being run. However, it is not practical to change the many legacy applications so as to be cluster-aware applications. At the same time, other applications on the same machine may need to receive a different computer name (e.g., a different virtual computer name or the actual machine name), rather than one particular virtual computer name, in response to a request for the computer name. As a result, it is not feasible to develop an interface that simply returns a single virtual computer name each time such a request is made by an application.
SUMMARY OF THE INVENTION
The present invention provides a method and system for providing a legacy application with a single virtual machine identity that is independent of the physical machine identity, whereby the application can run on any physical machine in a cluster. The method and system selectively return an appropriate computer name based on whether the application is set for failing over in a cluster.
Briefly, the present invention transparently fails over a computer name with a legacy application running in a server cluster by returning a virtual computer name to the application. The virtual computer name moves with the application regardless of the machine on which it is running. When a cluster receives a request to run an application, a process environment block associated with the application is set up, as described below. If the application is set for failing over in the cluster, the cluster software locates a virtual computer name on which the application is dependent, regardless of the machine on which the application is running, by searching a dependency tree of cluster resources associated with the application. The cluster software then writes the virtual computer name into the process environment block, and the application is run. When the system receives a request from the application to return a computer name thereto, the system looks for the virtual computer name in the process environment block, and, if detected, returns the virtual computer name to the application as the computer name. When the application is not set for failing over in the cluster, the virtual name is not written into the process environment block and the system instead returns the actual machine name.
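The flow summarized above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the patented implementation: a plain dictionary stands in for the process environment block, the key `CLUSTER_VIRTUAL_NAME` is a hypothetical name, the dependency tree is assumed to be acyclic, and all function and resource names are invented for illustration.

```python
VIRTUAL_NAME_KEY = "CLUSTER_VIRTUAL_NAME"  # hypothetical key name


def find_virtual_name(resource, dependencies, names):
    """Depth-first search of the dependency tree for a virtual computer name.

    dependencies maps a resource to the resources it depends on.
    names maps a network-name resource to its virtual computer name.
    The tree is assumed to contain no cycles.
    """
    if resource in names:
        return names[resource]
    for dep in dependencies.get(resource, []):
        found = find_virtual_name(dep, dependencies, names)
        if found is not None:
            return found
    return None


def build_environment(app, fails_over, dependencies, names):
    """Set up the per-process environment before the application is run."""
    env = {}
    if fails_over:
        virtual = find_virtual_name(app, dependencies, names)
        if virtual is not None:
            env[VIRTUAL_NAME_KEY] = virtual
    return env


def get_computer_name(env, machine_name):
    """Return the virtual name if one was written, else the machine name."""
    return env.get(VIRTUAL_NAME_KEY, machine_name)


deps = {"mail_app": ["mail_disk", "mail_netname"], "mail_netname": ["mail_ip"]}
names = {"mail_netname": "MAILSRV"}

clustered_env = build_environment("mail_app", True, deps, names)
local_env = build_environment("other_app", False, deps, names)

print(get_computer_name(clustered_env, "NODE-B"))  # MAILSRV
print(get_computer_name(local_env, "NODE-B"))      # NODE-B
```

Because the name is resolved per process, two applications on the same machine can receive different answers to the same request, which is the selective behavior the summary describes.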
Other benefits and advantages will become apparent from the following detailed description when taken in conjunction with the drawings.
Gamache Rod
Lucovsky Mark
Vert John D.
Lim Krisna
Michalik & Wylie PLLC
Microsoft Corporation