Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2001-06-26
2004-04-27
Le, Dieu-Minh (Department: 2114)
C714S011000
active
06728896
ABSTRACT:
FIELD OF THE INVENTION
This invention involves a clustering environment where multiple servers have shared storage and where the failure of one server can be accommodated by operations on a second server.
BACKGROUND OF THE INVENTION
In the art of computer systems, it is known to provide clustered computing environments. A “clustered” computer system can be defined as a collection of computer resources having some redundant elements. These redundant elements provide flexibility for load balancing among the elements, or for failover from one element to another, should one of the elements fail. From the viewpoint of users outside the cluster, these load-balancing or failover operations are ideally transparent. For example, a mail server associated with a given Local Area Network (LAN) might be implemented as a cluster, with several mail servers coupled together to provide uninterrupted mail service by utilizing redundant computing resources to handle load variations or server failures.
One difficult issue arising in the context of clustered computer systems is the licensing of the operating system or application software running aboard such systems. Typically, multiple-CPU or multiple-server licenses, which enable the licensee to run the licensed software on several systems concurrently, are more expensive than a single-CPU or single-server license. Therefore, whenever possible, the system administrators responsible for clustered systems will favor installing software licensed under a “single-CPU license” in order to reduce license fees.
However, this approach can be problematic should the single server running the licensed software fail, in which case that software will become unavailable to the entire cluster, thereby defeating the purpose of implementing the cluster in the first place. Accordingly, there exists a pressing need in the art for a system enabling “single-license” software to run in a clustered environment while still taking full advantage of the load balancing and failover characteristics of clustered systems.
The Unisys Corporation of Blue Bell, Pennsylvania, produces servers designated as ClearPath LX Servers, which offer a completely integrated heterogeneous small-type processor tower or rack-mounted server. These are provided with a unique emulation architecture whereby each of the servers can combine two operating environments in a single platform, that is to say, the Unisys Master Control Program (MCP) and Microsoft's Windows 2000 Advanced Server.
To execute the Master Control Program (MCP) and associated user applications, the Unisys ClearPath LX Servers require no hardware beyond what comes with the typical Unisys Intel platform running Microsoft Windows 2000 Advanced Server software. The LX Servers continue to execute code for both the Windows 2000 operating system and the MCP operating system.
The ClearPath LX software works to enhance the features and functionalities of already-provided LX software releases. These provide for: (i) open server flexibility, where integrated operating environments work concurrently to support both Windows 2000 Advanced Server and Unisys MCP operating environments; (ii) seamless client-server integration; (iii) scalable computing with entry to mid-range enterprise server performance on current Intel processor technology; (iv) open industry standards, where the enterprise server is built on open standards such as SCSI Ultra, 802.3 (Ethernet) LAN interconnect, fibre channel connections with copper and optical interfaces, and open software interfaces.
The presently described system utilizes the LX platform, which comprises two servers and one shared disk subsystem. A Virtual Machine for ClearPath MCP software is integrated with Microsoft Cluster Service (MSCS) via the MSCS APIs, which allows the Master Control Program (MCP) to fail over on the clustered LX platform. The Virtual Machine for the MCP software thus incorporates the Microsoft cluster APIs, which enables its use with Microsoft Cluster Service (MSCS).
Clustering is a useful and important function for many users. The present system allows customers to run clustered applications with increased reliability, yielding a higher-quality and more reliable product. The use of two separate systems provides a degree of redundancy which, together with the shared disk subsystem, makes accessibility and reliability primary features.
It may be understood that internal disk drives do not normally “fail over” to a standby system under Microsoft's clustering services.
SUMMARY OF THE INVENTION
This invention provides a computer program product loaded onto a storage medium. The product contains executable code that is readable by a computer and that programs the computer to perform method steps which operate on two separate servers having a shared disk resource.
The method involves executing a first operating system on a first server, which has a first and a second operating system; executing the first operating system on a second server; and executing the second operating system on the first server as an application running under the first server. The method then detects any failure of the first server and, in response to the failure, the code transfers execution of the second operating system from the first server to the second operating system of the second server.
Subsequently, the code detects a re-starting of the first server and, in response to the re-start, returns the execution of the second operating system from the second server back to the second operating system of the first server.
The failure and subsequent re-starting of the first server are detected by sensing a “heartbeat” communication on a “Private Network” connection between the server systems. The disappearance of this heartbeat signal from either of the server systems indicates a failure of that server system.
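The heartbeat test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name HeartbeatMonitor, the server labels, and the 3-second timeout are hypothetical and do not come from the patent, which does not specify how long a heartbeat may be absent before a failure is declared.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each server (hypothetical helper)."""
    timeout: float  # seconds of silence before a server is presumed failed (assumed value)
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, server: str, now: float) -> None:
        """Note that a heartbeat from `server` arrived at time `now`."""
        self.last_seen[server] = now

    def is_alive(self, server: str, now: float) -> bool:
        """A server is alive if a heartbeat arrived within the timeout window."""
        seen = self.last_seen.get(server)
        return seen is not None and (now - seen) <= self.timeout


# Timestamps are passed in explicitly so the example is deterministic.
monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("server1", now=0.0)
monitor.record("server2", now=0.0)
monitor.record("server2", now=2.0)  # server1 falls silent after t=0
```

With these inputs, server1 is still considered alive at t=2.0 but is presumed failed at t=4.0, while server2 remains alive. A later heartbeat from server1 would mark it alive again, which corresponds to detecting the re-start.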
At any one time, the second operating system executes on only one server, either the first server or the second server, thereby enabling the clustering environment to operate under the terms of a “single-server” or “single-CPU” license while still realizing the benefits of cluster implementation.
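The failover and return behavior summarized above, together with the single-instance constraint, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the patented implementation: the class Cluster, its method names, and the server labels are hypothetical, and in the described system these decisions are made through the integration with Microsoft Cluster Service rather than by code of this kind.

```python
from typing import List, Optional


class Cluster:
    """Two-server cluster hosting one instance of the second (guest) operating system."""

    def __init__(self, primary: str, standby: str) -> None:
        self.primary = primary
        self.standby = standby
        self.up = {primary: True, standby: True}
        # The guest OS starts on the first (primary) server.
        self.guest_host: Optional[str] = primary

    def on_heartbeat_change(self, server: str, alive: bool) -> None:
        """React to a server's heartbeat appearing or disappearing."""
        self.up[server] = alive
        if alive:
            # Return to the primary on its re-start, or resume on any
            # server if the guest was not running anywhere.
            if server == self.primary or self.guest_host is None:
                self.guest_host = server
        elif server == self.guest_host:
            # The hosting server failed: move the guest to the other
            # server if that server is up, otherwise the guest is down.
            other = self.standby if server == self.primary else self.primary
            self.guest_host = other if self.up[other] else None

    def hosts_running_guest(self) -> List[str]:
        """Servers currently executing the guest OS (never more than one)."""
        return [s for s in (self.primary, self.standby) if s == self.guest_host]


cluster = Cluster("server1", "server2")
cluster.on_heartbeat_change("server1", alive=False)  # failure of the first server
host_after_failure = cluster.guest_host
cluster.on_heartbeat_change("server1", alive=True)   # re-start of the first server
host_after_restart = cluster.guest_host
```

Because the guest's location is held in a single variable, the sketch cannot place the second operating system on both servers at once, which mirrors the single-license condition stated in the text.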
Collins Jason
Cox John Robert
Forbes Steven Lee
Miura Amy Liu
Kozak Alfred W.
Le Dieu-Minh
Rode Lise A.
Starr Mark T.
Unisys Corporation