System for clustering software applications

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06701453

ABSTRACT:

APPENDICES
Appendix A, which forms a part of this disclosure, is a list of commonly owned copending U.S. patents and patent applications. Each one of the patents and applications listed in Appendix A is hereby incorporated herein in its entirety by reference thereto.
Appendix B, which forms part of this disclosure, is a copy of the U.S. provisional patent application filed May 13, 1997, entitled “Clustering of Computer Systems Using Uniform Object Naming and Distributed Sotware For Locating Objects” and assigned Application No. 60/046,327. Page 1, line 7 of the provisional application has been changed from the original to positively recite that the entire provisional application, including the attached documents, forms part of this disclosure.
COPYRIGHT RIGHTS
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to fault tolerant computer systems. More particularly, the invention relates to providing fault tolerant execution of application programs in a server network, by providing a method and system for executing an application program in a backup server if it is determined that a primary server, which normally executes the program, has failed.
2. Description of the Related Technology
As computer systems and networks become more complex and capital intensive, system failures which result in lost data and/or inaccessible applications have become unacceptable. In the computer industry, the reduction of computer failures and computer “downtime” is a major focus for companies trying to achieve a competitive edge over their competitors. The reduction of downtime due to system failures and maintenance is critical to providing quality performance and product reliability to the users and buyers of computer systems. Particularly with respect to server computers which are accessed and utilized by many end users, the reduction of server downtime is an extremely desirable performance characteristic. This is especially true for users who depend on the server to obtain data and information in their daily business operations.
As servers become more powerful, they are also becoming more sophisticated and complex. A server is typically a central computer in a computer network which manages common data and application programs that may be accessed by other computers, otherwise known as “workstations,” in the network. Server downtime, resulting from hardware or software faults or from repair and maintenance, continues to be a significant problem today. By one estimate, the cost of downtime in mission critical environments has risen to an annual total of $4.0 billion for U.S. businesses, with the average downtime event resulting in a $140 thousand loss in the retail industry and a $450 thousand loss in the securities industry. It has been reported that companies lose as much as $250 thousand in employee productivity for every 1% of computer downtime. With emerging internet, intranet and collaborative applications taking on more essential business roles every day, the cost of network server downtime will continue to spiral upward.
Various systems for promoting fault tolerance have been devised. To prevent network down time due to power failure, uninterruptible power supplies (UPS) are commonly used. Basically a rechargeable battery, a UPS provides insurance that a workstation or server will survive during even extended periods of power failures.
To prevent network downtime due to failure of a storage device, data mirroring was developed. Data mirroring provides for the storage of data on separate physical devices operating in parallel with respect to a file server. Duplicate data is stored on separate drives. Thus, when a single drive fails the data on the mirrored drive may still be accessed.
To prevent network downtime due to a failure of a print/file server, server mirroring has been developed. Server mirroring as it is currently implemented requires a primary server and storage device, a backup server and storage device, and a unified operating system linking the two. An example of a mirrored server product is the Software Fault Tolerance level 3 (SFT III) product by Novell Inc., 1555 North Technology Way, Orem, Utah, as an add-on to its NetWare □ 4.x product. SFT III maintains servers in an identical state of data update. It separates hardware-related operating system (OS) functions on the mirrored servers so that a fault on one hardware platform does not affect the other. The server OS is designed to work in tandem with two servers. One server is designated as a primary server, and the other is a secondary server. The primary server is the main point of update; the secondary server is in a constant state of readiness to take over. Both servers receive all updates through a special link called a mirrored server link (MSL), which is dedicated to this purpose. The servers also communicate over the local area network (LAN) that they share in common, so that one knows if the other has failed even if the MSL has failed. When a failure occurs, the second server automatically takes over without interrupting communications in any user-detectable way. Each server monitors the other server's NetWare Core Protocol (NCP) acknowledgments over the LAN to see that all the requests are serviced and that OSs are constantly maintained in a mirrored state.
When the primary server fails, the secondary server detects the failure and immediately takes over as the primary server. The failure is detected in one or both of two ways: the MSL link generates an error condition when no activity is noticed, or the servers communicate over the LAN, each one monitoring the other's NCP acknowledgment. The primary server is simply the first server of the pair that is brought up. It then becomes the server used at all times and it processes all requests. When the primary server fails, the secondary server is immediately substituted as the primary server with identical configurations. The switch-over is handled entirely at the server end, and work continues without any perceivable interruption.
Power supply backup, data mirroring, and server mirroring all increase security against down time caused by a failed hardware component, but they all do so at considerable cost. Each of these schemes requires the additional expense and complexity of standby hardware, that is not used unless there is a failure in the network. Mirroring, while providing redundancy to allow recovery from failure, does not allow the redundant hardware to be used to improve cost/performance of the network.
What is needed is a fault tolerant system for computer networks that can provide all the functionality of UPS, disk mirroring, or server mirroring without the added cost and complexity of standby/additional hardware. What is needed is a fault tolerant system for computer networks which smoothly interfaces with existing network systems. Additionally, what is needed is a method or system of clustering application software programs which may be executed by servers within the network such that a software application being executed on a first server may be “backed-up”, e.g., clustered, by a second server which continues execution of the application if for some reason the first server fails.
SUMMARY OF THE INVENTION
The invention addresses the above and other needs by providing a method and system for clustering software application programs which are executable by one or more servers in a server network.
In one embodiment, a system for fault tolerant execution of an application program in a server network, includes: a first server for executing the application program; a cluster network database, coupled to the first server; an object, stored in the cluster n

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

System for clustering software applications does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with System for clustering software applications, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System for clustering software applications will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3276133

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.