Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2000-06-02
2004-07-13
Baderman, Scott (Department: 2184)
C714S043000
active
06763479
ABSTRACT:
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention is related to the field of computer networks and, more particularly, to maintaining high availability in a two-node computer network which utilizes alternate pathing technology.
2. Description of the Related Art
With the ever expanding use of computer networks throughout society has come an increasing dependence of users on the availability of that network. If a network goes down, or is otherwise unavailable, costs to an enterprise may be significant. Consequently, a number of techniques have arisen which are designed to ensure that a computer network is sufficiently robust that it may detect and respond to problems without significantly impacting users. Frequently, efforts to ensure a computer network is consistently online for its users may be referred to as maintaining “high availability”. A computer network which has in place mechanisms which prevent hardware or software problems from impacting its users may be referred to as a High Availability Network (HAnet). Some of the characteristics which may be considered when defining a HAnet include protection of data (Reliability), continuous access to data (Availability), and techniques for correcting problems which minimally impact users (Serviceability). Collectively these characteristics are frequently referred to as RAS.
In some cases it is desirable to create a computer network which includes a two-node Local Area Network (LAN). For example, it may be desirable to have a two-node LAN consisting of a database server and its corresponding application server. These servers may be connected to each other using the well-known method of crossover cables. A crossover cable is a cable used to connect two computers directly by reversing their respective pin contacts. Using crossover cables may have the advantage of being highly secure, performing well, and eliminating several components typically present in a computer network, such as switches and routers, which could cause a failure. However, while such a configuration may improve reliability and availability in the system, it does not address serviceability and still contains single points of failure. For example, failure of either of the server network interfaces to which the crossover cable is connected will cause the network to be unavailable. Also, failure of the crossover cable itself would result in unavailability of the network.
In some cases, mechanisms may be put in place which detect an error in a network connection and notify the system administrator that a problem exists. The system administrator may then take corrective action, such as switching to a redundant resource. However, such mechanisms typically take some period of time and necessarily involve interruptions in network operation.
In other cases, operating-system-specific mechanisms may be implemented which may facilitate a failover to a redundant connection. Typically these mechanisms operate at layers below the application layer of the protocol stack. Two widely recognized protocol models are TCP/IP and ISO/OSI, each of which includes a highest layer referred to as the application layer. Other communication protocols with a layer corresponding to the application layer may utilize a different name. Generally, those layers below the application layer involve software and mechanisms which are not portable across different operating systems. Consequently, these solutions are not portable and generally require a newly created mechanism for each platform on which a failover is desired.
One technology which provides for redundancy in case of failure is alternate pathing. Alternate pathing is a technology which provides for redundancy to storage in case of a failed I/O controller. In addition to providing for recovery after failure, alternate pathing may also be used to support dynamic reconfiguration. Dynamic reconfiguration is used to logically attach and detach system boards from a running operating system. In addition to providing redundancy to storage, alternate pathing may also be used with network connections. However, alternate pathing does not support automatic failover for network connections. Consequently, the problems described above still remain.
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by a method and mechanism as described herein. A method and mechanism of failover in a system with alternate pathing is described. By utilizing an Application layer mechanism which monitors the primary network connection, automatically detects a failure in the primary connection, and switches to the secondary connection in a short period of time, network availability may be maintained. Advantageously, network interruptions may be minimized and servicing of network problems may be automated by a mechanism which is portable across multiple platforms. Further, because the mechanism operates within the application layer of the communication protocol, no modification of existing operating software is necessary.
Broadly speaking, a method for maintaining high availability in a two node computer network utilizing alternate pathing is contemplated. The method includes adding an Application layer High Availability Networking (HAnet) mechanism to a node of the computer network, monitoring a first network connection, detecting a failure of the first network connection, and performing a failover from the first network connection to the second network connection. The monitoring, failure detection, and failover are all performed by the HAnet mechanism.
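The monitor, detect, and failover steps of the method can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `HAnetMonitor`, the `probe` callback, the path labels, and the `max_failures` threshold are all hypothetical names and parameters chosen for illustration, and the rule of failing over after a fixed number of consecutive missed probes is an assumption rather than something the text specifies.

```python
from typing import Callable


class HAnetMonitor:
    """Hypothetical application-layer monitor for a first and a second connection."""

    def __init__(self, probe: Callable[[str], bool], primary: str,
                 secondary: str, max_failures: int = 3) -> None:
        self.probe = probe                # returns True if the given path answers
        self.primary = primary
        self.secondary = secondary
        self.max_failures = max_failures  # assumed threshold, not from the source
        self.active = primary
        self.failures = 0

    def check_once(self) -> str:
        """Probe the active path once and return the path now in use."""
        if self.probe(self.active):
            self.failures = 0
            return self.active
        # Failure detected on the active path: count consecutive misses.
        self.failures += 1
        if self.failures >= self.max_failures and self.active == self.primary:
            # Failover from the first connection to the second connection.
            self.active = self.secondary
            self.failures = 0
        return self.active
```

In practice `check_once` would presumably be called on a timer, with the probe interval and threshold tuned so that the switch happens in a short period of time. Passing the probe in as a callback keeps the sketch independent of any particular operating system.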
Also contemplated is a network node configured to support alternate pathing which includes a first network interface, a second network interface, and a High Availability Networking (HAnet) mechanism. The included HAnet mechanism operates at the Application layer and is configured to monitor the first network interface. If a failure of the first network interface is detected, the HAnet mechanism is configured to perform a failover from the first network interface to the second network interface.
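One way a node could monitor an interface using only application-layer facilities is a timed TCP connection attempt made through the standard socket API. The sketch below is an assumption about how such a check might look, not the mechanism the patent describes: the function name `path_is_up`, its parameters, and the choice of a TCP connect as the health check are hypothetical. Binding to a local address is used here as a rough way to select the interface, which also depends on the host's routing configuration.

```python
import socket


def path_is_up(peer_addr: str, peer_port: int, local_addr: str = "",
               timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the peer succeeds within the timeout."""
    # Binding to a local address (port 0 = any free port) asks the stack to
    # send from that address; this is an illustrative way to pick an interface.
    source = (local_addr, 0) if local_addr else None
    try:
        with socket.create_connection((peer_addr, peer_port), timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

Because the check relies only on the sockets interface available to ordinary applications, it needs no changes to the operating system, which is the portability property the text attributes to an Application layer mechanism.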
Further contemplated is a two node computer network configured to support alternate pathing and to maintain high availability. The network includes a first node coupled to a second node by two paths. The first node includes a High Availability Networking (HAnet) mechanism which operates at the Application layer. The HAnet mechanism is configured to monitor the first path and perform a failover from the first path to the alternate path in response to detecting a failure of the first path.
Baderman Scott
Lohn Joshua
Meyertons Hood Kivlin Kowert & Goetzel P.C.
Rankin Rory D.
Sun Microsystems Inc.