Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1999-04-19
2002-09-24
Iqbal, Nadeem (Department: 2184)
C714S010000
active
06457138
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to computer networks. In particular, the present invention relates to establishing processor redundancy.
BACKGROUND OF THE INVENTION
A network is a communication system that allows users to access resources on other computers and exchange messages with other users. It is typically a data communication system that links two or more computers and peripheral devices. It allows users to share resources on their own systems with other network users and to access information on centrally located systems or systems that are located at remote offices. It may provide connections to the Internet or the networks of other organizations. The network typically includes a cable that attaches to a network interface card (NIC) in each of the devices within the network. Users may interact with network-enabled software applications to make a network request (such as to get a file or print on a network printer). The application may also communicate with the network software, and the network software may then interact with the network hardware to transmit information to other devices attached to the network.
An example of a network is a local area network (LAN). A LAN is a network that is located in a relatively small area, such as a department or building. A LAN typically includes a shared medium to which workstations attach and communicate with one another by using broadcast methods. With broadcasting, any device on the LAN can transmit a message that all other devices on the LAN can listen to. The device to which the message is addressed actually receives the message. Data is typically packaged into frames for transmission on the LAN.
FIG. 1 is a block diagram illustrating a network connection between a user 10 and a particular web page 20. This figure is an example which may be consistent with any type of network, including a LAN, a wide area network (WAN), or a combination of networks, such as the Internet.
When a user 10 connects to a particular destination, such as a requested web page 20, the connection from the user 10 to the web page 20 is typically routed through several routers 12A-12D. Routers are internetworking devices. They are typically used to connect similar and heterogeneous network segments into internetworks. For example, two LANs may be connected across a dial-up, integrated services digital network (ISDN), or a leased line via routers. Routers may also be found throughout the Internet. End users may connect to a local Internet service provider (ISP) (not shown), which is typically connected via routers to regional ISPs, which are in turn typically connected via routers to national ISPs.
If a router, such as router 12C, fails and is no longer able to route the desired connection, then the desired connection between the user 10 and the desired web page 20 may be significantly delayed or may not be established at all. To avoid this problem, a solution has been implemented by router manufacturers, such as Cisco Systems, that includes two processors, a primary processor and a secondary processor, such that the secondary processor may take over as the main processor if the primary processor has either a hardware or software failure. Accordingly, such a solution provides redundancy to avoid failure of the router.
If the secondary processor is required to become the new primary processor, then the secondary processor typically reboots, establishes itself as the new primary processor, and reinitializes the entire router. The re-booting and reinitializing process can take a substantial amount of time, since software is typically reloaded from either the network or flash memory and the new primary processor needs to run through the router configuration. The router configuration typically controls how the router moves data traffic, and can be highly complex. The more complex the router configuration, the longer it typically takes to configure the router. Re-booting the router may take approximately 30 seconds to 5 minutes.
It would be desirable for a router to provide redundancy without a substantial amount of down time for re-booting. The present invention addresses such a need.
SUMMARY OF THE INVENTION
The present invention relates to providing processor redundancy in a system such as a router. According to an embodiment of the present invention, when a primary processor is about to crash in a system having two or more processors, the imminent crash is identified prior to the occurrence of the actual crash. The primary processor sends a message to the secondary processor to indicate that it is crashing. The primary also sets a timer to determine a period of time to wait prior to crashing. The timer may be used to set a time period prior to crashing in case the old primary processor does not receive an acknowledgement from the secondary processor. When the secondary processor receives the message from the primary processor, the secondary processor becomes the new primary processor. The new primary processor then sends an acknowledgement to the old primary processor. The old primary processor then crashes and reboots as the new secondary processor.
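The sequence described above can be illustrated with a minimal sketch. This is not the patented implementation. The names `Processor`, `handle_imminent_crash`, `peer_reachable`, and `ack_timeout_s` are hypothetical, the timer is modeled as a simple branch rather than a real clock, and the `CRASHED` state used when no acknowledgement arrives is an assumption, since the text only says the timer bounds the wait before crashing.

```python
from dataclasses import dataclass, field
from typing import List

PRIMARY = "primary"
SECONDARY = "secondary"
CRASHED = "crashed"  # assumption: state after a crash with no acknowledgement


@dataclass
class Processor:
    name: str
    role: str
    log: List[str] = field(default_factory=list)

    def on_peer_crashing(self) -> bool:
        """Secondary side: become the new primary, then acknowledge."""
        self.role = PRIMARY
        self.log.append("took over as primary")
        return True  # stands in for the acknowledgement message


def handle_imminent_crash(primary: Processor, secondary: Processor,
                          peer_reachable: bool = True,
                          ack_timeout_s: float = 1.0) -> bool:
    """Primary side: suspend the crash, notify the peer, wait, then crash."""
    primary.log.append("imminent crash identified; crash suspended")
    primary.log.append(f"timer set for {ack_timeout_s}s")

    # Send the "I am crashing" message. An unreachable peer never answers,
    # which models the timer expiring without an acknowledgement.
    acked = secondary.on_peer_crashing() if peer_reachable else False

    if acked:
        primary.log.append("acknowledgement received")
        primary.log.append("crashed and rebooted as secondary")
        primary.role = SECONDARY
    else:
        primary.log.append("timer expired without acknowledgement")
        primary.log.append("crashed")
        primary.role = CRASHED
    return acked


if __name__ == "__main__":
    a, b = Processor("A", PRIMARY), Processor("B", SECONDARY)
    handle_imminent_crash(a, b)
    print(a.role, b.role)
```

In this sketch the secondary changes role before it acknowledges, matching the order given in the summary: the takeover happens first, and the old primary crashes only after the acknowledgement or the timer expiry.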
A method according to an embodiment of the present invention for handling a crash on a redundant processor system is presented. The method comprises determining that a primary processor is about to crash. The method also includes suspending the crash and sending a message to a secondary processor.
Another method according to an embodiment of the present invention for handling a crash on a redundant processor system is presented. The method comprises determining that a secondary processor is about to crash. The method also includes sending a message to a primary processor; sending crash information; and booting the secondary processor.
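The secondary-crash method can be sketched in the same spirit. The names `PrimaryStub`, `notify`, `store_crash_info`, and `handle_secondary_crash` are hypothetical, and sending the crash information to the primary processor is an assumption, since the summary does not state the recipient.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PrimaryStub:
    """Stand-in for the primary processor's receiving side (hypothetical)."""
    notices: List[str] = field(default_factory=list)
    crash_reports: List[str] = field(default_factory=list)

    def notify(self, message: str) -> None:
        self.notices.append(message)

    def store_crash_info(self, info: str) -> None:
        self.crash_reports.append(info)


def handle_secondary_crash(primary: PrimaryStub, crash_info: str) -> List[str]:
    """Run the three steps named in the method, in order."""
    steps = []

    # Step 1: send a message to the primary processor.
    primary.notify("secondary is crashing")
    steps.append("sent message to primary")

    # Step 2: send crash information (recipient assumed to be the primary).
    primary.store_crash_info(crash_info)
    steps.append("sent crash information")

    # Step 3: boot the secondary processor (modeled as a log entry only).
    steps.append("booted secondary")
    return steps


if __name__ == "__main__":
    p = PrimaryStub()
    print(handle_secondary_crash(p, "example crash record"))
```

Unlike the primary-crash case, no role change is involved here: the primary keeps routing traffic while the secondary reports and boots.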
A system according to an embodiment of the present invention for handling a crash on a redundant processor is also presented. The system comprises a primary processor configured to identify a crash, wherein the identification is performed prior to the occurrence of the crash. The primary processor is also configured to suspend the crash and send a message to a secondary processor. The system also includes a memory coupled to the primary processor, wherein the memory is configured to provide instructions.
Lesser Ofrit
May William
Moberg Kenneth
Cisco Technology Inc.
Iqbal Nadeem
Van Pelt & Yi LLP