Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-08-26
2001-05-08
Lim, Krisna (Department: 2153)
C709S242000
active
06230281
ABSTRACT:
FIELD OF THE INVENTION
The invention pertains to communications networks. More particularly, the invention pertains to a method and apparatus for providing back-up redundancy to keep a network in full operation when a manager node on the network becomes disabled.
BACKGROUND OF THE INVENTION
A communications network typically comprises a plurality of network elements which conduct communications over the network. Using a local area network (LAN) of a business office as an example, a personal computer (PC) sitting on someone's desk is a network element. It communicates with other network elements to exchange data, such as communicating with another desktop PC via interoffice e-mail or retrieving a word processing document from a data server on the network.
The network also includes element managers 4a-4c, the function of which is to control communications between the network elements on the network and which are generally invisible to the user of a network element. Each element manager is responsible for controlling a subset of the network elements. In large networks, there may be an even higher control node, termed a network manager, which is in communication with the element managers and generally acts as a manager for the element managers.
The present application is primarily concerned with these larger networks, in which a plurality of element managers each control a plurality of network elements. In such networks, it is frequently desirable to have some type of back-up system to allow network elements to continue to operate even if the element manager which is responsible for controlling the element cannot do so, for instance, due to the manager becoming disabled or due to a fault in the communication path between the manager and the agent.
In one known redundancy back-up scheme, all of the hardware of the manager and/or the data required by the manager for proper operation of the network is duplicated. Thus, if the primary hardware becomes disabled, the secondary hardware simply takes over and keeps the element manager in operation. Such schemes are typically extremely limited in how far apart the two sets of hardware can be from each other due at least to cabling requirements.
One problem with this prior art back-up scheme is that the back-up hardware system is essentially in the same location as the primary system. Accordingly, it cannot offer protection in situations where the cause of the disablement of the primary system is an external force which affects the entire locale. Examples of such events include fire, natural disaster, insurrection and other wartime calamities. Such events are of particular concern in developing nations.
Another known scheme involves having duplicate hardware at a remote location and replicating part of the application data over a high-speed link. If the hardware at the primary location fails, the secondary hardware at the remote location can take over using the replicated data. Such schemes require a costly high-speed data link between the primary hardware and the remote backup hardware. Also, this type of backup scheme is only possible with limited types of networks.
Accordingly, it is an object of the present invention to provide an improved communications network.
It is another object of the present invention to provide an improved back-up scheme for a communications network.
It is yet another object of the present invention to provide a back-up scheme for a communications network wherein the back-up hardware is at a geographically distant location from the primary hardware.
It is a further object of the present invention to provide a remote geographic redundancy scheme for shifting control of network elements from a disabled network manager to one or more other element manager sites.
SUMMARY OF THE INVENTION
The invention is a redundant control scheme for keeping channels of communication with a network element open even when the element manager node that has primary responsibility for controlling communications with that network element is disabled. In particular, each element manager is responsible for controlling one or more network elements. The collection of network elements for which a manager is responsible is termed that manager's domain. A manager's domain comprises two sub-domains (herein all domains and sub-domains are generically referred to as “domains”), namely a primary domain and a secondary domain. A manager's primary domain comprises the network elements for which that element manager has primary responsibility. A manager's secondary domain comprises network elements for which one or more other element managers have primary responsibility, but for which the manager will assume responsibility in the event that the primary manager of that network element becomes disabled. A domain may comprise a geographic area. The primary domain is further broken down into a protected primary domain and a not-protected domain. The protected primary domain comprises all network elements which are participating in the geographic redundancy scheme of the present invention. The not-protected primary domain comprises all network elements which are not participating in the geographic redundancy scheme.
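The domain structure described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `ElementManager`, the function `check_redundancy`, and the manager and element names are all hypothetical, chosen only to show how primary (protected and not-protected) and secondary domains relate and how one might check that every protected element has exactly one secondary manager.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ElementManager:
    """Hypothetical model of one element manager and its domains."""
    name: str
    protected_primary: set[str] = field(default_factory=set)
    unprotected_primary: set[str] = field(default_factory=set)
    secondary: set[str] = field(default_factory=set)

    @property
    def primary(self) -> set[str]:
        # Primary domain = protected part + not-protected part.
        return self.protected_primary | self.unprotected_primary

    @property
    def domain(self) -> set[str]:
        # Full domain = primary domain + secondary domain.
        return self.primary | self.secondary


def check_redundancy(managers: list[ElementManager]) -> dict[str, tuple[str, str]]:
    """Map each protected element to (primary manager, secondary manager).

    Raises ValueError if a protected element does not have exactly one
    secondary manager other than its primary manager.
    """
    pairs: dict[str, tuple[str, str]] = {}
    for m in managers:
        for ne in m.protected_primary:
            backups = [o.name for o in managers if o is not m and ne in o.secondary]
            if len(backups) != 1:
                raise ValueError(f"{ne}: expected 1 secondary manager, found {len(backups)}")
            pairs[ne] = (m.name, backups[0])
    return pairs


# Illustrative topology (names are assumptions, not from the source).
west = ElementManager("EM-West", {"NE-1", "NE-2"}, {"NE-3"}, {"NE-4"})
east = ElementManager("EM-East", {"NE-4"}, set(), {"NE-1", "NE-2"})
pairs = check_redundancy([west, east])
```

Here `NE-3` sits in the not-protected primary domain of `EM-West`, so it has no entry in `pairs`, while each protected element maps to one primary and one secondary manager.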
Every network element participating in the geographic redundancy scheme of the present invention has one primary manager and one secondary manager. When an element manager cannot control one or more of the network elements for which it is primarily responsible, the element managers that are secondary managers for those one or more network elements detect this situation through one of several possible mechanisms. For example, every secondary manager is equipped to poll at fixed intervals the primary manager or managers of all of the network elements in its secondary domain to determine if they are still operating. If the secondary manager detects that a primary manager has not responded to the polling for a predetermined period of time, it assumes that the non-responsive manager is not operating and attempts to gain control of the relevant network elements. The primary manager also may automatically request the secondary manager to assume control of a network element if it cannot communicate with one of its network elements. A control switch also can be effected manually through the primary manager.
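The polling mechanism in the preceding paragraph can be sketched as a simple timeout check. This is an illustrative sketch only: the class `PrimaryMonitor`, the 10-second poll spacing, and the 30-second timeout are assumptions, since the source says only that polling occurs at fixed intervals and that a takeover attempt follows a predetermined period without a response. The clock is injected so the example runs deterministically.

```python
from typing import Callable


class PrimaryMonitor:
    """Hypothetical helper: tracks poll responses from one primary manager."""

    def __init__(self, timeout_s: float, clock: Callable[[], float]):
        self.timeout_s = timeout_s        # assumed "predetermined period"
        self.clock = clock                # injected time source
        self.last_response = clock()      # time of the last answered poll

    def record_poll(self, responded: bool) -> bool:
        """Record one poll result; return True when takeover should begin."""
        now = self.clock()
        if responded:
            self.last_response = now
            return False
        # No answer: treat the primary as down once the silence is long enough.
        return now - self.last_response >= self.timeout_s


# Simulated polls every 10 s with a 30 s timeout (both values are assumptions).
current_time = [0.0]
monitor = PrimaryMonitor(timeout_s=30.0, clock=lambda: current_time[0])
decisions = []
for t, responded in [(10.0, True), (20.0, False), (30.0, False), (40.0, False)]:
    current_time[0] = t
    decisions.append(monitor.record_poll(responded))
```

In this run the primary last answers at 10 s, so the monitor signals a takeover only at the 40 s poll, after 30 s of silence. The other two triggers mentioned above (an automatic request from the primary and a manual switch) would bypass this check entirely.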
Regardless of the mechanism by which a control switch to the secondary manager is initiated, the secondary manager attempts to gain control of the network elements in its secondary domain for which the disabled primary element manager was responsible by requesting the network element to recognize the secondary manager as its manager and to send the secondary manager a complete copy of its MIB data.
Prior to assuming control of a network element in its secondary domain, the only data stored at the secondary manager pertaining to that network element are 1) the identity of that network element's primary manager and 2) a copy of the primary element manager's network level data for the given network element.
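The takeover step and the minimal standby state described in the two preceding paragraphs can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: the classes `NetworkElement`, `StandbyRecord`, and `SecondaryManager`, the method names, and the example MIB and network-level fields are all hypothetical. It shows only the described flow: before takeover the secondary manager holds the primary manager's identity and network-level data, and on takeover it asks each affected element to recognize it and return a complete copy of its MIB data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class NetworkElement:
    name: str
    manager: str
    mib: dict[str, str]

    def accept_manager(self, new_manager: str) -> dict[str, str]:
        """Recognize new_manager and hand back a full copy of the MIB."""
        self.manager = new_manager
        return dict(self.mib)


@dataclass
class StandbyRecord:
    """All a secondary manager holds about an element before takeover."""
    primary_manager: str
    network_level_data: dict[str, str]  # contents here are assumptions


@dataclass
class SecondaryManager:
    name: str
    standby: dict[str, StandbyRecord] = field(default_factory=dict)
    mib_copies: dict[str, dict[str, str]] = field(default_factory=dict)

    def take_over(self, failed_primary: str, elements: list[NetworkElement]) -> list[str]:
        """Assume control of elements whose primary manager has failed."""
        taken = []
        for ne in elements:
            record = self.standby.get(ne.name)
            if record is None or record.primary_manager != failed_primary:
                continue  # not in this secondary domain, or primary still up
            self.mib_copies[ne.name] = ne.accept_manager(self.name)
            taken.append(ne.name)
        return taken


ne1 = NetworkElement("NE-1", manager="EM-West", mib={"sysName": "NE-1"})
ne4 = NetworkElement("NE-4", manager="EM-North", mib={"sysName": "NE-4"})
east = SecondaryManager(
    "EM-East",
    standby={
        "NE-1": StandbyRecord("EM-West", {"site": "A"}),
        "NE-4": StandbyRecord("EM-North", {"site": "B"}),
    },
)
taken = east.take_over("EM-West", [ne1, ne4])
```

Only `NE-1` changes hands, because `NE-4` belongs to a different primary manager that has not failed. The full MIB is fetched at takeover time rather than replicated in advance, which matches the point above that very little per-element state is held at the secondary manager beforehand.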
The secondary managers are primary managers of other network elements in the network. Accordingly, very little, if any, additional hardware is employed to implement this redundancy scheme since the backup managers already are part of the network. Also, the secondary managers can be geographically remote from the primary managers, providing insurance against network failure in the event of failure events that affect entire geographic areas, such as natural disaster or insurrection.
Brodfuhrer Russel E.
Cairns Shaun
Fleisch Michael P.
Gourley David George
Pearcey Simon Mark
Lim Krisna
Lucent Technologies - Inc.
Synnestvedt & Lechner LLP