Reexamination Certificate
2000-06-09
2001-12-18
Beausoleil, Robert (Department: 2181)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S041000, C710S120000, C370S217000, C370S220000
active
06332198
ABSTRACT:
BACKGROUND
A major concern for service providers is network downtime. In pursuit of “five 9's availability,” or 99.999% network uptime, service providers must minimize network outages due to equipment (i.e., hardware) failures and all-too-common software failures. Developers of computer systems often use redundancy measures to minimize downtime and enhance system resiliency. Redundant designs rely on alternate or backup resources to overcome hardware and/or software faults. Ideally, the redundancy architecture allows the computer system to continue operating in the face of a fault with minimal service disruption, for example, in a manner transparent to the service provider's customer.
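To make the "five 9's" target concrete, the following minimal sketch converts an availability fraction into the yearly downtime it permits. The function name and the choice of a 365.25-day year are assumptions made for illustration; they do not come from the patent text.

```python
# Convert an availability target into the downtime budget it implies.
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes(availability: float) -> float:
    """Return the maximum downtime per year, in minutes, for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = allowed_downtime_minutes(0.99999)
three_nines = allowed_downtime_minutes(0.999)
print(f"99.999% availability: {five_nines:.2f} minutes of downtime per year")
print(f"99.9% availability:   {three_nines:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability leaves a budget of a little over five minutes of downtime per year, which is why a single slow failover can consume a large share of it.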
Generally, redundancy designs come in two forms: 1:1 and 1:N. In a so-called “1:1 redundancy” design, a backup element exists for every active or primary element (i.e., hardware backup). In the event that a fault affects a primary element, a corresponding backup element is substituted for the primary element. If the backup element has not been in a “hot” state (i.e., software backup), then the backup element must be booted, configured to operate as a substitute for the failing element, and also provided with the “active state” of the failing element to allow the backup element to take over where the failed primary element left off. The time required to bring the software on the backup element to an “active state” is referred to as synchronization time. A long synchronization time can significantly disrupt system service. In the case of a computer network device, if synchronization is not done quickly enough, hundreds or thousands of network connections may be lost, which directly impacts the service provider's availability statistics and angers network customers.
To minimize synchronization time, many 1:1 redundancy schemes support hot backup of software, which means that the software on the backup elements mirrors the software on the primary elements at some level. The “hotter” the backup element (that is, the more closely the backup mirrors the primary), the faster a failed primary can be switched over, or failed over, to the backup. The “hottest” backup element is one that runs hardware and software simultaneously with a primary element, conducting all operations in parallel with the primary element. This is referred to as a “1+1 redundancy” design and provides the fastest synchronization.
Significant costs are associated with 1:1 and 1+1 redundancy. For example, additional hardware costs may include duplicate memory components and printed circuit boards, including all the components on those boards. The additional hardware may also require a larger supporting chassis. Space is often limited, especially in the case of network service providers who may maintain hundreds of network devices. Although 1:1 redundancy improves system reliability, it decreases both service density and the mean time between failures. Service density refers to the ratio of the net output of a particular device to its gross hardware capability. Net output, in the case of a network device (e.g., switch or router), might include, for example, the number of calls handled per second. Redundancy adds to gross hardware capability but not to the net output and, thus, decreases service density. Adding hardware increases the likelihood of a failure and, thus, decreases the mean time between failures. Likewise, hot backup comes at the expense of system power. Each active element consumes some amount of the limited power available to the system. In general, the 1+1 or 1:1 redundancy designs provide the highest reliability but at a relatively high cost. Due to the importance of network availability, most network service providers prefer the 1+1 redundancy design to minimize network downtime.
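The trade-off described above can be illustrated with a small worked calculation. This is a sketch under stated assumptions, not figures from the patent: every card is assumed to fail independently at the same constant rate (so failure rates add), only primary cards contribute net output, and the per-card MTBF and call capacity are hypothetical numbers.

```python
# Illustrative only: the per-card figures below are hypothetical.
CARD_MTBF_HOURS = 100_000.0   # assumed MTBF of a single card
CALLS_PER_CARD = 1_000.0      # assumed calls/second one card can handle
PRIMARIES = 4                 # cards that actually carry traffic


def aggregate_mtbf_hours(card_mtbf_hours: float, card_count: int) -> float:
    """Mean time between hardware failures anywhere in the chassis.

    Assumes independent cards with constant, identical failure rates,
    so the total failure rate is card_count / card_mtbf_hours.
    """
    return card_mtbf_hours / card_count


def service_density(net_output: float, gross_capability: float) -> float:
    """Ratio of net output to gross hardware capability."""
    return net_output / gross_capability


results = {}
for label, backups in (("none", 0), ("1:N", 1), ("1:1", PRIMARIES)):
    total_cards = PRIMARIES + backups
    results[label] = (
        aggregate_mtbf_hours(CARD_MTBF_HOURS, total_cards),
        service_density(PRIMARIES * CALLS_PER_CARD, total_cards * CALLS_PER_CARD),
    )
    mtbf, density = results[label]
    print(f"{label:>4}: {total_cards} cards, MTBF {mtbf:.0f} h, density {density:.2f}")
```

With these assumed numbers, doubling the card count for 1:1 redundancy halves both the service density and the time between hardware failures, while a single shared spare costs far less on both measures.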
In a 1:N redundancy design, instead of having one backup element per primary element, a single backup element or spare is used to back up multiple (N) primary elements. As a result, the 1:N design is generally less expensive to manufacture, offers greater service density and better mean time between failures than the 1:1 design, and requires a smaller chassis and less space than a 1:1 design. One disadvantage of such a system, however, is that once a primary element fails over to the backup element, the system is no longer redundant (i.e., no available backup element for any primary element). Another disadvantage relates to hot state backup. Because one backup element must support multiple primary elements, the typical 1:N design provides no hot state on the backup element, leading to long synchronization times and, for network devices, the likelihood that connections will be dropped and availability reduced.
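The first disadvantage, loss of redundancy after one failover, can be modeled in a few lines. The class and method names below are hypothetical and the model is deliberately simplified (no repair, no synchronization time); it only shows that a single spare can absorb exactly one primary failure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OneToNGroup:
    """Hypothetical model of N primaries sharing one spare."""
    primaries: List[str]
    spare: str
    spare_covering: Optional[str] = None  # primary the spare has replaced, if any

    def is_redundant(self) -> bool:
        """True while the spare is still free to take over for a primary."""
        return self.spare_covering is None

    def fail(self, primary: str) -> bool:
        """Record a primary failure; return True if the spare took over."""
        if primary not in self.primaries:
            raise ValueError(f"unknown primary: {primary}")
        if self.spare_covering is None:
            self.spare_covering = primary
            return True
        return False  # spare already in use: this failure is not covered


group = OneToNGroup(primaries=["fc1", "fc2", "fc3", "fc4"], spare="fc-spare")
first_covered = group.fail("fc2")
still_redundant = group.is_redundant()
second_covered = group.fail("fc4")
print(first_covered, still_redundant, second_covered)
```

In this model the first failure is covered, the group immediately stops being redundant, and a second failure before the first card is replaced goes uncovered.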
Even where the backup element provides some level of hot state backup, it generally lacks the processing power and memory to provide a full hot state backup (i.e., 1+N) for all primary elements. To enable some level of hot state backup for each primary element, the backup element is generally a “mega spare” equipped with a more powerful processor and additional memory. This requires customers to stock more hardware than in a design with identical backup and primary elements. For instance, users typically maintain extra hardware in case of a failure. If a primary fails over to the backup, the failed primary may be replaced with a new primary. If the primary and backup elements are identical, then users need only stock that one type of board; that is, a failed backup is also replaced with the same hardware used to replace the failed primary. If they are different, then the user must stock each type of board, thereby increasing the user's cost.
SUMMARY
The present invention provides network managers with maximum flexibility in choosing redundancy schemes for their network devices. A network device of the present invention includes separate universal port cards (i.e., printed circuit boards or modules) for interfacing to physical network connections or ports, separate forwarding cards for performing network data processing, and separate cross-connection cards for interconnecting universal port cards and forwarding cards. Separating the universal port cards and forwarding cards enables any path on any port on a universal port card to be connected to any port on a forwarding card through one or more cross-connection cards. In addition, the present invention provides a method and apparatus for allowing multiple redundancy schemes in a single network device. In one network device, the network manager may provide various redundancy schemes including 1:1, 1+1, 1:N, no redundancy, or a combination of redundancy schemes for universal port cards (or ports) and forwarding cards, and the redundancy scheme or schemes for the universal port cards (or ports) may be the same as or different from the redundancy scheme or schemes for the forwarding cards. For example, a network manager may want to provide 1:1 or 1+1 redundancy for all universal port cards (or ports) but only 1:N redundancy for each group of N forwarding cards. As another example, the network manager may provide certain customers with 1:1 or 1+1 redundancy on both the universal port cards (or ports) and forwarding cards to ensure those customers' network availability, while providing other customers that have lower availability requirements with various other combinations of redundancy schemes, for example, 1:1, 1+1, 1:N, or no redundancy for universal port cards (or ports) and 1:N or no redundancy for forwarding cards. The present invention allows customers with different availability/redundancy needs to be serviced by the same network device.
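One way to picture a device that mixes schemes per card type is as a small configuration table. The sketch below is purely illustrative: the enum, the dictionary layout, and the `backups_needed` helper are hypothetical names, not the patent's implementation, and the card counts are made up.

```python
import math
from enum import Enum


class Scheme(Enum):
    """Redundancy schemes named in the text."""
    NONE = "none"
    ONE_TO_ONE = "1:1"
    ONE_PLUS_ONE = "1+1"
    ONE_TO_N = "1:N"


def backups_needed(scheme: Scheme, primaries: int, n: int = 1) -> int:
    """Number of backup cards a scheme implies for a set of primaries."""
    if scheme is Scheme.NONE:
        return 0
    if scheme in (Scheme.ONE_TO_ONE, Scheme.ONE_PLUS_ONE):
        return primaries  # one backup per primary
    if n < 1:
        raise ValueError("n must be at least 1 for 1:N redundancy")
    return math.ceil(primaries / n)  # one spare per group of up to n primaries


# Hypothetical device: 1+1 on port cards, 1:N (N=4) on forwarding cards.
device = {
    "universal_port_cards": {"scheme": Scheme.ONE_PLUS_ONE, "primaries": 4},
    "forwarding_cards": {"scheme": Scheme.ONE_TO_N, "primaries": 8, "n": 4},
}

backup_counts = {
    card_type: backups_needed(cfg["scheme"], cfg["primaries"], cfg.get("n", 1))
    for card_type, cfg in device.items()
}
print(backup_counts)
```

Keeping the scheme as a per-card-type setting mirrors the idea that port cards and forwarding cards can be protected independently within one chassis.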
Branscomb Brian
Fox Barbara A.
Kidder Joseph D.
Langrind Nicholas A.
Noel Chris R.
Beausoleil Robert
Davis, Esq. Patricia
Engellenner, Esq. Thomas J.
Equipe Communications Corporation
Mollaaghababa, Esq. Reza
Network device for supporting multiple redundancy schemes