Reexamination Certificate
2000-04-13
2003-10-14
Beausoliel, Robert (Department: 2184)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C370S228000, C714S043000
active
06633996
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to fault-tolerant computer systems and more particularly to a dedicated maintenance bus for use with such computer systems.
2. Background Information
Fault-tolerant computer systems are employed in situations and environments that demand high reliability and minimal downtime. Such computer systems may be employed in the tracking of financial markets, the control and routing of telecommunications and in other mission-critical functions such as air traffic control.
A common technique for incorporating fault-tolerance into a computer system is to provide a degree of redundancy to various components. In other words, important components are often paired with one or more backup components of the same type. As such, two or more components may operate in a so-called lockstep mode in which each component performs the same task at the same time, while only one is typically called upon for delivery of information. Where data collisions, race conditions and other complications may limit the use of lockstep architecture, redundant components may be employed in failover mode. In failover mode, one component is selected as a primary component that operates under normal circumstances. If a failure in the primary component is detected, then the primary component is bypassed and the secondary (or tertiary) redundant component is brought on line. A variety of initialization and switchover techniques are employed to make a transition from one component to another during runtime of the computer system. A primary goal of these techniques is to minimize downtime and corresponding loss of function and/or data.
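The failover mode described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Component` class, its `healthy` flag (standing in for whatever fault detection a real system uses) and the `run_with_failover` helper are hypothetical names, not part of the patent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Component:
    """Hypothetical redundant component; `healthy` stands in for a real fault detector."""
    name: str
    healthy: bool = True

    def run(self, task: Callable[[], str]) -> str:
        # A failed component cannot complete the task.
        if not self.healthy:
            raise RuntimeError(f"{self.name} has failed")
        return task()


def run_with_failover(components: List[Component], task: Callable[[], str]) -> str:
    """Try the primary first, then each backup in order (secondary, tertiary, ...)."""
    for component in components:
        try:
            return f"{component.name}: {component.run(task)}"
        except RuntimeError:
            continue  # bypass the failed component and try the next one
    raise RuntimeError("all redundant components have failed")


primary = Component("primary")
secondary = Component("secondary")

# Normal operation: the primary handles the task.
result_normal = run_with_failover([primary, secondary], lambda: "done")

# Simulated primary failure: the secondary is brought on line.
primary.healthy = False
result_failover = run_with_failover([primary, secondary], lambda: "done")
```

A lockstep arrangement would instead run the task on every component at once and deliver only one result; the sketch covers failover mode only.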
Fault-tolerant computer systems are often costly to implement since many commercially available components are not specifically designed for use in redundant systems. It is desirable to adapt conventional components and their built-in architecture whenever possible. All modern computer systems have particular capabilities directed to control and monitoring of functions. For example, large microprocessor chips such as the Pentium III™, available from Intel Corporation of Santa Clara, Calif., are designed to operate within a specific temperature range that is monitored by a commercially available environmental/temperature-sensing chip. One technique for interconnecting such an environmental monitor or other monitoring and control devices is to utilize a dedicated maintenance bus. The maintenance bus is typically separate from the system's main data and control bus structure. The maintenance bus generally connects to a single, centralized point of control, often implemented as a peripheral component interconnect (PCI) device.
However, as discussed above, conventional maintenance bus architecture is not specifically designed for redundant operation. Accordingly, prior fault-tolerant systems have utilized a customized architecture for transmitting monitor and control signals over the system's main buses (or dedicated proprietary buses) using, for example, a series of application specific integrated circuits (ASICs) mounted on each circuit board being monitored. To take advantage of current, commercially available maintenance bus architecture in a fault-tolerant computing environment, a more comprehensive and cost-effective approach is needed.
Accordingly, it is an object of this invention to provide maintenance bus architecture having a high degree of fault-tolerance. This maintenance bus architecture should be interoperable with commercially available components and should allow a fairly high degree of versatility in terms of monitoring and control of important computer system components.
SUMMARY OF THE INVENTION
This invention overcomes the disadvantages of the prior art by providing a fault-tolerant maintenance bus architecture that includes two maintenance buses interconnecting each of a plurality of printed circuit boards, termed “parent” circuit boards. The two maintenance buses are each connected to a pair of system management modules (SMMs) that are configured to perform a variety of maintenance bus activities. The SMM can comprise any acceptable device for driving commands on the maintenance bus arrangement. Within each parent board is a pair of redundant bridges, each having a unique address. One bridge is connected to the first maintenance bus while a second bridge is connected to the second maintenance bus of the pair. A child maintenance bus interconnects the two bridges through a “child” printed circuit board. The introduction of a separate board to implement the child maintenance bus can be useful, but is not essential according to this invention. The child maintenance bus is itself interconnected with a variety of monitor and control functions on maintenance bus-compatible subsystem components. The SMMs can address components on each child printed circuit board individually and receive appropriate responses therefrom. In the event of a bus or bridge failure, the SMM can still communicate with the child subsystem components via the redundant bus and bridge.
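The redundant path described above (two maintenance buses, one bridge per bus on each parent board, and a shared child bus behind the bridges) can be modeled roughly as follows. This is a sketch under stated assumptions, not the patented implementation: the class names, the bus labels "A" and "B", the bridge addresses and the `temp_sensor` device are all hypothetical, and no real bus protocol is modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Bridge:
    address: int    # unique bridge address (illustrative value)
    bus: str        # maintenance bus this bridge sits on: "A" or "B" (assumed labels)
    ok: bool = True


@dataclass
class ParentBoard:
    bridges: List[Bridge]
    # Devices on the child maintenance bus, reachable through either bridge.
    child_devices: Dict[str, int] = field(default_factory=dict)


class SMM:
    """Hypothetical system management module that knows which buses are usable."""

    def __init__(self, bus_ok: Dict[str, bool]):
        self.bus_ok = bus_ok

    def read(self, board: ParentBoard, device: str) -> Tuple[str, int]:
        # Use the first bus/bridge pair that is fully working.
        for bridge in board.bridges:
            if self.bus_ok.get(bridge.bus, False) and bridge.ok:
                return bridge.bus, board.child_devices[device]
        raise ConnectionError("no working bus/bridge path to the child bus")


board = ParentBoard(
    bridges=[Bridge(address=0x10, bus="A"), Bridge(address=0x11, bus="B")],
    child_devices={"temp_sensor": 42},
)
smm = SMM(bus_ok={"A": True, "B": True})

first_read = smm.read(board, "temp_sensor")    # goes over bus A
board.bridges[0].ok = False                    # simulate a bridge failure on bus A
second_read = smm.read(board, "temp_sensor")   # falls back to bus B
```

The child device is reached through whichever bridge still works, so a single bus or bridge failure does not cut the SMM off from the monitored components.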
The bridge can include an interconnection to a further bridge. This remote bridge can, itself, be interconnected to additional microprocessors and associated memory. The remote bridge is addressed through one of the parent board's bridges so that communication to and from the SMM can occur. The SMM can be interconnected with a variety of other computer system peripherals and components, and can be accessed over a local network or through an Internet-based communication network.
Amato Joseph S.
Joyce Paul
Suffin A. Charles
Beausoliel Robert
Chu Gabriel
Stratus Technologies Bermuda Ltd.
Testa Hurwitz & Thibeault LLP