Method and apparatus for managing redundant computer-based...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate


Details

C714S011000

active

06178522

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to computing environments and, more particularly, to a method for managing redundant computer-based systems for fault-tolerant computing.
2. Background of the Invention
Fault-tolerant computing assures correct computing results in the presence of faults and errors in a system. Redundancy is the primary means of achieving fault tolerance, and it can be managed in many different ways across hardware, software, information and time. Because algorithms and implementation approaches vary, most current systems use proprietary designs for redundancy management, and these designs are usually interwoven with the application software and hardware. Interweaving the application with the redundancy management creates a more complex system with significantly decreased flexibility.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a method for managing redundant computer-based systems that is not interwoven with the application and that provides additional flexibility in the distributed computing environment.
According to an embodiment of the present invention, the redundant computing system is constructed by using multiple hardware computing nodes and installing a redundancy management system (RMS) in each individual node in a distributed environment.
The RMS is a redundancy management methodology implemented through a set of algorithms, data structures, operation processes and design applied through processing units in each computing system. The RMS has wide application in areas that require high system dependability, such as aerospace, critical control systems, telecommunications and computer networks.
In implementation, the RMS is separated, physically or logically, from the application development. This reduces the overall design complexity of the system at hand. The system developer can therefore design applications independently and rely on the RMS to provide redundancy management functions. The RMS is integrated with the application through a programmable bus interface protocol that connects the RMS to the application processors.
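The separation described above can be illustrated in software terms. The following is a minimal sketch only, assuming a hypothetical two-call interface (`submit` and `voted`) and a single-node stand-in called `LoopbackRMS`; none of these names come from the patent, and the actual integration is a bus interface protocol rather than a Python interface.

```python
from typing import Dict, Protocol


class RedundancyManager(Protocol):
    """Hypothetical interface the application sees (not from the patent)."""

    def submit(self, name: str, value: int) -> None: ...

    def voted(self, name: str) -> int: ...


class LoopbackRMS:
    """Single-node stand-in: hands back exactly what was submitted."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}

    def submit(self, name: str, value: int) -> None:
        self._values[name] = value

    def voted(self, name: str) -> int:
        return self._values[name]


def control_step(rms: RedundancyManager, sensor_reading: int) -> int:
    """Application logic: contains no voting or synchronization code."""
    rms.submit("altitude", sensor_reading)
    # The application consumes whatever value the RMS agreed on.
    return rms.voted("altitude") * 2
```

Because `control_step` depends only on the interface, a redundant implementation could in principle replace `LoopbackRMS` without changing the application code.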
The RMS includes a Cross Channel Data Link (CCDL) module and a Fault Tolerant Executive (FTE) module. The CCDL module provides data communication between all nodes while the FTE module performs system functions such as synchronization, voting, fault and error detection, isolation and recovery. System fault tolerance is achieved by detecting and masking erroneous data through voting, and system integrity is ensured by a dynamically reconfigurable architecture that is capable of excluding faulty nodes from the system and re-admitting healthy nodes back into the system.
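As an illustration of masking erroneous data through voting, the sketch below assumes a simple strict-majority vote over one integer value per node. The text above does not specify the voting algorithm, so the function name `vote`, the node labels and the majority rule are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Optional, Set, Tuple


def vote(values: Dict[str, int]) -> Tuple[Optional[int], Set[str]]:
    """Return (majority value or None, nodes that disagreed with it).

    `values` maps a node label to the value that node reported.
    Assumed rule: a value wins only with a strict majority of reports.
    """
    if not values:
        return None, set()
    winner, count = Counter(values.values()).most_common(1)[0]
    if count * 2 <= len(values):
        # No strict majority: nothing can be masked or blamed.
        return None, set()
    deviating = {node for node, v in values.items() if v != winner}
    return winner, deviating
```

With three nodes reporting 7, 7 and 9, the voted result is 7 and the third node is flagged as deviating. Under this assumed rule, two nodes that disagree produce no result, which is one reason the node count matters for fault masking.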
The RMS can be implemented in hardware, software, or a combination of both (i.e., a hybrid), and works with a distributed system that has redundant computing resources to handle component failures. The distributed system can have two to eight nodes, depending on system reliability and fault tolerance requirements. A node consists of an RMS and one or more application processors. Nodes are interconnected through the RMS's CCDL module to form a redundant system. Since individual applications within a node do not have full knowledge of other nodes' activities, the RMSs provide system synchronization, maintain data consistency, and form a system-wide consensus of faults and errors occurring in various locations in the system.
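One way to picture a system-wide consensus on faults is sketched below. It assumes that each node reports the set of nodes it suspects, and that a node is excluded when a strict majority of the current members accuse it. This rule, the function name `update_operating_set` and the node labels are assumptions for illustration; the text above does not state how the consensus is formed.

```python
from typing import Dict, Set

MIN_NODES, MAX_NODES = 2, 8  # system size range stated in the text


def update_operating_set(
    members: Set[str], accusations: Dict[str, Set[str]]
) -> Set[str]:
    """Drop every member accused by a strict majority of current members.

    `accusations` maps a reporting node to the nodes it suspects.
    Self-accusations and reports from non-members are ignored.
    """
    if not MIN_NODES <= len(members) <= MAX_NODES:
        raise ValueError("system must have two to eight nodes")
    excluded = set()
    for suspect in members:
        accusers = sum(
            1
            for reporter in members
            if reporter != suspect
            and suspect in accusations.get(reporter, set())
        )
        if accusers * 2 > len(members):
            excluded.add(suspect)
    return members - excluded
```

For example, with members A, B, C and D, where A, B and C each accuse D and D accuses A, only D is excluded: a single accusation against A does not reach a majority. Every node applying the same rule to the same exchanged reports would arrive at the same operating set.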


