Synchronization of processors in a fault tolerant...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S010000

active

06223304

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field of the Invention
The present invention relates to general processing systems and, in particular, to a system and method for synchronizing processors in a fault tolerant multi-processor system.
2. Description of Related Art
Fault tolerant systems, such as fault tolerant computer systems, are used in real-time systems that must be up and running 24 hours a day. These systems are normally implemented using two or more redundant processing units executing the same programs in synchronization. Special methods are used to keep the processing units in synchronization, to detect and localize faults, and to reintegrate replaced units. Because specially designed hardware is often required to accomplish these functions, fault tolerant systems are usually relatively complicated to design.
It would be advantageous if the design of fault tolerant systems could be simplified by using, as often as possible, commercially available components, such as state of the art microprocessors and memory, that are not specially designed for fault tolerance. This would make fault tolerant systems less expensive, easier to design, and easier to upgrade when faster compatible components become available.
One way to simplify the design of fault tolerant systems is to lessen the requirement for run-time synchronization between the processing units. This approach simplifies the interaction between the processing units and makes it easier to use commercially available components in their design. At the same time, however, it becomes more difficult to synchronize a processing unit that has been out of synchronization with the other processing units, for example, when a processing unit is replaced. The working processing unit has local memory where information on all executing programs is stored. This state related information includes data describing the state of each executing program (each executing program consisting of a number of executing processes), data variables used by each executing program, etc. The replaced processing unit has to get its local memory updated with this information from one of the working processing units before the replaced processing unit can be brought into parallel operation again.
A simple method to update a replaced processing unit's local memory is to temporarily stop the normal program execution in the working processing units while copying all state related information from the working processing units to the replaced processing unit. However, this approach delays the normal program execution by an amount that is proportional to the amount of information that has to be updated and inversely proportional to the available bandwidth of the communication channel between the processing units, which is used to copy that information. In most cases this would require a very high bandwidth communication channel to keep the operational delay within what can be accepted in a fault tolerant system.
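The proportionality described above can be illustrated with a small calculation. This is only a sketch: the function names, the units (bytes and bytes per second), and the example figures are assumptions chosen for illustration and do not come from the patent text.

```python
def stop_and_copy_delay(state_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Seconds that normal execution stays suspended while all state is copied.

    Models delay = amount of information / channel bandwidth.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return state_bytes / bandwidth_bytes_per_s


def required_bandwidth(state_bytes: int, max_delay_s: float) -> float:
    """Bandwidth (bytes/s) needed to copy all state within an acceptable delay."""
    if max_delay_s <= 0:
        raise ValueError("acceptable delay must be positive")
    return state_bytes / max_delay_s


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical figures: 64 MiB of state over a 1 MiB/s link.
    print(stop_and_copy_delay(64 * MIB, 1 * MIB))
    # Bandwidth needed to keep the suspension at or below half a second.
    print(required_bandwidth(64 * MIB, 0.5))
```

With these assumed figures, a modest link implies a long suspension, and a short acceptable suspension implies a high bandwidth link, which is the trade-off the paragraph describes.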
Another method used to update a replaced processing unit is to keep executing the normal programs while a background process copies all state related information from the working processing units to the replaced processing unit. Any changes made to state related information in the working processing units' local memory during the background copying process are transferred to the replaced processing unit in real time over a communication channel between the working processing units and the replaced processing unit. This approach requires the bandwidth of the communication channel between the processing units to meet the maximum frequency of the information changes in the executing programs, which again complicates the interaction between the processing units.
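The background copying approach can be sketched as a toy simulation. Everything here is a hypothetical illustration: the state is modeled as a Python dict, the `writes` schedule stands in for changes made by the normally executing programs, and each forwarded change represents one message on the communication channel.

```python
def background_copy(source: dict, writes: dict) -> tuple[dict, int]:
    """Copy `source` item by item while execution continues.

    `writes` maps a copy step to the list of (key, value) changes that the
    running programs make right after that step (an assumed schedule).
    Returns the replica and the number of changes forwarded in real time.
    """
    replica = {}
    forwarded = 0
    for step, key in enumerate(list(source)):
        replica[key] = source[key]            # background copy of one item
        for k, v in writes.get(step, []):     # normal execution continues
            source[k] = v                     # change in the working unit
            replica[k] = v                    # same change forwarded to the replica
            forwarded += 1
    return replica, forwarded


if __name__ == "__main__":
    state = {"x": 0, "y": 0, "z": 0}
    replica, sent = background_copy(state, {0: [("z", 5)], 1: [("x", 7)]})
    print(replica == state, sent)
```

In this model every change made during the copy costs one forwarded message, so the channel has to keep up with the peak rate of changes, which is the bandwidth requirement noted above.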
It is, therefore, one object of the present invention to provide a simplified, yet highly reliable design for a fault tolerant system. Another object of the present invention is to use commercially available components in the design of fault tolerant systems as often as possible. A further object of the present invention is to simplify the interaction between processing units in a fault tolerant system. Still another object is to provide an improved method of re-integrating replaced units into a fault tolerant system.
SUMMARY OF THE INVENTION
The present invention is directed to a system and method for synchronizing processors in a fault tolerant system, and more specifically, to a fault tolerant multi-processor system built of loosely coupled processor units with a low bandwidth communication link between the processor units.
The synchronization is accomplished in two steps. First, in a normal execution mode, a first part of the state related information is copied from a working processor to a processor that is out of synchronization. Second, while normal execution is temporarily suspended, the remaining second part of the state related information is copied from the working processor to the processor that is out of synchronization.
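The two-step synchronization summarized above might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Processor` class, the `running` flag, and the `first_part_keys` argument are hypothetical, and the sketch assumes the caller already knows which items belong to the first part and that those items do not change while they are being copied.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical processor model: local memory plus an execution flag."""
    memory: dict = field(default_factory=dict)
    running: bool = True


def synchronize(working: Processor, target: Processor, first_part_keys: set) -> list:
    """Bring `target` up to date in two phases; return a log of (phase, working.running)."""
    log = []

    # Phase 1: the working processor keeps executing normally while the
    # first part of the state related information is copied.
    for key in first_part_keys:
        target.memory[key] = working.memory[key]
    log.append(("phase1", working.running))

    # Phase 2: normal execution is temporarily suspended while the
    # remaining second part is copied.
    working.running = False
    try:
        for key, value in working.memory.items():
            if key not in first_part_keys:
                target.memory[key] = value
        log.append(("phase2", working.running))
    finally:
        working.running = True   # resume normal execution

    target.running = True        # target rejoins parallel operation
    return log


if __name__ == "__main__":
    working = Processor({"code": "v1", "config": 42, "counter": 7})
    replaced = Processor(running=False)
    print(synchronize(working, replaced, {"code", "config"}))
    print(replaced.memory == working.memory)
```

The point of the split is that only the second part is copied during the suspension, so the suspension time depends on the size of that part rather than on the size of the whole state.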


