Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Patent
1997-01-28
1999-04-06
Lim, Krisna
714/10, G06F 11/00
active
058928952
ABSTRACT:
A method and apparatus for detecting and tolerating situations in which one or more processors in a multi-processor system cannot participate in timer-driven or timer-triggered protocols or event sequences. The multi-processor system includes multiple processors each having a respective memory. These processors are coupled by an inter-processor communication network (preferably consisting of redundant paths).
A processor is suspected of having failed outright (ceased operations) or of having a failed timer mechanism when other processors detect the absence of its periodic "IamAlive" messages. When this happens, all of the processors in the system are subjected to a series of stages in which they repeatedly broadcast their status and their connectivity to each other. During the first such stage, according to the present invention, a processor will not assert its ability to participate unless its timer mechanism is working: it arms a timer expiration event and does not assert its health unless and until that event occurs.
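The two ideas in the abstract, suspicion on missing "IamAlive" messages and a first stage gated on a timer expiration, can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the `Processor` class, the `suspects` and `first_stage` functions, the integer clock, and the `timeout` and `delay` parameters are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Processor:
    """Hypothetical model of one processor; not taken from the patent text."""
    pid: int
    timer_works: bool = True            # False models a failed timer mechanism
    armed_at: Optional[int] = None      # simulated time at which the timer was armed
    timer_fired: bool = False

    def arm_timer(self, now: int) -> None:
        # Arm a timer expiration event at the start of the first stage.
        self.armed_at = now
        self.timer_fired = False

    def tick(self, now: int, delay: int) -> None:
        # The expiration event is delivered only if the timer mechanism works.
        if self.timer_works and self.armed_at is not None and now - self.armed_at >= delay:
            self.timer_fired = True

    def asserts_health(self) -> bool:
        # Health is asserted only after the expiration event has occurred.
        return self.timer_fired


def suspects(last_heard: Dict[int, int], now: int, timeout: int) -> Set[int]:
    """Return processors whose last "IamAlive" message is older than `timeout`."""
    return {pid for pid, t in last_heard.items() if now - t > timeout}


def first_stage(processors: List[Processor], start: int, delay: int) -> Set[int]:
    """Run a simulated first stage and return the processors that assert health."""
    for p in processors:
        p.arm_timer(start)
    for now in range(start, start + delay + 1):
        for p in processors:
            p.tick(now, delay)
    return {p.pid for p in processors if p.asserts_health()}
```

In this sketch a processor created with `timer_works=False` never receives its expiration event, so it stays out of the set returned by `first_stage`, which mirrors the abstract's rule that a processor with a non-working timer does not assert its ability to participate.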
REFERENCES:
patent: 4868818 (1989-09-01), Madan et al.
patent: 4879716 (1989-11-01), McNally et al.
patent: 5473771 (1995-12-01), Burd et al.
patent: 5649092 (1997-07-01), Price et al.
patent: 5687308 (1997-11-01), Jardine et al.
patent: 5809223 (1998-09-01), Lee et al.
Flaviu Cristian et al.; Automatic Service Availability Management in Asynchronous Distributed Systems; Configurable Distributed Systems, 1994 Int'l Workshop; pp. 58-68, Jun. 1994.
Flaviu Cristian et al.; Autonomous Decentralized Systems, 1993 Int'l Symp.; pp. 360-366, Feb. 1993.
Copy of International Search Report for PCT/US98/01484 dated Aug. 4, 1998.
Basavaiah Murali
Krishnakuma Karoor S.
Murthy Srinivasa D.
Coulter Kenneth R.
Lim Krisna
Tandem Computers Incorporated