Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Patent
1998-01-30
2000-02-15
Beausoliel, Jr., Robert W.
714/15, 714/20, 714/25, 714/38, 714/19, 714/23, 714/11, 709/105, 709/104, 709/224, 709/226, G06F 11/00
active
060264993
ABSTRACT:
A scheme for restarting processes at distributed checkpoints in a client-server computer system, in which a fault in one client computer does not affect the server computer or the other client computers. In this scheme, a fault occurring in one of the plurality of computers constituting a client-server computer system is detected while the computers are executing their respective processes, and it is judged whether the computer in which the fault is detected is a server computer. Related processes executed on the plurality of computers are then restarted when that computer is judged to be the server computer, whereas no process executed on the plurality of computers is restarted when it is not. It is also possible to specify restart information for each process, indicating how processes should be restarted when a process fault occurs in that process, and to restart selected processes executed on the plurality of computers according to the restart information for the process in which the process fault is detected.
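The decision logic described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the patented implementation: the `Process` record, the `RestartPolicy` values, and the function names `on_computer_fault` and `on_process_fault` are all hypothetical, and actual checkpoint capture and rollback are left out.

```python
from dataclasses import dataclass
from enum import Enum


class RestartPolicy(Enum):
    """Hypothetical encoding of the per-process restart information."""
    NONE = "none"        # restart nothing when this process fails
    SELF = "self"        # restart only the failed process
    RELATED = "related"  # restart the failed process and its related processes


@dataclass
class Process:
    name: str
    host: str                                  # computer the process runs on
    policy: RestartPolicy = RestartPolicy.SELF
    related: tuple = ()                        # names of related processes (assumed field)


def on_computer_fault(failed_host, server_host, processes):
    """Return the names of processes to restart after a computer fault."""
    if failed_host != server_host:
        # Fault in a client computer: no process is restarted.
        return []
    # Fault in the server computer: restart the server's processes
    # together with the processes related to them on other computers.
    to_restart = set()
    for proc in processes.values():
        if proc.host == server_host:
            to_restart.add(proc.name)
            to_restart.update(proc.related)
    return sorted(to_restart)


def on_process_fault(failed_name, processes):
    """Return the names of processes to restart after a process fault,
    following the restart information of the failed process."""
    proc = processes[failed_name]
    if proc.policy is RestartPolicy.NONE:
        return []
    if proc.policy is RestartPolicy.SELF:
        return [proc.name]
    return sorted({proc.name, *proc.related})
```

In this sketch a caller would pass the returned names to whatever mechanism rolls each process back to its latest distributed checkpoint; that mechanism is outside what the abstract specifies.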
REFERENCES:
patent: 4665520 (1987-05-01), Strom et al.
patent: 5008786 (1991-04-01), Thatte
patent: 5301309 (1994-04-01), Sugano
patent: 5333303 (1994-07-01), Mohan
patent: 5333314 (1994-07-01), Masai et al.
patent: 5590277 (1996-12-01), Fuchs et al.
patent: 5634096 (1997-05-01), Baylor et al.
patent: 5754752 (1998-05-01), Sheh et al.
patent: 5796934 (1998-08-01), Bhanot et al.
patent: 5802267 (1998-09-01), Shirakihara et al.
patent: 5819019 (1998-10-01), Nelson
patent: 5819022 (1998-10-01), Bandat
patent: 5845082 (1998-12-01), Murakami
patent: 5845292 (1998-12-01), Bohannon et al.
patent: 5911040 (1999-06-01), Hirayama et al.
patent: 5922078 (1999-07-01), Hirayama et al.
patent: 5923832 (1999-07-01), Shirakihara et al.
patent: 5931954 (1999-08-01), Hoshina et al.
patent: 5948112 (1999-09-01), Shimada et al.
patent: 5951694 (1999-09-01), Choquier et al.
Bhargava et al., "Independent Checkpointing and Concurrent Rollback for Recovery in Distributed Systems: An Optimistic Approach," IEEE, 1988.
"Checkpointing and Rollback-Recovery for Distributed Systems," IEEE Transactions, vol. SE-13, No. 1, (Jan. 1987), pp. 23-31.
Leu et al., "Concurrent Robust Checkpointing and Recovery in Distributed Systems," IEEE, 1988.
K. Mani Chandy, et al., "Distributed Snapshots: Determining Global States of Distributed Systems," ACM Transactions on Computer Systems, vol. 3, No. 1, (Feb. 1985), pp. 63-75.
Robert E. Strom, et al., "Optimistic Recovery in Distributed Systems," ACM Transactions on Computer Systems, vol. 3, No. 3, (Aug. 1985), pp. 204-266.
Hirayama Hideaki
Kanai Tatsunori
Sato Kiyoko
Shirakihara Toshio
Beausoliel, Jr. Robert W.
Hamdan Wasseem
Kabushiki Kaisha Toshiba