Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2007-04-17
Baderman, Scott (Department: 2114)
C714S012000, C714S006130
active
10651757
ABSTRACT:
A method and mechanisms for checkpointing objects, processes and other components of a multithreaded application program, based on the leader-follower strategy of semi-active or passive replication, where it is not possible to stop and checkpoint all of the threads of the object, process or other component simultaneously. Separate checkpoints are generated for the local state of each thread and for the data that are shared between threads and are protected by mutexes. The invention enables different threads to be checkpointed at different times in such a way that the checkpoints restore a consistent state of the threads between the existing replicas and a new or recovering replica, even though the threads operate concurrently and asynchronously. The checkpoint of the shared data is piggybacked onto regular messages along with ordering information that determines the order in which the mutexes are granted to the threads.
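The piggybacking idea in the abstract can be illustrated with a minimal, single-process sketch. It is not the patented mechanism: the class names (Message, Leader, RecoveringReplica), the choice to snapshot shared data at the moment a mutex is released, and the choice to carry per-thread checkpoints in the same message are all assumptions made for illustration. Threads are simulated by sequential calls so that the example stays deterministic.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Message:
    """A regular message with checkpoint data piggybacked onto it (hypothetical layout)."""
    payload: str
    # (mutex_name, thread_id) pairs, in the order the leader granted the mutexes.
    grant_order: List[Tuple[str, str]] = field(default_factory=list)
    # mutex_name -> snapshot of the shared data that mutex protects.
    shared_checkpoint: Dict[str, Any] = field(default_factory=dict)
    # thread_id -> snapshot of that thread's local state.
    thread_checkpoints: Dict[str, Any] = field(default_factory=dict)


class Leader:
    """Existing replica: decides the mutex grant order and produces checkpoints."""

    def __init__(self, shared: Dict[str, Any]) -> None:
        self.shared = shared  # mutex_name -> data protected by that mutex
        self._grants: List[Tuple[str, str]] = []
        self._shared_snaps: Dict[str, Any] = {}
        self._thread_snaps: Dict[str, Any] = {}

    def critical_section(self, mutex_name: str, thread_id: str,
                         update: Callable[[Any], None]) -> None:
        # Record who was granted the mutex, run the update, then snapshot
        # the protected data at release time (an assumption of this sketch).
        self._grants.append((mutex_name, thread_id))
        update(self.shared[mutex_name])
        self._shared_snaps[mutex_name] = copy.deepcopy(self.shared[mutex_name])

    def checkpoint_thread(self, thread_id: str, local_state: Any) -> None:
        # Each thread is checkpointed separately, at its own time.
        self._thread_snaps[thread_id] = copy.deepcopy(local_state)

    def send(self, payload: str) -> Message:
        # Piggyback everything collected since the previous message.
        msg = Message(payload, self._grants, self._shared_snaps, self._thread_snaps)
        self._grants, self._shared_snaps, self._thread_snaps = [], {}, {}
        return msg


class RecoveringReplica:
    """New or recovering replica: rebuilds its state from piggybacked checkpoints."""

    def __init__(self) -> None:
        self.shared: Dict[str, Any] = {}
        self.thread_state: Dict[str, Any] = {}
        self.grant_log: List[Tuple[str, str]] = []

    def receive(self, msg: Message) -> None:
        self.shared.update(copy.deepcopy(msg.shared_checkpoint))
        self.thread_state.update(copy.deepcopy(msg.thread_checkpoints))
        # The grant log is what a follower would consult to grant mutexes
        # to its own threads in the same order as the leader.
        self.grant_log.extend(msg.grant_order)


def demo() -> Tuple[Leader, RecoveringReplica]:
    leader = Leader({"m": {"count": 0}})
    replica = RecoveringReplica()

    def bump(data: Dict[str, int]) -> None:
        data["count"] += 1

    # Two simulated threads are checkpointed at different times.
    leader.critical_section("m", "T1", bump)
    leader.checkpoint_thread("T1", {"step": "after_bump"})
    replica.receive(leader.send("request-1"))

    leader.critical_section("m", "T2", bump)
    leader.checkpoint_thread("T2", {"step": "after_bump"})
    replica.receive(leader.send("request-2"))
    return leader, replica
```

In this sketch, calling demo() leaves the recovering replica with the same shared data as the leader and with a grant log listing T1 before T2, even though the two simulated threads were checkpointed in separate messages. A real implementation would also have to handle concurrency, message ordering and failure cases, none of which are modeled here.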
REFERENCES:
patent: 5257381 (1993-10-01), Cook
patent: 5440726 (1995-08-01), Fuchs et al.
patent: 5794034 (1998-08-01), Harinarayan et al.
patent: 5799146 (1998-08-01), Badovinatz et al.
patent: 5802265 (1998-09-01), Bressoud et al.
patent: 5802267 (1998-09-01), Shirakihara et al.
patent: 5941999 (1999-08-01), Matena et al.
patent: 5956489 (1999-09-01), San Andres et al.
patent: 6338147 (2002-01-01), Meth et al.
patent: 6539446 (2003-03-01), Chan
patent: 6928577 (2005-08-01), Moser et al.
patent: 2002/0032883 (2002-03-01), Kampe et al.
patent: 2005/0229035 (2005-10-01), Peleska et al.
Stallings, William; Operating Systems, Third Edition, published 1998, pp. 72, 276-277.
Y. Huang and C.M.R. Kintala; Software Implemented Fault Tolerance: Technologies and Experience, Proceedings of the IEEE 23rd International Symposium on Fault-Tolerant Computing, Toulouse, France, Jun. 1993, pp. 2-9.
J. Srouji, P. Schuster, M. Bach and Y. Kuzmin; A Transparent Checkpoint Facility on NT, Proceedings of the 2nd USENIX Windows NT Symposium, Seattle, WA, Aug. 1998, pp. 77-85.
M. Kasbekar and C.R. Das; Selective Checkpointing and Rollbacks in Multithreaded Distributed Systems, Proceedings of the IEEE 21st International Conference on Distributed Computing Systems, Mesa, AZ, Apr. 2001, pp. 39-46.
W. R. Dieter and J. E. Lumpp, Jr.; User-level checkpointing for LinuxThreads programs, Proceedings of the FREENIX Track, USENIX Annual Technical Conference, Boston, MA, Jun. 2001, pp. 81-92.
Melliar-Smith Peter M.
Moser Louise E.
Availigent Inc.
Baderman Scott
Lohn Joshua
O'Banion John P.
Consistent asynchronous checkpointing of multithreaded...