Reexamination Certificate
2005-09-07
2008-08-26
McCarthy, Christopher S (Department: 2113)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S035000
active
07418630
ABSTRACT:
A method for safepointing a system that includes receiving a stop command by an executing thread from a master, wherein the executing thread executes an operating system; continuing execution of the executing thread until a safepoint is reached after receiving the stop command; halting execution of the executing thread at the safepoint; and evaluating a response from the executing thread to diagnose the system.
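The steps named in the abstract (stop command, run to safepoint, halt, evaluate responses) can be illustrated with a minimal user-level sketch. This is not the patented implementation: it assumes ordinary Python threads rather than threads executing an operating system, and the names worker, safepoint_all, demo, the hang flag, and the 0.5 second timeout are all hypothetical choices made for the illustration.

```python
import queue
import threading
import time


def worker(name, stop, resume, responses, hang=False):
    """Run work units; the stop command is only honored at a safepoint."""
    while True:
        if hang:
            # Simulated fault (assumption): this thread never reaches a
            # safepoint. It exits on resume only so the demo can shut down.
            resume.wait()
            return
        time.sleep(0.01)  # one unit of work; state is consistent afterwards
        # Safepoint: the thread may be halted here.
        if stop.is_set():
            responses.put(name)  # report that the safepoint was reached
            resume.wait()        # halt until the master releases the thread
            return


def safepoint_all(names, stop, responses, timeout=0.5):
    """Master side: issue the stop command, then evaluate the responses."""
    stop.set()
    deadline = time.monotonic() + timeout
    expected = set(names)
    reached = set()
    while reached != expected:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            reached.add(responses.get(timeout=remaining))
        except queue.Empty:
            break
    # Threads that never responded are the ones to investigate.
    return sorted(expected - reached)


def demo():
    stop = threading.Event()
    resume = threading.Event()
    responses = queue.Queue()
    names = ["t0", "t1", "t2"]
    threads = [
        threading.Thread(
            target=worker, args=(n, stop, resume, responses, n == "t1")
        )
        for n in names
    ]
    for t in threads:
        t.start()
    suspects = safepoint_all(names, stop, responses)
    resume.set()  # release every halted thread
    for t in threads:
        t.join()
    return suspects
```

In this sketch a thread that fails to report within the timeout is treated as a suspect, which is one simple way a master could use the responses for diagnosis; the abstract itself does not say how the evaluation is done.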
REFERENCES:
patent: 6523059 (2003-02-01), Schmidt
patent: 6842853 (2005-01-01), Bush et al.
patent: 7013454 (2006-03-01), Bush et al.
patent: 7080374 (2006-07-01), Dahlstedt et al.
patent: 7086053 (2006-08-01), Long et al.
patent: 7114097 (2006-09-01), McLamb et al.
patent: 2001/0054057 (2001-12-01), Long et al.
patent: 2004/0148603 (2004-07-01), Baylis
patent: 2005/0289414 (2005-12-01), Adya et al.
patent: 2006/0294435 (2006-12-01), Vick et al.
patent: 2007/0136402 (2007-06-01), Grose et al.
Agarwal, Saurabh, Garg, Rahul, Gupta, Meeta S., and Moreira, Jose E., “Adaptive Incremental Checkpointing for Massively Parallel Systems”, ICS 2004, 10 pages.
Petrini, Fabrizio, Davis, Kei and Sancho, Jose Carlos, “System-Level Fault-Tolerance in Large-Scale Parallel Machines with Buffered Coscheduling”, 18th International Parallel and Distributed Processing Symposium (IPDPS'04), Apr. 26-30, 2004, 8 pages.
Elnozahy, E. N., Plank, J. S. and Fuchs, W.K., “Checkpointing for Peta-Scale Systems: A Look into the Future of Practical Rollback-Recovery”, IEEE Transactions on Dependable and Secure Computing, vol. 1, No. 2, Apr.-Jun. 2004, 12 pages.
Bronevetsky, Greg, Marques, Daniel, Pingali, Keshav and Stodghill, Paul, “Automated Application-level Checkpointing of MPI Programs”, Proceedings of the Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003, 11 pages.
Tuthill, B., Johnson, K., Wilkening, S. and Roe, D., “IRIX Checkpoint and Restart Operation Guide”, Silicon Graphics, Inc., Mountain View, CA, 1999, 63 pages.
Plank, James S., “An Overview of Checkpointing in Uniprocessor and Distributed Systems, Focusing on Implementation and Performance”, Technical Report of University of Tennessee, Jul. 1997, 20 pages.
Kavanaugh, Gerard P. and Sanders, William H., “Performance Analysis of Two Time-Based Coordinated Checkpointing Protocols”, In 1997 Pacific Rim International Symposium on Fault-Tolerant Systems, Dec. 1997, 8 pages.
Elnozahy, E.N., Alvisi, Lorenzo, Wang, Yi-Min and Johnson, David B., “A Survey of Rollback-Recovery Protocols in Message-Passing Systems”, Technical Report, Carnegie Mellon Univ., Oct. 1996, 42 pages.
Koo, Richard and Toueg, Sam, “Checkpointing and Rollback-Recovery for Distributed Systems”, IEEE Transactions on Software Engineering, vol. SE-13, No. 1, 1987, 9 pages.
Chandy, K. Mani and Lamport, Leslie, “Distributed Snapshots: Determining Global States of Distributed Systems”, In ACM Transactions on Computer Systems, 3(1), Aug. 1985, 13 pages.
Vick Christopher A.
Votta Lawrence G.
McCarthy Christopher S
Osha-Liang LLP
Sun Microsystems Inc.
Method and apparatus for computer system diagnostics using...