Main memory system and checkpointing protocol for fault-tolerant

Patent

Details

395488, G06F 1100

active

057519390

ABSTRACT:
A mechanism is provided for returning a computer system to a consistent, periodically updated state in main memory without constraining normal computer operation, thereby enabling the computer system to recover from faults without loss of data or processing continuity. In a typical computer system, a processor and input/output elements are connected to a main memory subsystem that includes a primary memory. A checkpoint memory element is appended to this main memory subsystem; it may include one or more buffer memories, including a read buffer and a write buffer, and an exclusive-or memory block. The exclusive-or memory block is a block of memory corresponding in size to one block of the primary memory that can fail as a unit, and it contains an exclusive-or of the contents of the primary memory at a previous checkpoint state. During normal processing, the checkpoint memory element may capture a pre-image, a post-image, or both, of data written to primary memory. In one embodiment, a write operation is converted to a read-then-write operation so that the pre-image data can be stored in a FIFO. An exclusive-or of the pre-image and post-image data is then exclusive-or'ed into the exclusive-or memory. When a fault occurs in the computer system, the data stored in the buffer memories, along with the contents of the exclusive-or memory block, are used to ensure that the primary memory is restored to a previous checkpointed state from which the computer system can recover. This structure and protocol can guarantee a consistent state in the primary memory, thus enabling fault-tolerant operation.
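The read-then-write capture and the exclusive-or block described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `CheckpointedMemory`, its method names, the word-per-list-element layout, and the choice to fold the pre-image/post-image deltas into the exclusive-or block at checkpoint time are all hypothetical and chosen for illustration. The abstract does not specify when the deltas are applied.

```python
from collections import deque
from functools import reduce
from operator import xor


class CheckpointedMemory:
    """Toy model: primary memory in equal blocks plus one XOR block.

    All names and the timing of XOR updates are illustrative assumptions.
    """

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.primary = [[0] * block_size for _ in range(num_blocks)]
        # XOR of all primary blocks as of the last checkpoint.
        self.xor_block = [0] * block_size
        # FIFO of (block, offset, pre_image, post_image) since last checkpoint.
        self.fifo = deque()

    def write(self, block: int, offset: int, value: int) -> None:
        # A write becomes read-then-write so the pre-image is captured.
        pre = self.primary[block][offset]
        self.primary[block][offset] = value
        self.fifo.append((block, offset, pre, value))

    def read(self, block: int, offset: int) -> int:
        return self.primary[block][offset]

    def checkpoint(self) -> None:
        # Fold each (pre XOR post) delta into the XOR block, then forget it.
        while self.fifo:
            _, offset, pre, post = self.fifo.popleft()
            self.xor_block[offset] ^= pre ^ post

    def rollback(self) -> None:
        # Undo writes newest-first so the oldest pre-image wins.
        while self.fifo:
            block, offset, pre, _ = self.fifo.pop()
            self.primary[block][offset] = pre

    def recover_block(self, failed: int) -> None:
        # Return surviving blocks to the checkpoint state, then rebuild the
        # failed block as XOR block XOR (all surviving blocks).
        self.rollback()
        survivors = [b for i, b in enumerate(self.primary) if i != failed]
        self.primary[failed] = [
            reduce(xor, (blk[i] for blk in survivors), self.xor_block[i])
            for i in range(self.block_size)
        ]
```

In this model, `rollback()` covers a fault that leaves primary memory readable, while `recover_block()` covers the loss of one whole block. In both cases primary memory ends up in the state recorded at the last `checkpoint()` call. Because the exclusive-or block is the size of a single primary block, the model can rebuild only one failed block at a time.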
