Reexamination Certificate
2000-06-16
2003-06-03
Le, Dieu-Minh (Department: 2184)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S013000
active
06574748
ABSTRACT:
FIELD OF THE INVENTION
This invention relates to computer central processors and, more particularly, to swapping physical processors when one is found defective, without having to reboot the operating system.
BACKGROUND OF THE INVENTION
As personal computers and workstations have become more powerful, makers of mainframe computers have undertaken to provide features that these smaller machines cannot readily match, in order to stay viable in the marketplace. One such feature may be broadly referred to as fault tolerance, which means the ability to withstand and promptly recover from hardware faults and other faults without the loss of crucial information. The central processing units (CPUs) of mainframe computers typically have error and fault detection circuitry, and sometimes error recovery circuitry, built in at numerous information transfer points in the logic to detect and characterize any fault that might occur.
The CPU(s) of a given mainframe computer comprise many registers, logically interconnected so that the CPU(s) can execute their characteristic repertoire of instructions. In this environment, genuinely fault-tolerant operation, in which recovery from a detected fault can be instituted at a point in a program immediately preceding the faulting instruction or operation, requires that one or more recent copies of all the software-visible registers (and of supporting information also subject to change) be maintained and constantly updated. This is typically done by repeatedly sending copies of the registers and supporting information (the safestore information) to a special, dedicated memory or memory section.
When a fault occurs and analysis determines that recovery is possible, the safestore information is used to reestablish the software-visible registers in the CPU with the contents they held shortly before the fault occurred, so that a restart can be instituted or tried from the corresponding place in program execution.
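The safestore-and-restart cycle described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the mechanism of any particular mainframe: the `Cpu`, `Safestore`, and `run` names are hypothetical, the register set is reduced to two registers, a frame is saved before every instruction, and the fault is injected artificially at a chosen instruction index.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Cpu:
    """Toy CPU (hypothetical): two software-visible registers and an instruction counter."""
    registers: Dict[str, int] = field(default_factory=lambda: {"A": 0, "Q": 0})
    ic: int = 0  # index of the next instruction to execute


class Safestore:
    """Dedicated area holding the most recent copy of the register state."""

    def __init__(self) -> None:
        self._frame: Optional[dict] = None

    def save(self, cpu: Cpu) -> None:
        # Copy rather than alias, so later register updates cannot alter the frame.
        self._frame = {"registers": dict(cpu.registers), "ic": cpu.ic}

    def restore(self, cpu: Cpu) -> None:
        if self._frame is None:
            raise RuntimeError("no safestore frame available")
        cpu.registers = dict(self._frame["registers"])
        cpu.ic = self._frame["ic"]


def run(program: List[Tuple[str, int]], cpu: Cpu, safestore: Safestore,
        fault_at: Optional[int] = None) -> Cpu:
    """Run (register, delta) instructions; inject one fault at index fault_at, then retry."""
    pending_faults = {fault_at} if fault_at is not None else set()
    while cpu.ic < len(program):
        safestore.save(cpu)              # assumption: one frame saved per instruction
        reg, delta = program[cpu.ic]
        cpu.registers[reg] += delta      # the instruction modifies a register...
        if cpu.ic in pending_faults:     # ...and then a fault is detected
            pending_faults.discard(cpu.ic)
            safestore.restore(cpu)       # roll back to the state just before the instruction
            continue                     # restart from the corresponding place
        cpu.ic += 1
    return cpu


if __name__ == "__main__":
    final = run([("A", 5), ("Q", 2), ("A", 1)], Cpu(), Safestore(), fault_at=1)
    print(final.registers, final.ic)
```

Because the frame is restored before the retry, the faulting instruction's partial effect on `Q` is undone and the instruction is applied exactly once. Without the restore, the retried instruction would add its delta a second time.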
Typically, when one processor in a data processing system fails, the best outcome is that only the process running on that processor is aborted. In many cases, including the case where the operating system (OS) had control of the processor when it crashed, the entire operating system crashes. When the system recovers, typically after a reboot, it runs in degraded mode, with the failed processor disabled until it can be replaced or repaired. Obviously, if this is the only processor in the data processing system, the system is down until the repair or replacement can be accomplished. In all cases, though, the loss of the failed processor results in degraded performance.
It would be advantageous, then, for a data processing system to be able to recover from the failure of a single processor. In particular, it would be advantageous if the data processing system could recover without losing any processes or any performance.
REFERENCES:
patent: 5327553 (1994-07-01), Jewett et al.
patent: 5408649 (1995-04-01), Beshears et al.
patent: 5495569 (1996-02-01), Kotzur
patent: 5627962 (1997-05-01), Goodrum et al.
patent: 5862312 (1999-01-01), Mann et al.
patent: 5983359 (1999-11-01), Nota et al.
patent: 5996089 (1999-11-01), Mann et al.
patent: 6115829 (2000-09-01), Slegel et al.
patent: 6158015 (2000-12-01), Klein
patent: 6189112 (2001-02-01), Slegel et al.
Andes Curtis D.
Andress Sidney L.
Rightnour Gerald E.
Smith James R.
Bull HN Information Systems Inc.
Hayden B. E.
Le Dieu-Minh
Solakion J. S.