Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2000-02-16
2003-04-22
Beausoliel, Robert (Department: 2184)
C714S048000, C714S010000
active
06553512
ABSTRACT:
TECHNICAL FIELD
The technical field relates generally to digital computer systems and more particularly, but not by way of limitation, to systems for detecting errors within the instructions processed in such computer systems.
BACKGROUND
A central processing unit (CPU) may stop making forward progress for various reasons. For example, a CPU deadlock may occur when code references memory that does not exist. In some systems, the memory controllers do not respond to such an erroneous reference, so the system deadlocks while waiting for data to return from the nonexistent memory. When a CPU deadlock occurs, there must be some mechanism for releasing the CPU from the deadlocked state.
One such mechanism is the triggering of a bus error to clear the deadlock. However, triggering a bus error substantially impacts the system by requiring the system to be restarted. In particular, triggering a bus error requires resetting the memory controllers. Triggering a bus error is expensive in terms of time and software required to fix the problem. A bus may have multiple CPUs, in which case all of them usually must be reset upon the triggering of a bus error.
What is needed is a method and an apparatus to resolve the CPU deadlock without triggering a bus error, if possible. In particular, what is needed is a method of attempting to resolve the CPU deadlock first through software, and then, if that attempt fails, invoking traditional methods of resolving the deadlock, such as triggering a bus error.
SUMMARY
A method is provided for handling errors that deadlock a CPU by first attempting to resolve the deadlock without issuing a bus error and without restarting the computer. If the deadlock cannot be resolved without issuing a bus error, then a bus error is issued and the computer attempts to restart itself. The method involves comparing the number of clock cycles taken to execute an instruction to a designated abort value. When the instruction has taken the full abort value of cycles but has not retired, a machine-check abort (MCA) is issued to attempt to resolve the deadlock. The method also involves comparing the number of clock cycles to a larger bus error value. If the MCA does not break the deadlock within a certain period (i.e., before the bus error value is reached), then a bus error is issued and the computer attempts to reset.
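The two-threshold escalation described above can be sketched as a small simulation. This is only an illustrative model, not the patented implementation: the function name `watch_instruction`, its parameters, and the `Action` enum are hypothetical, and the sketch assumes that a recovery landing exactly on the bus error value still counts as a recovery.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Action(Enum):
    MCA = "machine-check abort"
    BUS_ERROR = "bus error"


def watch_instruction(
    retire_cycle: Optional[int],
    abort_value: int,
    bus_error_value: int,
    mca_recovery_cycles: Optional[int] = None,
) -> List[Tuple[int, Action]]:
    """Simulate one instruction and return the (cycle, action) events raised.

    retire_cycle: cycle at which the instruction retires on its own,
        or None if it is deadlocked (hypothetical parameter).
    mca_recovery_cycles: cycles the MCA handler needs to break the
        deadlock, or None if the MCA cannot break it (hypothetical).
    """
    if not 0 < abort_value < bus_error_value:
        raise ValueError("abort value must be positive and below the bus error value")

    events: List[Tuple[int, Action]] = []
    recovered_at: Optional[int] = None

    for cycle in range(1, bus_error_value + 1):
        # The instruction retired normally: nothing more to do.
        if retire_cycle is not None and cycle >= retire_cycle:
            return events
        # The MCA broke the deadlock (assumption: a tie with the
        # bus error value is treated as a successful recovery).
        if recovered_at is not None and cycle >= recovered_at:
            return events
        # First threshold: try to resolve the deadlock with an MCA.
        if cycle == abort_value:
            events.append((cycle, Action.MCA))
            if mca_recovery_cycles is not None:
                recovered_at = cycle + mca_recovery_cycles
        # Second, larger threshold: fall back to a bus error.
        if cycle == bus_error_value:
            events.append((cycle, Action.BUS_ERROR))

    return events


if __name__ == "__main__":
    # Deadlocked instruction, MCA never succeeds: both actions fire.
    for cycle, action in watch_instruction(None, 100, 200):
        print(cycle, action.value)
```

In this model an instruction that retires before the abort value produces no events, a deadlock that the MCA clears in time produces only the MCA, and only an unrecovered deadlock escalates to the bus error.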
A computer system includes a CPU, a counter, and a software programmable register. The counter determines the number of clock cycles consumed during the execution of an instruction and stores that number in the register. The number of clock cycles taken to execute an instruction is compared to a designated abort value. When an instruction has taken the full abort value of cycles but has not retired, a machine-check abort (MCA) is issued to attempt to resolve the deadlock. The number of clock cycles is also compared to a larger bus error value. If the MCA does not break the deadlock within a certain period (i.e., before the bus error value is reached), then a bus error is issued and the CPU attempts to reset itself.
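As a rough software model of the counter and register arrangement, the sketch below advances a per-instruction cycle count one tick at a time and reports which signal, if any, is raised on each cycle. The class names `TimeoutRegister` and `DeadlockMonitor`, the field layout, and the choice to keep both thresholds alongside the count are assumptions made for illustration; the text above only states that the count is stored in a software programmable register.

```python
from dataclasses import dataclass


@dataclass
class TimeoutRegister:
    """Hypothetical layout for the software-programmable register."""
    abort_value: int       # cycles before an MCA is issued
    bus_error_value: int   # larger cycle count before a bus error is issued
    cycle_count: int = 0   # cycles consumed by the current instruction


class DeadlockMonitor:
    """Illustrative counter that compares the cycle count to both thresholds."""

    def __init__(self, register: TimeoutRegister) -> None:
        if not 0 < register.abort_value < register.bus_error_value:
            raise ValueError("abort value must be positive and below the bus error value")
        self.reg = register

    def tick(self) -> str:
        """Advance one clock cycle and return the signal raised on it."""
        self.reg.cycle_count += 1
        if self.reg.cycle_count == self.reg.bus_error_value:
            return "bus_error"
        if self.reg.cycle_count == self.reg.abort_value:
            return "mca"
        return "none"

    def retire(self) -> None:
        """The instruction retired, so the count restarts for the next one."""
        self.reg.cycle_count = 0


if __name__ == "__main__":
    monitor = DeadlockMonitor(TimeoutRegister(abort_value=3, bus_error_value=5))
    print([monitor.tick() for _ in range(5)])
```

Resetting the count on retirement is what keeps a healthy instruction stream from ever reaching either threshold; only an instruction that fails to retire accumulates enough cycles to raise a signal.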
Beausoliel Robert
Hewlett-Packard Development Company, L.P.
Wilson Yolanda L.
Method and apparatus for resolving CPU deadlocks