Cache thresholding method, apparatus, and program for...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S023000

active

06832329

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates to data processing and, in particular, to error detection and correction. Still more particularly, the present invention provides a method, apparatus, and program for predicting array bit line or driver failures.
2. Description of Related Art
A system may include an event scan that is invoked periodically for each processor in the system. The system may also include error correction circuitry (ECC) to resolve correctable single-bit errors (CE). Some of these correctable errors may be detected in processor caches. A CPU Guard function can be used to dynamically de-allocate a processor and cache that have an error. A Repeat Guard function can be used to de-allocate the resource during the boot process to ensure that the cache with the fault does not cause further errors until a customer engineer is able to fix the fault. The system may include field replaceable units (FRU), each of which includes a processor and cache. The customer engineer may fix a fault by replacing the FRU.
A cache may have array bit line or driver failures that cause correctable errors to be detected. Even though these errors are corrected and do not impact continued operation, it is desirable to detect when CEs are repeatedly caused by these types of cache faults. Prior art algorithms attempt to detect array bit line or driver failures by simply counting faults up to some threshold within a specified time period. With those algorithms, however, intermittent single-bit errors caused by random noise or by cosmic conditions may result in many false reports of bit line or driver failures.
In addition, some prior algorithms rely on the system rebooting periodically to reset error threshold counters. However, in today's business-critical computing environments, computer systems are not rebooted very often. This may cause random errors to accumulate and trigger false reports.
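The counting approach attributed to the prior art above can be illustrated with a minimal Python sketch. The class name `NaiveFaultCounter` and the `limit` and `window_seconds` parameters are hypothetical and chosen only for illustration; the source does not give specific thresholds or time periods.

```python
from collections import deque


class NaiveFaultCounter:
    """Illustrative prior-art style counter (hypothetical names and values).

    Reports a failure once `limit` correctable errors have been seen within
    `window_seconds`, without looking at syndrome or address.
    """

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.times = deque()  # timestamps of recent correctable errors

    def record(self, timestamp):
        """Record one correctable error; return True if the count trips."""
        self.times.append(timestamp)
        # Drop errors that have fallen out of the time window.
        while self.times and timestamp - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) >= self.limit
```

Because the counter ignores where each error occurred, three unrelated random errors inside one window trip it just as a real bit line fault would, which is the false-report problem described above.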
Thus, it would be advantageous to provide a cache thresholding method and apparatus for predictive reporting of array bit line or driver failures that does not generate false error reports because of random errors.
SUMMARY OF THE INVENTION
The present invention provides a mechanism for predicting cache array bit line or driver failures, which is faster and more efficient than counting all of the errors associated with a failure. This mechanism checks for five consecutive errors at different addresses within the same syndrome on invocation of periodic polling to characterize the failure. Once the failure is characterized, it is reported to the system for corrective maintenance including dynamic processor deconfiguration or preventive processor replacement.
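The check described in the summary might be sketched as follows in Python. This is only an illustration under stated assumptions, not the patented implementation: the names `CacheErrorTracker` and `record_poll` are hypothetical, and the sketch assumes that a poll with no error, an error with a different syndrome, or a repeated address breaks the run of consecutive errors. The summary itself states only the count of five, the requirement of different addresses, and the shared syndrome.

```python
THRESHOLD = 5  # "five consecutive errors" per the summary


class CacheErrorTracker:
    """Tracks a run of consecutive correctable errors for one cache."""

    def __init__(self):
        self.syndrome = None    # syndrome shared by the current run
        self.addresses = set()  # distinct addresses seen in the current run

    def record_poll(self, error):
        """Process one periodic poll.

        `error` is None when the poll found no correctable error, otherwise
        a (syndrome, address) tuple. Returns True once the run reaches
        THRESHOLD errors at different addresses with the same syndrome.
        """
        if error is None:
            # Assumption: a clean poll ends the run.
            self.syndrome, self.addresses = None, set()
            return False
        syndrome, address = error
        if syndrome != self.syndrome or address in self.addresses:
            # Assumption: a new syndrome or a repeated address starts a
            # fresh run beginning with this error.
            self.syndrome, self.addresses = syndrome, {address}
        else:
            self.addresses.add(address)
        return len(self.addresses) >= THRESHOLD
```

Feeding the tracker five polls that each report syndrome `0x1A` at a different address makes the fifth call return `True`; inserting a clean poll in the middle resets the run, so scattered random errors do not accumulate across long uptimes.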

