Reexamination Certificate
1999-12-14
2003-01-21
Baderman, Scott (Department: 2184)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S054000, C714S764000, C713S324000
active
06510528
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates in general to data processing systems and in particular to the data processing (computer) memory system. Still more particularly, the present invention relates to providing an error correction scheme to the memory system.
2. Description of the Related Art
It was discovered in the mid-1970s that random, unpredictable memory errors were caused by ionization trails left by the passage of “alpha particles.” Many improvements were made in materials technology that reduced the problem to an acceptable level. As the density of memory technology improved by several orders of magnitude, the size of the component parts decreased as well, and susceptibility to alpha particles and other subatomic particles increased.
The computer industry responded to this problem by incorporating a technique known as Error Correction Code (ECC). ECC corrects single bit errors in a memory location and detects multiple bit errors. Another technique used in conjunction with ECC is “scrubbing.” Scrubbing is the act of writing corrected data back to the memory location that experienced a single bit error. Scrubbing can be implemented either with hardware that automatically writes the corrected data back to the affected memory location or with software that reads and then writes a block of data when notified of one or more single bit errors. The purpose of scrubbing is to minimize the single bit errors present in memory (which ECC can correct) so that a memory location is not at risk of accumulating multiple bit errors that would cause an unrecoverable error. As long as the system is running and frequently accessing memory, these techniques have proven to work quite well.
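The correct-one, detect-two behavior and the write-back step described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a textbook extended Hamming code on 4-bit data words, which the text does not specify, and the names `encode`, `decode`, and `scrub` are hypothetical.

```python
# Illustrative SEC-DED sketch (assumption: extended Hamming code on 4-bit data).
# Bit 0 is the overall parity bit; bits 1..7 form a Hamming(7,4) codeword with
# parity bits at positions 1, 2, 4 and data bits at positions 3, 5, 6, 7.

def encode(nibble: int) -> int:
    """Encode 4 data bits into an 8-bit codeword."""
    bits = [0] * 8
    bits[3], bits[5], bits[6], bits[7] = [(nibble >> i) & 1 for i in range(4)]
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    bits[0] = sum(bits[1:]) & 1  # makes the parity of the whole word even
    return sum(b << i for i, b in enumerate(bits))


def decode(word: int):
    """Return (status, data, repaired_word) for an 8-bit codeword."""
    bits = [(word >> i) & 1 for i in range(8)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    odd_parity = sum(bits) & 1
    if syndrome == 0 and odd_parity == 0:
        status = "ok"
    elif odd_parity == 1:
        # Odd number of flips: treat as a single bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detected, not fixable.
        status = "uncorrectable"
    data = bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)
    return status, data, sum(b << i for i, b in enumerate(bits))


def scrub(memory: list) -> dict:
    """Read every word and write back any word that had a single bit error."""
    counts = {"ok": 0, "corrected": 0, "uncorrectable": 0}
    for addr, word in enumerate(memory):
        status, _, repaired = decode(word)
        if status == "corrected":
            memory[addr] = repaired  # the write-back is the "scrub"
        counts[status] += 1
    return counts
```

Calling `scrub` on a list of codewords repairs each word that holds one flipped bit in place and counts the words that hold two flipped bits, which can only be reported.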
In an effort to minimize power consumption while still providing rapid access to computer functions for users, a number of power saving initiatives have been launched in recent years in the personal computer industry. One of these initiatives that has been widely adopted is a standard known as Advanced Configuration and Power Interface (ACPI). This standard defines several states ranging from high power, high speed operation (S0 state) to total power off (S5 state). S0 is the normal running state, in which the Personal Computer (PC) can consume more than 50 watts of power; at S1, the CPU stop clock is switched off, which reduces power consumption to around 30 watts; at S2, the CPU is switched off; at S3, the PC is in a suspend to RAM state, consuming less than 5 watts; S4 is a suspend to disk state or “Soft Off,” in which zero watts of power are consumed; S5 is the “Off” state. Of interest to this invention are power states S1, S2, and S3.
In S3 state, the central processor unit, core chipset (memory controller and input/output controller) and all peripheral devices (such as disk drives and monitors) are shut down, drawing no power. The only components active in the system are the memory chips, which are in a low power self refresh state intended to preserve the contents of memory and allow the computer to resume rapidly when the user performs some overt action such as a keyboard input or mouse movement. In S2 state, the processor is powered down, and in S1 state, the processor still has power but is halted.
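The state descriptions above can be captured in a small lookup, which makes explicit which states leave memory contents live while the processor is idle. The sketch below only restates the text; the names `SleepState`, `AT_RISK`, and `scrub_needed_on_resume` are hypothetical and are not part of ACPI or of any library.

```python
from enum import Enum


class SleepState(Enum):
    """Power states as characterized in the text above."""
    S0 = "normal running state"
    S1 = "processor powered but halted"
    S2 = "processor powered down"
    S3 = "suspend to RAM, memory in low power self refresh"
    S4 = "suspend to disk"
    S5 = "off"


# States of interest: memory keeps its contents, but nothing is reading it.
AT_RISK = frozenset({SleepState.S1, SleepState.S2, SleepState.S3})


def scrub_needed_on_resume(state: SleepState) -> bool:
    """True when memory held data through a state with no memory accesses."""
    return state in AT_RISK
```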
In the above states, the ECC hardware and scrubbing functions that tend to prevent fatal multiple bit errors are ineffective, because data is not being fetched from memory for ECC to check, while the fundamental causes of many of these errors (subatomic particles) proceed at their natural pace.
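A rough model shows why unscrubbed time matters. The sketch below assumes particle upsets arrive at a single ECC word as independent Poisson events and treats any two upsets in one word between scrubs as unrecoverable. The rate used in the usage note is a made-up figure for illustration, not a measurement, and the function names are hypothetical.

```python
import math


def p_multi_bit(rate_per_hour: float, hours: float) -> float:
    """P(two or more upsets hit one word in `hours`) under a Poisson model."""
    mean = rate_per_hour * hours
    # 1 - P(0 upsets) - P(1 upset); expm1 keeps precision when mean is small.
    return -math.expm1(-mean) - mean * math.exp(-mean)


def p_multi_bit_with_scrub(rate_per_hour: float, hours: float,
                           scrub_interval_hours: float) -> float:
    """Same risk when every single bit error is cleared once per interval."""
    intervals = hours / scrub_interval_hours
    p_interval = p_multi_bit(rate_per_hour, scrub_interval_hours)
    return 1.0 - (1.0 - p_interval) ** intervals
```

For example, with a deliberately exaggerated rate of `1e-3` upsets per word per hour over 100 hours of sleep, `p_multi_bit(1e-3, 100)` can be compared with `p_multi_bit_with_scrub(1e-3, 100, 1)`. For small rates the scrubbed risk grows roughly in proportion to the scrub interval, which is the motivation for waking periodically.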
It would be desirable, therefore, to provide a method and apparatus that will enable a data processing system to minimize single bit errors in memory so as to prevent accumulation of multiple bit errors that will cause an unrecoverable error.
SUMMARY OF THE INVENTION
It is therefore one object of the present invention to provide a method and apparatus for changing state in a data processing system from S1, S2, or S3 state to S0 state.
It is another object of the present invention to provide a method and apparatus to initiate a memory scrubbing routine after the state of the data processing system has been changed from S1, S2, or S3.
It is yet another object of the present invention to provide a method and apparatus for detecting and correcting correctable memory errors.
The foregoing objects are achieved as is now described. A periodic system “wake-up” scheme, driven by a hardware timer, is implemented during S1, S2, or S3 states; alternatively, the scheme is invoked when the system is brought out of S1, S2, or S3 states. A memory scrubbing routine is initiated that reads out all memory locations and writes back any memory locations that have single bit (correctable) ECC errors. This procedure minimizes the chances of a multiple bit error buildup over time that may cause an unrecoverable error. The scrubbing routine is invoked whenever the system is brought out of S1, S2, or S3 state to ensure that there are no single bit errors present when full system operation is resumed.
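The wake-up scheme summarized above can be modeled as a toy simulation. This is a sketch under simplifying assumptions, not the claimed implementation: ECC is abstracted to "a word with exactly one flipped bit is correctable," the timer is counted in arbitrary ticks, and the class and method names (`SleepingSystem`, `upset`, `tick`, `resume`) are hypothetical.

```python
class SleepingSystem:
    """Toy model: each memory word tracks which of its bits are flipped."""

    def __init__(self, words: int, wake_interval: int):
        self.flipped = [set() for _ in range(words)]
        self.wake_interval = wake_interval  # timer period in ticks (assumed unit)
        self.state = "S3"
        self.ticks_asleep = 0
        self.unrecoverable = []  # addresses found with multiple bit errors

    def upset(self, addr: int, bit: int) -> None:
        """A particle strike flips one bit (a second hit flips it back)."""
        self.flipped[addr] ^= {bit}

    def scrub(self) -> int:
        """Read every word; write back those with one bad bit. Returns count."""
        corrected = 0
        for addr, bits in enumerate(self.flipped):
            if len(bits) == 1:
                bits.clear()  # corrected data written back
                corrected += 1
            elif len(bits) > 1 and addr not in self.unrecoverable:
                self.unrecoverable.append(addr)
        return corrected

    def tick(self) -> int:
        """One timer tick while asleep; wake, scrub, and sleep again on expiry."""
        self.ticks_asleep += 1
        if self.ticks_asleep % self.wake_interval != 0:
            return 0
        self.state = "S0"
        corrected = self.scrub()
        self.state = "S3"
        return corrected

    def resume(self) -> int:
        """User-initiated wake: scrub before full operation resumes."""
        self.state = "S0"
        return self.scrub()
```

In this model, a word that takes one upset before a timer wake and another after it is repaired twice and never becomes unrecoverable, whereas the same two upsets with no intervening wake leave the word in `unrecoverable` at resume.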
The above as well as additional objects, features, and advantages of the present invention will become apparent in the following detailed written description.
Freeman Joseph Wayne
Karpel Isaac
Springfield Randall Scott
Baderman Scott
Bracewell & Patterson LLP