System and method for detecting double-bit errors and for...

Error detection/correction and fault detection/recovery – Pulse or data error handling – Digital data error correction

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C714S785000

Reexamination Certificate

active

06453440

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to error detection and correction in electronic systems and, more particularly, to systems that employ error correction codes to facilitate detection and correction of bit errors.
2. Description of the Related Art
Error codes are commonly used in electronic systems to detect and correct data errors, such as transmission errors or storage errors. For example, error codes may be used to detect and correct errors in data transmitted via a telephone line, a radio transmitter, or a compact disc laser. Error codes may additionally be used to detect and correct errors associated with data stored in the memory of computer systems. One common use of error codes is to detect and correct errors of data transmitted on a data bus of a computer system. In such systems, error correction bits, or check bits, may be generated for the data prior to its transfer or storage. When the data is received or retrieved, the check bits may be used to detect and correct errors within the data.
Component failures are a common source of error in electrical systems. Faulty components may include faulty memory chips or faulty data paths provided between devices of a system. Faulty data paths can result from, for example, faulty pins, faulty data traces, or faulty wires.
Hamming codes are a commonly used type of error code. The check bits in a Hamming code are parity bits for portions of the data bits. Each check bit provides the parity for a unique subset of the data bits. If an error occurs (i.e. one or more of the data bits unintentionally change state), one or more of the check bits upon regeneration will also change state (assuming the error is within the class of errors covered by the code). By determining the specific bits of the regenerated check bits that changed state, the location of the error within the data may be determined. For example, if one data bit changes state, this data bit will cause one or more of the regenerated check bits to change state. Because each data bit contributes to a unique group of check bits, the check bits that are modified will identify the data bit that changed state. The error may be corrected by inverting the bit identified as being erroneous.
One common use of Hamming codes is to correct single bit errors within a group of data. Generally speaking, the number of check bits must be large enough such that 2
k
−1 is greater than or equal to n+k where k is the number of check bits and n is the number of data bits. Accordingly, seven check bits are typically required to implement a single error correcting Hamming code for 64 data bits. A single error correcting Hamming code is capable of detecting and correcting a single error.
FIGS. 1-3
illustrate an example of a system employing a single-error correction (SEC) Hamming code. In this example, four data bits (D
4
, D
3
, D
2
, and D
1
) are protected using three check bits (P
1
, P
2
, and P
3
). The parity generator
10
(
FIG. 1
) is used to encode the data block that contains the data bits and the check bits. The encoding process is performed prior to storing or communicating the data.
FIG. 2
shows an assignment of data bits to calculate the check bits. In this example, the check bit P
1
is generated by an XOR (exclusive OR) of the binary values in D
4
, D
3
, and D
1
. Similarly, the check bit P
2
is generated by an XOR of the binary values in D
4
, D
2
, and D
1
, and the check bit P
3
is generated by an XOR of the binary values in D
3
, D
2
and D
1
.
FIG. 3
shows the bit positions and the corresponding content of these positions within the encoded data block. The data block, which includes the data bits and the generated check bits, may then be stored in a memory chip or communicated over a data communication path.
At the point of receipt, the data block is retrieved and decoded. The decoding process involves performing a validity check on the received word, and executing an error correction technique if an error was detected. To check whether an error occurred in the storage (or transmission) of the data block, the check bits P
1
, P
2
, and P
3
are effectively regenerated using the received data, and each regenerated check bit is XORed with the corresponding received check bit to generate a corresponding syndrome bit.
FIG. 4
is a table depicting a manner in which these syndrome bits may be generated. More particularly, syndrome bit S
1
may be generated by XORing the received binary values in P
1
, D
4
, D
3
, and D
1
. If none of the received data bits (D
4
, D
3
, D
1
) is erroneous, the value of the received check bit P
1
is effectively XORed with itself, and the syndrome bit S
1
will be 0 (assuming the original check bit P
1
is not erroneous). If one of the data bits (D
4
, D
3
, D
1
) or the check bit P
1
is erroneous, the syndrome bit S
1
will be 1 (asserted), thus indicating an error. Syndrome bits S
2
and S
3
may be generated similarly. Taken collectively, the syndrome bits S
1
, S
2
and S
3
may be used to identify the location of an erroneous bit. For example, the binary value of the syndrome bits in the order [S
3
, S
2
, S
1
] indicates the position of the erroneous bit within the 7 bit data block as depicted in FIG.
3
. If the syndrome code is all zeros (i.e. “000”), the data has no single bit error. Upon identification of the erroneous bit position, the error is corrected by inverting the binary value in that position, i.e. from 0 to 1 or from 1 to 0.
It is a common practice to store data in, or communicate data through, multiple components. For example, a data block may be stored in a plurality of memory chips, or it may be communicated through a plurality of wires. An error may be introduced if one of the components is faulty. A Hamming code such as that described above may be used to address error correction in such systems.
For example, consider the case of storing D bits of data that are protected by C check bits using M memory chips. The data block therefore contains D+C bits. If the data block is to be evenly divided among the M memory chips, each memory chip will store X of the data and/or check bits of the data block, where X=(D+C)/M. The standard approach to providing error correction for chip failures is to divide the D+C data and check bits into X logical groups each including M bits, and assigning 1 bit from each chip to each of the groups. The check bits in each group form a SEC (single-error correcting) code such as a Hamming code. When any chip fails, it introduces at most one error into each group, and these errors are corrected independently using the SEC codes. If a Hamming code is used in each group, a total of C=X*L check bits are required, where L is the smallest integer such that 2{circumflex over ( )}L>M. This standard approach is inefficient because each group is able to independently identify which bit (if any) within the group is in error. However, if the only failures considered are memory chip failures, the failures in different groups are highly correlated.
In some systems, in addition to correcting single-bit errors due to component failures, it may also be desirable to detect any double-bit errors that may occur. The standard approach is to evenly divide the data block among the memory chips in the manner as described above, and to generate check bits for each group which form an SEC-DED (single-error correcting, double-error detecting) code such as an extended Hamming code. When any chip fails, it introduces at most one error into each group, and these errors are corrected independently using the SEC-DED codes. When two arbitrary bits are in error, they are either corrected (if they lie in different groups) or are detected (if they lie in the same group). If an extended Hamming code is used in each group, a total of C=X*L check bits are required, where L is the smallest integer such that 2{circumflex over ( )}(L−1)>M. Similar to the foregoing discussion, however, the use of extended Hamming codes in suc

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

System and method for detecting double-bit errors and for... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with System and method for detecting double-bit errors and for..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and method for detecting double-bit errors and for... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2882280

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.