Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2001-03-26
2004-12-21
Beausoliel, Robert (Department: 2113)
C714S048000, C714S052000, C714S700000, C714S701000, C710S106000
active
06834362
ABSTRACT:
BACKGROUND
1. Field of the Invention
The present invention relates to detecting communications errors between functional units in a computer system. More specifically, the present invention relates to an apparatus and a method for detecting errors on a source-synchronous bus within a computer system.
2. Related Art
It is essential for the various functional units of a computing system to communicate with each other in order for the computing system to perform its assigned tasks. Traditionally, these functional units, which include the central processing unit, memory, I/O devices, and the like, are coupled together by a bus structure. When a first functional unit needs to communicate with a second functional unit, the first functional unit typically requests access to the bus from a bus master. The bus master then grants the first functional unit exclusive access to the bus for a bus transaction. During the transaction, the bus is not available to the other functional units.
In older, slower computing systems, a global clock signal is distributed to each of the functional units. Typically, within each functional unit, the global clock signal is regenerated using circuitry such as a phase-locked loop. Regenerating this global clock signal removes any noise on the clock signal, thereby virtually eliminating errors in data transferred between functional units attributable to the clock signal.
More modern computing systems, however, operate at higher clock rates, which causes problems when a global clock signal is used. Since the global clock signal arrives at a functional unit by a different route than the data signals, the clock signal can be offset in time from the data signals. At these higher clock rates, the margins for error are much smaller than the margins at lower clock rates.
In an effort to alleviate the problems associated with global clock signals, designers have developed source-synchronous buses. A source-synchronous bus differs from older bus systems in that the clock signal is routed along with the other data signals between the source and the destination. Care is taken to provide the same path lengths and environment for the clock as for the data. Source-synchronous buses, therefore, essentially eliminate any offset in time between the clock signal and the associated data signals.
However, a source-synchronous bus introduces problems of its own. Unlike the global clock signal used in previous structures, the clock signal on a source-synchronous bus is not regenerated; it is used as received. Since this clock signal is subject to the same environment as the data signals, it is subject to the same kinds of errors as the data signals.
Moreover, errors on the clock signal are more catastrophic than similar errors on a data signal, because a faulty clock signal affects all of the data signals, whereas a faulty data signal affects only that one signal. This makes error detection on a source-synchronous bus difficult. For example, a parity bit can detect a single-bit error. A clock error, however, can corrupt several bits at once, and a parity bit cannot detect an even number of bit errors.
Furthermore, an error correcting code typically corrects a single-bit error and detects most multiple-bit errors. However, if the clock signal is faulty, it is probable that more than one data bit would be affected, thereby negating the advantage of using the error correcting code.
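The limitation of a parity bit described above can be illustrated with a small sketch. It is not taken from the patent; the function names (`parity_bit`, `parity_check`, `flip_bits`) and the 8-bit example word are assumptions chosen for illustration. The sketch shows that a single flipped bit changes the parity, while two flipped bits leave it unchanged, which is the situation a clock fault tends to produce.

```python
def parity_bit(bits):
    """Return the even-parity bit for a list of 0/1 values."""
    return sum(bits) % 2


def parity_check(bits, expected):
    """Return True when the received bits still match the transmitted parity bit."""
    return parity_bit(bits) == expected


def flip_bits(bits, positions):
    """Return a copy of bits with the given positions inverted (hypothetical fault model)."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(bits)]


word = [1, 0, 1, 1, 0, 0, 1, 0]   # illustrative data word
sent_parity = parity_bit(word)

one_error = flip_bits(word, {3})       # a single corrupted data line
two_errors = flip_bits(word, {3, 5})   # two lines corrupted at once

single_error_detected = not parity_check(one_error, sent_parity)
double_error_detected = not parity_check(two_errors, sent_parity)
```

Under these assumptions the single-bit error is flagged and the double-bit error passes the check unnoticed.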
What is needed is an apparatus and a method for detecting errors on a source-synchronous bus while maintaining the high throughput associated with a source-synchronous bus.
SUMMARY
One embodiment of the present invention provides a system for detecting errors on a source-synchronous bus. The source-synchronous bus includes a plurality of data lines and a clock line. A transmitting mechanism configured to transmit data on the source-synchronous bus is coupled to the source-synchronous bus. A receiving mechanism configured to receive data from the source-synchronous bus is also coupled to the source-synchronous bus. An error detecting mechanism configured to detect errors on the source-synchronous bus is coupled to the receiving mechanism. The error detecting mechanism can detect errors on the plurality of data lines including errors that are caused by an error on the clock line.
In one embodiment of the present invention, the system includes a grouping mechanism coupled to the transmitting mechanism that is configured to group data bits into an error group. The system also includes a detection code generating mechanism coupled to the grouping mechanism that is configured to generate a detection code for the error group. The transmitting mechanism is further configured to transmit the detection code on the source-synchronous bus using a clock cycle other than the clock cycle used for the error group.
In one embodiment of the present invention, the detection code is a parity bit.
In one embodiment of the present invention, the detection code is an error correcting code.
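As a rough sketch of sending the detection code on a different clock cycle than its error group, the following model uses a parity bit as the detection code and delays it by exactly one cycle. The one-cycle delay, the `(data, check)` word layout, and the names `transmit` and `receive` are all assumptions made for illustration; the text above only requires that the code travel on a cycle other than the one used for the group.

```python
def parity(bits):
    """Even-parity bit over a list of 0/1 values."""
    return sum(bits) % 2


def transmit(groups):
    """Build one bus word per clock cycle as (data_bits, check_bit).

    The check bit carried on cycle t covers the error group sent on cycle t - 1
    (assumed delay of one cycle). A trailing cycle carries the last check bit.
    """
    words = []
    previous_check = None
    for group in groups:
        words.append((list(group), previous_check))
        previous_check = parity(group)
    words.append((None, previous_check))
    return words


def receive(words):
    """Pair each error group with the check bit that arrives one cycle later.

    Returns a list of (group, ok) tuples, where ok is False on a parity mismatch.
    """
    results = []
    pending = None
    for data, check in words:
        if pending is not None:
            results.append((pending, parity(pending) == check))
        pending = data
    return results


groups = [[1, 0, 1, 1], [0, 0, 1, 0]]
clean = receive(transmit(groups))
```

Because the group and its check bit are sampled on different clock edges in this model, a fault confined to one cycle cannot corrupt both at once.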
In one embodiment of the present invention, the grouping mechanism is configured to skew data bits within the error group across time.
In one embodiment of the present invention, skewing data bits across time includes delaying each data bit based on the position of the data bit within the error group.
In one embodiment of the present invention, the system provides a gathering mechanism coupled to the receiving mechanism that is configured to de-skew the data bits within the error group.
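The skewing and de-skewing embodiments above can be sketched as follows. This is an interpretation under stated assumptions, not the patented circuit: bit i of each error group is delayed by i clock cycles, the parity bit is treated as the last bit of the group, and a clock fault is modeled as every line being inverted on one cycle. The names `skew`, `deskew`, `clock_fault`, and `parity_ok` are hypothetical.

```python
def with_parity(bits):
    """Append an even-parity bit so the whole group sums to an even number."""
    return list(bits) + [sum(bits) % 2]


def skew(groups):
    """Delay bit i of every group by i cycles.

    Returns cycles[t][line]; None marks a slot where the line is idle.
    """
    width = len(groups[0])
    cycles = [[None] * width for _ in range(len(groups) + width - 1)]
    for g, group in enumerate(groups):
        for i, bit in enumerate(group):
            cycles[g + i][i] = bit
    return cycles


def deskew(cycles, n_groups):
    """Undo skew(): gather bit i of group g from cycle g + i."""
    width = len(cycles[0])
    return [[cycles[g + i][i] for i in range(width)] for g in range(n_groups)]


def clock_fault(cycles, t):
    """Assumed fault model: a bad clock edge on cycle t inverts every active line."""
    out = [row[:] for row in cycles]
    out[t] = [None if b is None else b ^ 1 for b in out[t]]
    return out


def parity_ok(group):
    """True when a group (data plus parity bit) has even parity."""
    return sum(group) % 2 == 0


payload = [[1, 0, 1], [0, 1, 1], [1, 1, 1], [0, 0, 0]]
groups = [with_parity(p) for p in payload]

# Without skew, a fault on cycle 1 flips all four bits of group 1: parity still passes.
unskewed = clock_fault(groups, 1)
unskewed_flags = [parity_ok(g) for g in unskewed]

# With skew, the same kind of fault touches one bit in each of several groups.
skewed = deskew(clock_fault(skew(groups), 3), len(groups))
skewed_flags = [parity_ok(g) for g in skewed]
```

In this model the unskewed fault goes undetected because an even number of bits in one group is inverted, whereas after skewing each affected group contains exactly one bad bit and fails its parity check.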
REFERENCES:
patent: 5784393 (1998-07-01), Byers et al.
patent: 6178206 (2001-01-01), Kelly et al.
patent: 6209072 (2001-03-01), MacWilliams et al.
patent: 6622256 (2003-09-01), Dabral et al.
patent: 6697974 (2004-02-01), Craft
patent: 6704890 (2004-03-01), Carotti et al.
patent: 2002/0087921 (2002-07-01), Rodriguez
patent: 2002/0157062 (2002-10-01), Greiner
patent: 2002/0174390 (2002-11-01), Craft
Beausoliel Robert
Grundler Edward J.
Manoskey Joseph D
Park Vaughan & Fleming LLP
Sun Microsystems Inc.