Reexamination Certificate
1998-08-17
2001-08-07
Beausoliel, Jr., Robert W. (Department: 2184)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S006130
active
06272651
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This disclosure relates to a computer and, more particularly, to a system interface unit which improves processor read latency in a computer system employing data integrity functionality such as error checking and correction (ECC).
2. Description of the Related Art
Modern computers are called upon to execute instructions and transfer data at increasingly higher rates. Many computers employ CPUs which operate at clocking rates exceeding several hundred MHz, and further have multiple buses connected between the CPUs and numerous input/output devices. The buses may have dissimilar protocols depending on which devices they link. For example, a CPU local bus connected directly to the CPU preferably transfers data at a faster rate than a peripheral bus connected to slower input/output devices. A mezzanine bus may be used to connect devices arranged between the CPU local bus and the peripheral bus. The peripheral bus can be classified as, for example, an industry standard architecture (“ISA”) bus, an enhanced ISA (“EISA”) bus or a microchannel bus. The mezzanine bus can be classified as, for example, a peripheral component interconnect (“PCI”) bus to which higher speed input/output devices may be connected.
Coupled between the various buses are bus interface units. According to somewhat known terminology, the bus interface unit coupled between the CPU bus and the mezzanine bus is often termed the “north bridge”. Similarly, the bus interface unit between the PCI bus and the peripheral bus is often termed the “south bridge”.
The north bridge, henceforth termed a system interface unit, serves to link specific buses within the hierarchical bus architecture. Preferably, the system interface unit couples data, address and control signals forwarded between the CPU local bus, the PCI bus and the memory bus. Accordingly, the system interface unit may include various buffers and/or controllers situated at the interface of each bus linked by the interface unit. In addition, the system interface unit may transfer data to/from a dedicated graphics bus, and therefore may include an advanced graphics port (“AGP”). As a host device, the system interface unit may support both the PCI graphics transfers on the AGP (e.g., graphics-dedicated transfers associated with PCI, henceforth referred to as a graphics component interface, or “GCI”), as well as AGP extensions to the PCI protocol.
The reliability of data transfers between devices on the CPU local bus, mezzanine bus, peripheral bus, and main memory is of paramount concern to designers and users of computer systems. There are several types of data errors that may occur. For example, soft errors which are non-permanent and hard errors which usually are permanent may occur in the main memory devices. Soft errors are usually caused by radiation-induced switching of a bit in a memory cell, and usually this type of error causes no lasting damage to the memory cell. Hard errors are due to the unexpected deterioration or destruction of one or more memory cells. Static discharge or deterioration over time are often the source of hard errors. Generally, hard errors result in the defective memory device being replaced. Also, errors may occur during data transfers due to switching noise on the data bus, etc.
Data errors can have a devastating effect on the operation of the computer system, regardless of whether the error is a hard or soft memory error or a transfer error. Erroneous data or a bad program instruction may result. These errors cannot be tolerated in some systems, such as servers which supply information to other computers and may be part of a critical system, like a banking system. To avoid the problems caused by memory errors, computer system designers often implement error checking within the system interface unit.
Parity checking is a common means of error checking. Parity checking involves, for example, storing a bit with every byte of information that indicates the internal consistency of that byte. Generally this is as simple as determining if there is an odd number of ones in the byte. Every time a byte is accessed, the parity bit is checked in the system interface unit to determine if the byte is consistent with the parity indication. If a parity error is found, system operation is usually halted since the results could be catastrophic.
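The per-byte parity check described above can be illustrated with a minimal Python sketch. The function names `parity_bit` and `check_parity` are hypothetical and chosen only for this illustration; the sketch assumes the stored bit is 1 when the byte contains an odd number of ones, and it is not taken from the patent.

```python
def parity_bit(byte: int) -> int:
    """Return 1 if the byte holds an odd number of one bits, else 0.

    Hypothetical helper: assumes the stored parity bit records whether
    the count of ones in the byte is odd.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    return bin(byte).count("1") & 1


def check_parity(byte: int, stored_parity: int) -> bool:
    """Return True if the byte is consistent with its stored parity bit."""
    return parity_bit(byte) == stored_parity


# Store a byte together with its parity bit.
original = 0b10110010
stored = parity_bit(original)

# A single flipped bit changes the count of ones from even to odd
# (or vice versa), so the check fails.
single_error = original ^ 0b00000100
single_detected = not check_parity(single_error, stored)

# Two flipped bits leave the oddness of the count unchanged, so the
# check passes even though the data is wrong.
double_error = original ^ 0b00000110
double_detected = not check_parity(double_error, stored)
```

Note that parity only detects an odd number of flipped bits and cannot say which bit is wrong, which is why a detected parity error typically leaves the system no option but to halt.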
However, many computer systems are used in applications that cannot tolerate system operation being halted. A technique that is used in systems that cannot tolerate being shut down by parity errors is to store an error checking and correction (ECC) code with each word (double word or quad word) in memory. The ECC allows single bit errors, which would normally cause a parity error, to be detected and corrected without affecting system operation, and allows multiple bit errors to be detected. Typical ECC systems only correct single bit errors. If a multiple bit error is detected, it is treated as a parity error and system operation may be interrupted. Often, if single bit errors are frequently being detected (and corrected) in the same memory area, it is an indication that more serious memory failures may soon occur in that memory area.
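To make the single-error-correcting, double-error-detecting behavior concrete, the following Python sketch uses an extended Hamming (8,4) code on a 4-bit value. This is an assumption made purely for illustration: the text does not say which ECC code is used, and a real memory system would protect a much wider word. The names `encode`, `decode` and `DATA_POSITIONS` are hypothetical.

```python
from typing import List, Optional, Tuple

# Hamming positions 1..7 hold the code; positions 1, 2 and 4 are check
# bits and the remaining positions carry data. Index 0 holds an overall
# parity bit that enables double-error detection.
DATA_POSITIONS = (3, 5, 6, 7)


def encode(nibble: int) -> List[int]:
    """Encode a 4-bit value into an 8-bit extended Hamming codeword."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a value in the range 0..15")
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    for p in (1, 2, 4):
        # Check bit p covers every position whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p) & 1
    code[0] = sum(code[1:]) & 1
    return code


def decode(code: List[int]) -> Tuple[Optional[int], str]:
    """Return (value, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of the positions of all set bits
    overall_odd = sum(code) & 1  # 1 if an odd number of bits flipped
    fixed = list(code)
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        # Single-bit error: the syndrome is the position of the bad bit
        # (0 means the overall parity bit itself was hit).
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with consistent overall parity: two bits flipped.
        return None, "uncorrectable"
    value = sum(fixed[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return value, status
```

In this sketch a single flipped bit is repaired silently, while two flipped bits are reported as uncorrectable, mirroring the case in which a multiple bit error is treated like a parity error.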
To implement parity or ECC, the system interface unit typically includes a data integrity functional unit. All data transfers from main memory pass through the data integrity unit within the system interface unit. For example, when data is written to main memory by a processor or a PCI master, the data integrity unit generates error information (e.g., ECC checkbits) for that data to be stored in main memory along with the data. If a processor requests a read from main memory, the system interface unit will receive the read data and error information bits associated with the read data. The system interface unit will then perform the data integrity function (e.g., parity or ECC) on the requested read data and error information. After the read data and error information have passed through the data integrity unit, the checked and/or corrected read data will then be forwarded by the system interface unit to the processor bus in order to satisfy the processor read request. Similarly, read data and error information will pass through the data integrity unit before being forwarded to the PCI bus to satisfy a PCI read request from main memory.
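The write and read paths described above can be sketched as a small Python model. Everything here is a simplifying assumption for illustration: the class names `DataIntegrityUnit`, `SystemInterfaceUnit` and `IntegrityError` are hypothetical, main memory is modeled as a dictionary, and one parity bit per byte stands in for whatever error information a real unit would generate.

```python
from typing import Dict, Tuple


class IntegrityError(Exception):
    """Raised when read data does not match its stored error information."""


class DataIntegrityUnit:
    """Hypothetical model: one parity bit per byte as the error information."""

    @staticmethod
    def generate(data: bytes) -> bytes:
        # One check value (0 or 1) per data byte.
        return bytes(bin(b).count("1") & 1 for b in data)

    @staticmethod
    def check(data: bytes, error_info: bytes) -> bool:
        return DataIntegrityUnit.generate(data) == error_info


class SystemInterfaceUnit:
    """Hypothetical model of the data path between a requester and main memory."""

    def __init__(self) -> None:
        # address -> (data, error information), as stored in main memory
        self.main_memory: Dict[int, Tuple[bytes, bytes]] = {}

    def write(self, address: int, data: bytes) -> None:
        # Write path: generate error information, then store both.
        data = bytes(data)
        self.main_memory[address] = (data, DataIntegrityUnit.generate(data))

    def read(self, address: int) -> bytes:
        # Read path: the data passes through the integrity check before
        # it is forwarded to the requesting bus.
        data, error_info = self.main_memory[address]
        if not DataIntegrityUnit.check(data, error_info):
            raise IntegrityError(f"data error at address {address:#x}")
        return data


siu = SystemInterfaceUnit()
siu.write(0x1000, b"\x12\x34\x56\x78")
read_back = siu.read(0x1000)
```

Because every read in this model goes through `check` before the data is returned, the check sits directly on the read path, which is where any extra time for the data integrity function would be spent.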
The advantage of providing data integrity functionality in the system interface unit is that data reliability is greatly enhanced for data transfers between the processor bus or peripheral buses and main memory. However, a disadvantage of including a data integrity function in the system interface data path is that data transfer latency may be increased. For example, a processor read from main memory may take one or more additional clock cycles to perform because it may take one or more clock cycles to perform the data integrity function on the data in the system interface unit before data can be passed to the processor bus. Mezzanine and peripheral bus read latencies may be similarly increased. Write latencies may also be increased because error information must be generated for the write data. Generating the error information (e.g., checkbits or parity bits) may take one or more clock cycles before the write data and error information can be written to main memory by the system interface unit.
The increased latency resulting from the data integrity function in the system interface unit for mezzanine and peripheral bus reads may not be that harmful in most computer systems since mezzanine and peripheral devices are often slower devices that can easily tolerate an additional clock period added to their read latency. Similarly, the increased write latency resulting from the data integrity function in the system interface unit may not be that harmful in most computer systems since writes may usually be posted thus freeing the CPU or other bus.
However, processor read latency is often critical. Modern processors operate at extremely fast cycle rates. Usually, a processor cannot continue operation until a
Chin Kenneth T.
Coffee Clarence Kevin
Collins Michael J.
Johnson Jerome J.
Jones Phillip M.
Beausoliel, Jr. Robert W.
Bonzo Bryce P.
Compaq Computer Corp.
Conley Rose & Tayon PC
Kowert Robert C.