Electrical computers and digital data processing systems: input/ – Intrasystem connection – Bus access regulation
Reexamination Certificate
1998-09-30
2001-04-10
Kim, Kenneth S. (Department: 2183)
C710S108000, C710S120000, C711S146000
active
06216190
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a computer and, more particularly, to a bus interface unit which can stall read and non-postable write cycles issued to a peripheral bus until the peripheral bus becomes available or a cycle occurs to system memory. If a cycle is issued to system memory, then the bus interface unit either defers or retries the prior cycles to the peripheral bus after the cycle to system memory completes.
2. Description of the Related Art
Modern computers are called upon to execute instructions and transfer data at increasingly higher rates. Many computers employ CPUs which operate at clocking rates exceeding several hundred MHz, and further have multiple busses connected between the CPUs and numerous input/output devices. The busses may have dissimilar protocols depending on which devices they link. For example, a CPU local bus connected directly to the CPU preferably transfers data at a faster rate than a peripheral bus connected to slower input/output devices. A mezzanine bus may be used to connect devices arranged between the CPU local bus and the peripheral bus. The peripheral bus can be classified as, for example, an industry standard architecture (“ISA”) bus, an enhanced ISA (“EISA”) bus or a microchannel bus. The mezzanine bus can be classified as, for example, a peripheral component interconnect (“PCI”) bus to which higher speed input/output devices can be connected.
Coupled between the various busses are bus interface units. According to somewhat known terminology, the bus interface unit coupled between the CPU bus and the PCI bus is often termed the “north bridge”. Similarly, the bus interface unit between the PCI bus and the peripheral bus is often termed the “south bridge”.
The north bridge, henceforth termed a bus interface unit, serves to link specific busses within the hierarchical bus architecture. Preferably, the bus interface unit couples data, address and control signals forwarded between the CPU local bus, the PCI bus and the memory bus. Accordingly, the bus interface unit may include various buffers and/or controllers situated at the interface of each bus linked by the interface unit. In addition, the bus interface unit may receive data from a dedicated graphics bus, and therefore may include an advanced graphics port (“AGP”). As a host device, the bus interface unit may be called upon to support both the PCI portion of the AGP (or graphics-dedicated transfers associated with PCI, henceforth referred to as a graphics component interconnect, or “GCI”), as well as AGP extensions to the PCI protocol.
There are numerous tasks performed by the bus interface unit. For example, the bus interface unit must orchestrate timing differences between a faster CPU local bus and a slower mezzanine bus, such as a PCI bus. The bus interface unit should also give priority to certain types of transfers. For example, a cycle initiated by the CPU to memory must, in most instances, be completed quickly. If not, the processor-to-memory queue may not be optimally filled and instructions may not be expeditiously executed.
One mechanism by which to account for timing differences involves, for example, stalling cycles within the CPU local bus to allow the peripheral bus to catch up. This, however, penalizes CPU throughput and should be used only sparingly and judiciously. Stalling the CPU bus typically occurs during a particular transaction phase of the CPU bus pipeline. It is noted that modern CPUs utilize an extensive pipeline which can store multiple cycles of multiple transactions upon the CPU local bus. For example, a Pentium® Pro processor bus includes a decoupled, 12-stage super pipelined implementation. A transaction relating to a single bus request can sequentially pipeline through numerous phases: arbitration, request, error, snoop, response and data transfer.
Stalling the CPU local bus generally involves stalling one or more cycles in the snoop phase. This allows the earlier phases to receive cycles and have those cycles available in the snoop phase. If called upon, those cycles can be released in a timely fashion to the subsequent response and data transfer phases.
FIG. 3 illustrates a timing diagram of exemplary transaction phases of the Pentium® Pro processor bus. In the example shown, a cycle 8a of a first transaction 8 requires approximately three bus clock cycles to obtain mastership of the CPU local bus. Approximately two clock cycles later, cycle 8a proceeds from the arbitration phase to a request phase 8b. As shown, the cycle 8c begins in the error phase approximately three clocks after the request phase. Cycle 8d occurs approximately four clocks after the request phase or approximately three clocks after the previous transaction snoop cycle, whichever is later. The cumulative number of clock cycles needed to place a transaction within the snoop phase is shown to be approximately ten clock cycles, in the example provided. Of course, as transaction 8 progresses to the snoop phase, a cycle 9d of another transaction 9 can subsequently arrive in the snoop phase as well.
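The snoop-phase timing rule described for FIG. 3 can be illustrated with a minimal sketch. The Python below is an illustration only: the function names are hypothetical, and the clock offsets are assumptions taken from the approximate figures quoted in the text (four clocks from request to snoop, three clocks between consecutive snoop cycles). Actual bus timing is governed by the processor bus specification.

```python
# Illustrative only: offsets follow the approximate clock counts quoted in the text.
REQUEST_TO_SNOOP = 4  # snoop occurs ~4 clocks after the request phase...
SNOOP_TO_SNOOP = 3    # ...or ~3 clocks after the previous snoop, whichever is later


def snoop_clock(request_clock, prev_snoop_clock=None):
    """Return the earliest clock at which a transaction can enter the snoop phase."""
    earliest = request_clock + REQUEST_TO_SNOOP
    if prev_snoop_clock is not None:
        # A transaction cannot snoop sooner than 3 clocks after its predecessor.
        earliest = max(earliest, prev_snoop_clock + SNOOP_TO_SNOOP)
    return earliest


def schedule_snoops(request_clocks):
    """Compute snoop clocks for transactions whose requests are issued in order."""
    snoops = []
    prev = None
    for req in request_clocks:
        prev = snoop_clock(req, prev)
        snoops.append(prev)
    return snoops
```

With requests at clocks 5 and 6, for instance, the second transaction's snoop is set by the spacing from the first transaction's snoop rather than by its own request clock.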
If the first transaction 8 is initiated from the CPU to a peripheral device as its final destination, then it may be necessary to delay the transaction in the snoop phase to allow the peripheral bus to clear and/or data upon the peripheral bus to become available. For example, if a transaction preceding the first transaction 8 is a non-postable write to the peripheral device, then it is necessary that the peripheral device and the peripheral bus become available before data of transaction 8 is presented upon the bus. Alternatively, if transaction 8 is a read transaction, it is necessary that the data to be read from the peripheral device be present on the peripheral bus before the local CPU bus can transfer that data during the data transfer phase. For at least these reasons, cycles within the CPU bus destined for a slower peripheral bus must occasionally be stalled in the snoop phase of the CPU bus until the peripheral bus clears and/or data therein is available.
Stalling the CPU bus at the snoop phase is typically done for a fixed number of clock cycles. That is, historical differences between the peripheral bus (and peripheral device) speed and the CPU bus speed indicate that the peripheral bus or data on the peripheral bus will be made available some time after a transaction is completed on the CPU bus. The next transaction to the peripheral is then stalled a fixed amount of time mandated by the historically derived differences in the bus speeds. Thus, regardless of destinations for the subsequent transactions, the current transactions are stalled a fixed number of clock cycles to allow the peripheral bus to clear. This, unfortunately, will penalize throughput of all subsequent cycles (including memory cycles).
In an attempt to immediately service transactions to local memory (i.e., system memory of substantially contiguous semiconductor memory space), many conventional techniques allow memory cycles to be completed through the CPU bus ahead of cycles to peripheral devices. This involves a technique known as cycle “deferral” of preceding, slower peripheral-destined cycles, and allowing faster, memory-destined cycles to be drawn from the in-order queue of the pipeline.
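The throughput penalty of a fixed stall can be seen in a deliberately simplified model. The sketch below assumes a strictly in-order pipeline in which each peripheral-destined transaction is stalled for the same number of clocks; the function name, the destination labels and the stall length are hypothetical and chosen only for illustration.

```python
def added_delay_fixed_stall(destinations, fixed_stall):
    """Per-transaction delay (in clocks) added by fixed snoop stalls.

    Simplified model: transactions complete in order, so a stall applied to
    one transaction also delays every transaction queued behind it.
    """
    delays = []
    accumulated = 0
    for dest in destinations:
        if dest == "peripheral":
            accumulated += fixed_stall  # fixed stall, regardless of what follows
        delays.append(accumulated)
    return delays
```

Under this model, a single peripheral cycle followed by two memory cycles with an 8-clock stall yields a delay of 8 clocks for all three transactions, which is the penalty on memory cycles that the text describes.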
Referring to FIG. 3 and the two-transaction example shown, deferral of first transaction 8 may occur at the snoop phase by tagging transaction 8 and allowing the second transaction 9 to proceed as cycles 9e and 9f within respective response and data transfer phases. In this manner, priority is given to a transaction which must be quickly serviced over that of another transaction which need not be transferred as quickly, possibly due to the slower nature of its destination device. Accordingly, the example shown in FIG. 3 illustrates a first transaction 8 destined for a slower peripheral device coupled to either a mezzanine bus or a peripheral bus, whereas the second transaction 9 is destined for semiconductor memory.
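The reordering effect of deferral in the two-transaction example can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of any particular bus interface unit: the function name, the tuple layout and the `peripheral_busy` flag are hypothetical, and real deferral involves a later deferred-reply transaction that is not modeled here.

```python
from collections import deque


def order_with_deferral(transactions, peripheral_busy=True):
    """Return the completion order for (id, destination) tuples given in issue order."""
    in_order = deque(transactions)  # models the in-order queue of the pipeline
    deferred = []
    completed = []
    while in_order:
        tid, dest = in_order.popleft()
        if dest == "peripheral" and peripheral_busy:
            deferred.append(tid)   # tagged at the snoop phase, completed later
        else:
            completed.append(tid)  # proceeds to the response and data transfer phases
    # Deferred transactions finish after the faster ones, in their original order.
    completed.extend(deferred)
    return completed
```

Calling `order_with_deferral([(8, "peripheral"), (9, "memory")])` returns `[9, 8]`: transaction 9 to memory completes ahead of the deferred transaction 8, matching the example of FIG. 3.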
An unf
Chin Kenneth T.
Coffee Clarence K.
Collins Michael J.
Johnson Jerome J.
Jones Phillip M.
Compaq Computer Corporation
Conley Rose & Tayon
Daffer Kevin L.
Kim Kenneth S.