Electrical computers and digital data processing systems: input/ – Access arbitrating
Reexamination Certificate
1999-08-17
2002-11-26
Dharia, Rupal (Department: 2181)
C710S107000, C710S305000, C711S146000
active
06487621
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention generally relates to an architecture, system and method for maintaining cache coherency and processor consistency within a multi-processor computer system. More particularly, this invention relates to maximizing throughput within multiple processor buses by minimizing snoop stall cycles or out-of-order transactions on at least one of the processor buses.
2. Description of the Related Art
Multi-processor systems are generally well known, whereby a set of processors are interconnected across a local or distributed network. The local network can be confined to a single computer and the distributed network can be one involving, for example, a LAN or WAN. Each processor may either be interconnected with other processors on a single processor bus or connected on its own respective processor bus separate from other processor buses. In the former instance, the processors are said to have been grouped as “clusters.” For example, a Pentium® Pro processor bus can support up to four Pentium® Pro processors. Each cluster can thereby be connected to a processor bus and routed to a system memory bus via a system memory controller or bus bridge.
Most modern-day processor buses use a pipelined architecture. More specifically, different stages (or phases) of one transaction can overlap with phases of other transactions, so that multiple phases of multiple transactions are serviced concurrently on a single processor bus. In the Pentium® Pro example, each transaction can include some or all of the following phases: arbitration phase, request phase, error phase, snoop phase, response phase, and data phase.
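The overlap of phases described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every phase takes exactly one clock and that a new transaction may begin every `issue_interval` clocks; the function name `pipeline_schedule` and these timing values are hypothetical and do not come from the Pentium® Pro bus specification.

```python
# Phase names as listed in the text, in the order a transaction passes through them.
PHASES = ["arbitration", "request", "error", "snoop", "response", "data"]


def pipeline_schedule(num_transactions, issue_interval=1):
    """Return {clock: [(transaction, phase), ...]} for an idealized pipelined bus.

    Assumption (not from the source): each phase lasts one clock and a new
    transaction starts every `issue_interval` clocks.
    """
    schedule = {}
    for txn in range(num_transactions):
        start = txn * issue_interval
        for offset, phase in enumerate(PHASES):
            schedule.setdefault(start + offset, []).append((txn, phase))
    return schedule


if __name__ == "__main__":
    for clock, active in sorted(pipeline_schedule(3).items()):
        print(clock, active)
```

Under these assumptions, at any given clock several transactions occupy different phases, which is the sense in which a single bus services multiple phases of multiple transactions.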
During an arbitration phase, the processor bus requesting agent seeks mastership of its respective bus, as granted by an arbiter. A processor is deemed a bus agent and, if multiple processors are arranged in a cluster, arbitration among those processors is performed using, for example, a prioritized or symmetric bus arbitration mechanism. Once a requesting agent is granted ownership or mastership of the bus, that agent will drive an address within a request phase. Provided no errors are discovered for that transaction, as recorded in the error phase, a snoop phase is initiated. In the snoop phase, cache coherency is enforced. Namely, all bus agents which receive a snoop cycle route a hit or a hit-to-modified signal to the bus agent which requested the snoop cycle. The resulting response of the transaction is then driven during the response phase by the responding bus agent. Thereafter, a data phase will occur for that transaction, provided it had not been deferred.
The snoop results are driven by the processor bus agents during the snoop phase. Those results indicate whether the corresponding snoop request address references a valid or dirty cache line in the internal cache of a bus agent coupled to the processor bus. The dirty cache line is often referred to as a “modified” cache line. The values of HIT# and HITM# are used to indicate whether the line is valid or invalid in the addressed agent being snooped, whether the line is dirty (modified) in the caching agent, or whether the snoop phase needs to be extended. The bus agent being snooped (i.e., “caching” agent) will assert HIT# and de-assert HITM# in the snoop phase if the agent plans to retain the cache line in its cache after the snoop is completed. The caching agent will assert HITM# if its cache line is in a modified state (i.e., indicative of the caching agent containing the most recent cache line data for that address). After asserting HITM#, the caching agent will assume responsibility for writing back the modified cache line, often referred to as “implicit write back.” If the caching agent asserts HITM# and HIT# together in the snoop phase, then a snoop stall will occur so as to stretch the completion of the snoop phase for as long as needed to ensure the caching agent will eventually be ready to indicate snoop status.
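The HIT#/HITM# combinations described above can be summarized as a small decoding sketch. The booleans model whether each signal is asserted (not its electrical level), and the names `SnoopResult` and `decode_snoop` are hypothetical. The case where neither signal is asserted is not spelled out in the text; treating it as "line not retained" is an inference.

```python
from enum import Enum


class SnoopResult(Enum):
    NOT_RETAINED = "line invalid or not retained"   # inferred case, see lead-in
    HIT = "line valid and retained by caching agent"
    HIT_MODIFIED = "line modified; implicit write back follows"
    STALL = "snoop phase extended"


def decode_snoop(hit_asserted: bool, hitm_asserted: bool) -> SnoopResult:
    """Map the asserted/de-asserted state of HIT# and HITM# to a snoop result."""
    if hit_asserted and hitm_asserted:
        # Both asserted together: the caching agent is not ready, so the
        # snoop phase is stretched (snoop stall).
        return SnoopResult.STALL
    if hitm_asserted:
        # The caching agent holds the most recent data and will write it back.
        return SnoopResult.HIT_MODIFIED
    if hit_asserted:
        # The caching agent plans to keep the line after the snoop completes.
        return SnoopResult.HIT
    return SnoopResult.NOT_RETAINED


if __name__ == "__main__":
    for hit in (False, True):
        for hitm in (False, True):
            print(hit, hitm, decode_snoop(hit, hitm).name)
```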
If a DEFER# signal is forwarded during the snoop phase from the caching agent, that agent will effectuate removal of that particular transaction from the in-order queue, often referred to as an “IOQ”. During the response phase, responses to a DEFER# signal forwarded during the snoop phase will be indicated by one of three valid responses: deferred response, retry response, or hard error response. If a DEFER# is initiated during a snoop cycle and a response indicates either a deferred response or a retry response, it will be noted that the deferred transaction will be requested out-of-order from its original request. According to one example, the deferred request may occur subsequent to a snooping request cycle to indicate an out-of-order sequence or, alternatively, split transaction.
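A rough sketch of how deferral reorders completion relative to the in-order queue follows. It is a simplified model, not the bus protocol itself: it assumes that a transaction receiving a deferred or retry response completes only after every non-deferred transaction already queued, and that a hard error response means the transaction does not complete. The names `completion_order` and `DEFER_RESPONSES` are hypothetical.

```python
from collections import deque

# The three valid responses the text lists for a transaction on which
# DEFER# was asserted during the snoop phase.
DEFER_RESPONSES = {"deferred", "retry", "hard_error"}


def completion_order(requests, defer_responses):
    """Return (completion order, failed transactions).

    requests: transaction names in the order they were requested.
    defer_responses: maps a transaction to its response if DEFER# was
        asserted for it; transactions absent from the map were not deferred.
    """
    ioq = deque(requests)  # in-order queue (IOQ)
    in_order, out_of_order, failed = [], [], []
    while ioq:
        txn = ioq.popleft()
        response = defer_responses.get(txn)
        if response is None:
            in_order.append(txn)          # retires in request order
        elif response not in DEFER_RESPONSES:
            raise ValueError(f"invalid response to DEFER#: {response!r}")
        elif response == "hard_error":
            failed.append(txn)            # assumed not to complete
        else:
            out_of_order.append(txn)      # removed from the IOQ, completes later
    return in_order + out_of_order, failed


if __name__ == "__main__":
    print(completion_order(["A", "B", "C"], {"A": "deferred"}))
```

In this model, deferring transaction A lets B and C retire first, so A completes out of order with respect to its original request.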
Most modern-day processor buses rely on procedures or operations that appear somewhat atomic, in that processor buses generally retire data in the order that transactions begin. Moreover, data transfers of one transaction are often dependent upon data transfers of another transaction. For example, completion of a request from memory may require implicit write back of data from a caching agent if the request is to a modified line in that caching agent.
FIG. 1 illustrates the atomic nature of transactions and the dependency of those transactions within a pair of processor buses of a multi-processor system. For example, a first processor on a first processor bus “1” requests a transaction A on the first bus. Following transaction A, a snoop request A_s will be forwarded to the second processor on the second processor bus and specifically to the cache within the second processor. Meanwhile, the second processor dispatches transaction B on the second processor bus, eventually yielding a snoop transaction B_s on the first processor bus and specifically to the cache within the first processor. If both snoop requests yield a hit-to-modified signal (HITM#) being asserted, a live-lock condition may result whereby both buses are locked and unable to forward the data requested since that data is contingent upon receiving the modified cache line from the opposing bus' caching agent. More specifically, relative to the first bus, the modified data for transaction B cannot be driven on the first bus until transaction A receives its data. On the second bus, the modified data for transaction A cannot be driven on the second bus until transaction B receives its data. The pipeline transfer of responses and data is thereby maintained in a locked condition, thus preventing further transfers of data across Bus 1 and Bus 2.
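The circular dependency described for FIG. 1 can be made explicit with a small wait-for sketch. It is an illustrative model only: each item waits on at most one other item, and the labels and the functions `build_waits_for` and `find_cycle` are hypothetical names, not terms from the patent.

```python
def build_waits_for(a_snoop_hitm, b_snoop_hitm):
    """Build a wait-for map for transactions A (bus 1) and B (bus 2).

    a_snoop_hitm: the snoop for A hits a modified line on bus 2.
    b_snoop_hitm: the snoop for B hits a modified line on bus 1.
    """
    waits = {"A data (bus 1)": None, "B data (bus 2)": None}
    if a_snoop_hitm:
        # A needs the modified line written back on bus 2, and that write
        # back cannot be driven on bus 2 until B receives its data.
        waits["A data (bus 1)"] = "write back for A (bus 2)"
        waits["write back for A (bus 2)"] = "B data (bus 2)"
    if b_snoop_hitm:
        # Symmetric dependency in the other direction.
        waits["B data (bus 2)"] = "write back for B (bus 1)"
        waits["write back for B (bus 1)"] = "A data (bus 1)"
    return waits


def find_cycle(waits_for):
    """Return the items forming a circular wait, or None if there is none."""
    for start in waits_for:
        seen = []
        node = start
        while node is not None and node not in seen:
            seen.append(node)
            node = waits_for.get(node)
        if node is not None:
            return seen[seen.index(node):]
    return None


if __name__ == "__main__":
    print(find_cycle(build_waits_for(True, True)))   # circular wait
    print(find_cycle(build_waits_for(True, False)))  # no circular wait
```

In this model a circular wait appears only when both snoops hit modified lines; a hit-to-modified on a single bus leaves a chain that can drain.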
It may be that in order to prevent a live-lock condition, a DEFER# signal will need to be forwarded during the snoop phase. The DEFER# signal will be forwarded across the first and second buses as DEFER1# and DEFER2#, as shown by reference numerals 10 and 12, respectively. Asserting the defer signals whenever a hit-to-modified signal HITMx# occurs on that bus (where x is either 1 or 2) will ensure that all transactions on the respective buses will be deferred. Even if a hit-to-modified signal is present on only one bus, transactions on both buses may be deferred. Even though the likelihood of a hit-to-modified signal occurring on both buses is relatively small, the technique of deferring transactions on both buses not only may be unnecessary, but also consumes substantial bus bandwidth since the deferred transaction must later be completed with a deferred reply.
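The cost of this conservative deferral can be illustrated with a small counting sketch. It assumes a hypothetical trace of snoop outcomes, one pair of booleans per pair of concurrent transactions, indicating whether a hit-to-modified occurred on bus 1 and on bus 2. The function name `count_defer_events` and the trace format are assumptions for illustration only.

```python
def count_defer_events(snoop_pairs):
    """Compare conservative deferral with the cases that risk live-lock.

    snoop_pairs: iterable of (hitm_on_bus1, hitm_on_bus2) booleans.
    Returns (deferred_events, livelock_risk_events):
      deferred_events: pairs deferred when any hit-to-modified triggers
          deferral on both buses (the conservative technique in the text).
      livelock_risk_events: pairs where hit-to-modified occurred on both
          buses, the only case the text identifies as leading to live-lock.
    """
    deferred_events = 0
    livelock_risk_events = 0
    for hitm1, hitm2 in snoop_pairs:
        if hitm1 or hitm2:
            deferred_events += 1
        if hitm1 and hitm2:
            livelock_risk_events += 1
    return deferred_events, livelock_risk_events


if __name__ == "__main__":
    trace = [(False, False), (True, False), (False, True), (True, True)]
    print(count_defer_events(trace))
```

The gap between the two counts corresponds to deferrals that, under the text's description, may be unnecessary yet still consume bus bandwidth for the later deferred reply.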
Alternatively, the multi-processor system may utilize a central tag controller which looks up all coherent transaction addresses in a tag filter to see if the remotely located processor bus agent may own the snooped address. If there is a hit to the tag filter, the snooping agent maintains ownership of its respective processor bus. This allows the local transaction to complete on the local processor bus. If the transaction on the remote bus hits a modified address noted in the tag filter, the remote processor will be required to d
Compaq Information Technologies Group L.P.
Conley & Rose & Tayon P.C.
Daffer Kevin L.