Snoop blocking for cache coherency

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S140000, C711S141000, C711S152000

active

06460119

ABSTRACT:

BACKGROUND
The present invention relates to a cache coherency technique in an agent using a pipelined bus.
As is known, many modern computing systems employ a multi-agent architecture. A typical system is shown in FIG. 1. There, a plurality of agents 10-50 communicate over an external bus 60 according to a predetermined bus protocol. “Agents” may include general purpose processors, chipsets for memory and/or input/output devices, or other integrated circuits that process data requests. The bus 60 may be a “pipelined” bus in which several transactions may be in progress at once. Each transaction progresses through a plurality of stages, but no two transactions are in the same stage at the same time. The transactions complete in order. With some exceptions, transactions generally do not “pass” one another as they progress on the external bus 60.
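The pipelining rules described above (one transaction per stage, in-order completion) can be illustrated with a minimal sketch. The three stage names, the `simulate` function, and the `stall` parameter are assumptions made for illustration; they are not taken from the bus protocol itself.

```python
from collections import deque

# Hypothetical stage names; a real bus protocol may define more stages.
STAGES = ["request", "snoop", "data"]


def simulate(num_transactions, stall=None):
    """Advance transactions through STAGES, at most one stage per clock.

    stall maps (transaction_id, stage_name) to extra clocks spent in that stage.
    Returns the completion order and the per-clock stage occupancy.
    """
    stall = dict(stall or {})
    pipeline = [None] * len(STAGES)      # pipeline[i] holds the transaction in stage i
    waiting = deque(range(num_transactions))
    completed, history = [], []
    while waiting or any(t is not None for t in pipeline):
        # Walk from the last stage to the first so a freed stage can be refilled.
        for i in reversed(range(len(STAGES))):
            txn = pipeline[i]
            if txn is None:
                continue
            key = (txn, STAGES[i])
            if stall.get(key, 0) > 0:    # the transaction is held in this stage
                stall[key] -= 1
                continue
            if i == len(STAGES) - 1:     # leaving the last stage completes it
                completed.append(txn)
                pipeline[i] = None
            elif pipeline[i + 1] is None:
                pipeline[i + 1] = txn    # advance only into an empty stage
                pipeline[i] = None
        if pipeline[0] is None and waiting:
            pipeline[0] = waiting.popleft()
        history.append(tuple(pipeline))
    return completed, history


completed, history = simulate(3, stall={(0, "snoop"): 3})
```

Because each stage slot holds at most one transaction and a transaction advances only into an empty slot, a stalled transaction holds back those behind it and the completion order matches the issue order.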
In a multiple-agent system, two or more agents may have need for data at the same memory location at the same time. The agents 10-50 operate according to cache coherency rules to ensure that each agent 10 uses the most current copy of the data available to the system. According to many cache coherency systems, each time an agent 10 stores a copy of data, it assigns to the copy a state indicating the agent's rights to read and/or modify the data.
For example, the Pentium® Pro processor, commercially available from Intel Corporation, operates according to the “MESI” cache coherency scheme. Each copy of data stored in an agent 10 is assigned one of four states including:
Invalid: Although an agent 10 may have cached a copy of the data, the copy is unavailable to the agent. The agent 10 may neither read nor modify an invalid copy of data.
Shared: The agent 10 stores a copy of data that is valid and possesses the same value as is stored in external memory. An agent 10 may only read data in shared state. Copies of the data may be stored with other agents also in shared state. An agent 10 may not modify data in shared state without first performing an external bus transaction to gain exclusive ownership of the data.
Exclusive: The agent 10 stores a copy of data that is valid and may possess the same value as is stored in external memory. When an agent 10 caches data in exclusive state, it may read and modify the data without an external cache coherency check.
Modified: The agent 10 stores a copy of data that is valid and “dirty.” A copy cached by the agent 10 is more current than the copy stored in external memory. When an agent 10 stores data in modified state, no other agents possess a valid copy of the data.
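The access rights attached to the four states can be summarized in a short sketch. The `State` enum and the helper names are illustrative assumptions, not identifiers from any processor. The helper for modified state relies on the standard MESI rule that a modified line, like an exclusive one, can be written without a bus transaction.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"


def may_read(state: State) -> bool:
    # Every state except invalid holds a usable copy.
    return state is not State.INVALID


def may_modify_without_bus_transaction(state: State) -> bool:
    # Shared data needs an external transaction to gain ownership first.
    return state in (State.EXCLUSIVE, State.MODIFIED)


def is_newer_than_memory(state: State) -> bool:
    # Only a modified ("dirty") copy is more current than external memory.
    return state is State.MODIFIED
```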
Agents 10-50 exchange cache coherency messages, called “snoop responses,” during external bus transactions. The snoop responses identify whether other agents possess copies of requested data and, if so, the states in which the other copies are held. For example, when an agent 10 requests data held in modified state by another agent 20, the other agent 20 may provide the data to the requesting agent in an implicit writeback. Ordinarily, data is provided to requesting agents 10 by the external memory 50. The modified data is the most current copy of data available to the system and should be transferred to the requesting agent 10 in response to a data request.
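How snoop responses determine the source of the data can be sketched as follows. The `data_source` function and its dictionary input are hypothetical; the agent numbers mirror the reference numerals used in the text.

```python
def data_source(snoop_states):
    """Pick who supplies the requested data.

    snoop_states maps each responding agent's id to the MESI letter
    ("M", "E", "S" or "I") it reports for the requested address.
    """
    owners = [agent for agent, state in snoop_states.items() if state == "M"]
    if len(owners) > 1:
        # Two modified copies would already be a coherency violation.
        raise ValueError("more than one agent reports modified state")
    if owners:
        return owners[0]      # implicit writeback from the modifying agent
    return "memory"           # ordinary case: external memory supplies the data


source_when_dirty = data_source({20: "M", 30: "I", 40: "I"})   # agent 20
source_when_clean = data_source({20: "S", 30: "I", 40: "I"})   # "memory"
```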
When external bus transactions cause an agent to change the state assigned to a copy of data, state changes occur after snoop responses are globally observed.
As an example, consider a “read for ownership” request issued by an agent 10. Initially, an agent 10 may store the requested data in an invalid state. The agent 10 has a need for the data and issues a bus transaction requesting it. The agent 10 receives snoop responses from other agents 20-40. When the snoop responses are received, the transaction is globally observed. The agent 10 marks the requested data as held in exclusive state. The agent 10 may mark the data even though it has not yet received the requested data. For example, in known processors, data is transferred in a data phase of a transaction following a snoop phase. Before the data is received, an entry of an internal cache (not shown) is reserved for the data. A state field in the external transaction queue is marked as exclusive when the transaction is globally observed and before the requested data is received, but the state field in the reserved cache entry is not marked exclusive until the data is filled into the cache.
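The two-step marking described above, with one state field in the external transaction queue and another in the reserved cache entry, might be modeled as in the sketch below. The `PendingRead` class and its field and method names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingRead:
    """Hypothetical bookkeeping for one read-for-ownership request."""
    address: int
    queue_state: str = "I"    # state field in the external transaction queue
    cache_state: str = "I"    # state field in the reserved cache entry
    data: Optional[bytes] = None

    def on_globally_observed(self) -> None:
        # Snoop responses have arrived: only the queue entry becomes exclusive.
        self.queue_state = "E"

    def on_data_fill(self, data: bytes) -> None:
        # The data phase follows the snoop phase, so the fill comes second.
        if self.queue_state != "E":
            raise RuntimeError("data filled before global observation")
        self.data = data
        self.cache_state = self.queue_state


read = PendingRead(address=0x1000)
read.on_globally_observed()   # queue_state is "E", cache_state is still "I"
read.on_data_fill(b"\x00" * 32)   # cache_state is now "E" as well
```

Between the two calls the agent owns the line according to its transaction queue while the cache entry itself still reads as invalid, which is the window the following paragraphs are concerned with.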
Certain boundary conditions arise when state transitions are triggered by the receipt of snoop responses. An example is shown in the following table using the Pentium® Pro bus protocol:
In the boundary condition, without some sort of preventative measure, two different agents 10 and 20 in the system could mark a copy of the same data in exclusive state. To do so would violate cache coherency. Assume that two agents 10 and 20 post read requests to a single piece of data. The first agent 10 posts the request as explained above. When the first transaction concludes its request phase, the second agent 20 posts a second transaction for the same data.
Assume further that the snoop phase of the first transaction is stalled by a snoop stall. A snoop stall signal occurs when an agent (say, agent 30) requires additional time to generate snoop results. Although the first agent 10 may reserve a cache entry for the requested data, the agent 10 does not mark the requested data as exclusive until snoop results for its transaction are received. When snoop results eventually are received for the first transaction (in clock 8), the first agent 10 will mark the data as held in exclusive state. However, the first agent 10 observes the second transaction in clock 3. If it performs internal snoop inquiries for the second transaction before the first transaction is globally observed, its snoop response would indicate that it does not possess a valid copy of the data. The second agent 20 also could mark the data as exclusive. Having two agents 10, 20 each store data in exclusive state violates the MESI cache coherency rules because each agent 10, 20 could modify its copy of the data without notifying the other via a bus transaction.
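The timing of this boundary condition can be reduced to a small sketch. The clock values 3 and 8 come from the scenario above; the function names and the `block_until_observed` flag are hypothetical, and the model deliberately ignores everything except when the first agent answers the snoop probe.

```python
def first_agent_response(probe_clock, observed_clock, block_until_observed):
    """State that agent 10 reports for the second transaction's snoop probe."""
    if block_until_observed:
        # A blocked probe is deferred until the first transaction is observed.
        probe_clock = max(probe_clock, observed_clock)
    # Agent 10 marks the data exclusive only once its own read is globally observed.
    return "E" if probe_clock >= observed_clock else "I"


def coherency_violation(probe_clock=3, observed_clock=8, block_until_observed=False):
    """True if both agents would end up holding the data in exclusive state."""
    response = first_agent_response(probe_clock, observed_clock, block_until_observed)
    # Agent 20 may mark the data exclusive when agent 10 reports no valid copy,
    # while agent 10 still marks its own copy exclusive at observed_clock.
    return response == "I"


unblocked = coherency_violation(block_until_observed=False)   # True
blocked = coherency_violation(block_until_observed=True)      # False
```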
The coherency violation can arise if an agent 10 begins internal snoop inquiries before its previous transaction to the data is globally observed. Thus, the error can be avoided if the snoop inquiries related to the second transaction are blocked until a prior conflicting transaction related to the same data is globally observed.
The Pentium® Pro processor includes a snoop queue to manage cache coherency and generate snoop responses. The snoop queue buffers all transactions posted on the external bus. For new transactions, the snoop queue compares the address of the new transaction to addresses of transactions that it previously stored to determine whether the addresses match. If so, and if the previous transaction has not yet been globally observed, the snoop queue blocks a snoop probe for the new transaction. The block remains until snoop results for the prior pending transaction are received.
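The address-match blocking described above can be sketched in a few lines. The `SnoopQueue` class, its dictionary entries, and the exact-address comparison are assumptions made for illustration; they do not describe the processor's actual circuit, which would likely compare cache line addresses.

```python
class SnoopQueue:
    """Illustrative model: one entry per transaction posted on the bus."""

    def __init__(self):
        self.entries = []    # entries are kept in bus order

    def _has_unobserved_match(self, address, earlier):
        return any(e["address"] == address and not e["observed"] for e in earlier)

    def post(self, address):
        # Block the snoop probe if an earlier transaction to the same
        # address has not yet been globally observed.
        blocked = self._has_unobserved_match(address, self.entries)
        entry = {"address": address, "observed": False, "blocked": blocked}
        self.entries.append(entry)
        return entry

    def globally_observe(self, entry):
        # Snoop results arrived for this entry; re-evaluate later entries.
        entry["observed"] = True
        for i, e in enumerate(self.entries):
            e["blocked"] = self._has_unobserved_match(e["address"], self.entries[:i])


queue = SnoopQueue()
first = queue.post(0x40)     # not blocked
second = queue.post(0x40)    # blocked behind `first`
other = queue.post(0x80)     # different address, not blocked
queue.globally_observe(first)    # `second` is released
```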
The Pentium® Pro processor's snoop queue is large. The snoop queue possesses a queue entry for as many transactions as can be pending simultaneously on the external bus. It consumes a large area when the Pentium® Pro processor is manufactured as an integrated circuit. In future processors, it will be desirable to increase the pipeline depth of the external bus to increase the number of transactions that may proceed simultaneously thereon. However, increasing the depth of the external bus becomes expensive if it also requires increasing the depth of the snoop queue.
The Pentium® Pro processor's snoop queue fills quickly during operation. The snoop queue buffers not only requests from other agents but also requests posted by the agent to which the snoop queue belongs. Because the Pentium® Pro includes an external transaction queue that monitors transactions issued by the processor, the snoop queue's design is considered sub-optimal.
Accordingly, the
