Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1998-11-12
2002-04-23
Gossage, Glenn (Department: 2187)
C711S122000, C711S146000
active
06378048
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to an improved cache coherency scheme in a multi-processor system.
FIG. 1 illustrates a typical multi-processor system having a plurality of agents 10-50. The plurality of agents 10-50 are in communication with each other over a common external bus 60. An “agent” may be anything that communicates over the external bus, including microprocessors, input/output devices, memory systems and special-purpose chipsets or digital signal processors. The agents 10-50 communicate over the external bus 60 using a pre-defined protocol. Typically, one of the agents, such as 50, is a memory storing data. During operation, other agents 10-40 may share the same data. Cache coherency systems ensure that each agent operates on the most current copy of data available.
One such cache coherency system is the MESI (pronounced “messy”) system. The MESI system defines four states for data. One of the four states is applied to each copy of data stored in an agent's internal cache(s). The MESI states are:
Invalid: Although the agent may have cached a copy of data, the copy is unavailable to the agent. When the agent requires the data, the agent must fetch the data from external memory 50 or from another cache.
Shared: A cached copy is valid and possesses the same value as is stored in external memory 50. The agent may only read the data. Copies of the data may be stored in the caches of other agents. An agent 10 may not modify data having a shared state without first performing an external bus transaction to ensure that the agent has exclusive control over the copy of data.
Exclusive: The cached copy is valid and may possess the same value as is stored in external memory 50. When an agent 10 caches data having an exclusive state, it may read and write (modify) the data without an external cache coherency check. The data must be invalid in all other agents. The agent that stores data in an exclusive state is guaranteed to have the most up-to-date copy within the system somewhere in its cache hierarchy.
Modified: The cached copy is valid and is dirty. It may be more current than the copy stored in external memory 50. When an agent 10 caches data having a modified state, it may read and write (modify) the data without an external cache coherency check. The data must be invalid in all other agents. The agent that stores data in a modified state is guaranteed to have the most up-to-date copy within the system somewhere in its cache hierarchy.
Only one state may attach to a given copy of data. For example, a single copy of data may not be both modified and shared. However, as described below, a single agent may store copies of data in multiple caches. Some copies may be assigned a different state than other copies in a single agent. Further explanation of MESI principles may be found in the Pentium Pro Family Developer's Manual, Volume 1: Specifications, ISBN 1-55512-259-0 (1996).
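The four states and the permissions attached to them can be summarized in a short sketch. It is illustrative only and is not taken from the patent: the names State, can_read, can_write_without_bus and is_dirty are hypothetical, and the rules encode only what the state descriptions above say.

```python
from enum import Enum


class State(Enum):
    """The four MESI states; each cached copy carries exactly one."""
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"


def can_read(state):
    """Any valid copy (shared, exclusive or modified) may be read locally."""
    return state is not State.INVALID


def can_write_without_bus(state):
    """Only exclusive or modified copies may be written without an
    external cache coherency check; a shared copy needs a bus
    transaction first, and an invalid copy must be fetched."""
    return state in (State.EXCLUSIVE, State.MODIFIED)


def is_dirty(state):
    """Only a modified copy is dirty, i.e. possibly newer than memory."""
    return state is State.MODIFIED
```

Because State is an enumeration, a copy holds a single member at a time, which mirrors the rule that only one state may attach to a given copy of data.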
FIG. 2 illustrates a multiple cache system that may be used in an agent 10. High performance agents often include multiple caches arranged in layers to reduce the impact of memory latency and bandwidth. A lowest-layer cache (“LØ”) typically has a small capacity but is designed to be very fast. One or more higher layer caches L1, L2 typically are larger than the LØ cache but are accessed at a lower frequency. The higher layer caches L1, L2, however, still operate at a much higher frequency than external memory 50. Copies of a single piece of data may be stored in multiple caches. State information is stored with each copy. Further, the state of data may be different in different layers. For example, data may be read into an L1 cache as exclusive data and, later, be read to the LØ cache and modified. The L1 copy remains in an exclusive state even though the LØ copy is in a modified state.
In a multi-layer cache system, the MESI system can cause cache coherency problems within an agent 10. The goal of cache coherency systems is to provide the most current copy of data to any agent that will use the data. Certain data eviction policies can cause an agent to obtain access to stale data.
The state of a copy of data determines how it is evicted from a cache. When a cache is full, new data may not be stored in the cache until old data is “evicted” from the cache. For old data stored in an invalid, exclusive or shared state, eviction occurs simply by writing the new data over the old data. This copy of old data is lost, but it is guaranteed that the same, or possibly a more up-to-date, copy of the data still exists somewhere within the system. For modified data, however, data eviction requires that the modified data be output to a higher layer cache or to a memory before it can be overwritten in the first cache. A copy of data in a modified state could be the only current copy of data in the system. If it were overwritten, the most up-to-date copy of the data could be lost to the system.
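The eviction rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function evict and the dictionary layout (address mapped to a state letter and a data value) are hypothetical.

```python
def evict(cache, address, next_level):
    """Remove `address` from `cache`, writing modified data out first.

    `cache` maps address -> (state, data), where state is one of
    "M", "E", "S", "I". `next_level` maps address -> data and stands
    in for a higher layer cache or memory (an assumption of this
    sketch). Returns True if a writeback was needed.
    """
    state, data = cache.pop(address)
    if state == "M":
        # A modified copy may be the only current copy in the system,
        # so it must be output before its slot can be reused.
        next_level[address] = data
        return True
    # Invalid, exclusive and shared copies are simply dropped.
    return False
```

For example, evicting a modified line at a hypothetical address 0x40 updates next_level before the line disappears, while evicting a shared line leaves next_level untouched.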
Under the MESI system, bus transactions from other agents are interpreted by a first agent 10 in one of two ways: The request may be interpreted as a “Go-to-Invalid” snoop which causes the agent 10 to mark all cached copies of the requested data as invalid. Alternatively, the request may be interpreted as a “Go-to-Shared” snoop which causes the agent 10 to mark cached copies of valid data as shared. In either case, when a snoop implicates modified data, the modified data is provided by the first agent 10 to the second agent 20 via an implicit writeback. The MESI state changes that result from a Go-to-Shared snoop in a MESI system are shown in FIG. 3. The modified-to-shared transition may cause cache coherency to be broken within the agent.
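The two snoop interpretations can be sketched as a single function. The sketch is a simplified model under stated assumptions: the function name snoop, the kind strings and the per-cache list layout are hypothetical and do not come from the patent.

```python
def snoop(copies, kind):
    """Apply a snoop to every cached copy of one piece of data.

    `copies` maps a cache name to a mutable [state, data] pair.
    `kind` is "go_to_invalid" or "go_to_shared" (names assumed here).
    Returns the data supplied by implicit writeback when a modified
    copy is implicated, otherwise None.
    """
    if kind not in ("go_to_invalid", "go_to_shared"):
        raise ValueError("unknown snoop kind: " + kind)
    writeback = None
    for entry in copies.values():
        state, data = entry
        if state == "M":
            # Modified data is handed to the requesting agent.
            writeback = data
        if kind == "go_to_invalid":
            entry[0] = "I"
        elif state != "I":
            # Go-to-Shared: every valid copy becomes shared.
            entry[0] = "S"
    return writeback
```

Note that a Go-to-Shared snoop in this model marks an older exclusive copy in a higher layer cache as shared right alongside the formerly modified copy, which is the transition the text identifies as the source of the coherency problem.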
An agent may obtain access to stale data as shown in the following example. An agent 10 reads in data, such as a counter, and modifies it. According to such a process, the data's initial value may be stored in the L1 cache in an exclusive state and the data's modified value may be stored in the LØ. The data is snooped by another agent 20 as part of a read request. The snoop is interpreted as a “Go-to-Shared” snoop in which the agent 10 marks all matching copies of the data as shared. Also, the current value of the modified copy is written back to the snooping agent 20 and memory 50 by an implicit writeback. Once the modified data is marked as shared, it is subject to the data eviction policies of ordinary shared data.
TABLE 1
                      Before Snoop    After Snoop     After Eviction
                      State   Data    State   Data    State   Data
External Memory 50    -       0       -       1       -       1
L1 Cache              E       0       S       0       S       0
L0 Cache              M       1       S       1       Overwritten
Later, the copy in the LØ cache may be overwritten. However, the stale data in the L1 cache remains. If the agent 10 were to require the data, the agent would obtain and use the stale data from the L1 cache. This causes coherency problems.
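The sequence in Table 1 can be replayed with a few lines of code. The sketch below is an illustration only: the function run_table_1 and its dictionary layout are hypothetical, and the values are the ones given in Table 1.

```python
def run_table_1():
    """Replay Table 1 and return (memory value, value the core reads)."""
    # Before snoop: memory is stale, L1 holds the old value as
    # exclusive, and the lowest-layer cache holds the new value.
    memory = 0
    l1 = {"state": "E", "data": 0}
    l0 = {"state": "M", "data": 1}

    # Go-to-Shared snoop: implicit writeback of the modified copy,
    # then every valid copy is marked shared.
    memory = l0["data"]
    l1["state"] = "S"
    l0["state"] = "S"

    # Eviction: a shared copy is overwritten without a writeback.
    l0 = None

    # Later read by the core: miss in the lowest layer, hit in L1.
    value_seen = l0["data"] if l0 is not None else l1["data"]
    return memory, value_seen
```

The function returns a memory value of 1 and a core-visible value of 0, that is, the agent reads the stale L1 copy even though memory already holds the current value.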
Earlier processors have addressed this cache coherency issue. The Pentium® Pro processor, commercially available from Intel Corporation, solved this cache coherency issue by marking as invalid all copies of requested data except one. First, it identified all copies of data that matched the requested data and marked all of them as shared. Second, if modified data were present, then it would go back and mark all stale copies as invalid. This two-step snoop state update also caused problems because it was not atomic. By marking stale data first as shared then as invalid, a small window of time existed when the processor core possibly could gain access to the stale data.
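The window left open by a two-step update can be made visible by exposing the intermediate state. The sketch below is a simplified model, not a description of the actual processor logic: the generator two_step_snoop_update is hypothetical, and it assumes that the copy kept after the second step is the formerly modified one.

```python
def two_step_snoop_update(copies):
    """Yield the cache states after each of the two update steps.

    `copies` maps a cache name to a (state, data) tuple.
    """
    had_modified = any(state == "M" for state, _ in copies.values())

    # Step 1: every valid matching copy is marked shared.
    step1 = {
        name: ("I" if state == "I" else "S", data)
        for name, (state, data) in copies.items()
    }
    yield step1

    # Step 2: if modified data was present, go back and invalidate
    # every copy except the one that had been modified.
    if had_modified:
        step2 = {
            name: (state if copies[name][0] == "M" else "I", data)
            for name, (state, data) in step1.items()
        }
    else:
        step2 = step1
    yield step2
```

With an exclusive L1 copy holding 0 and a modified lowest-layer copy holding 1, the first yielded state shows the stale L1 copy as shared and therefore readable; only the second yielded state marks it invalid. An atomic update would expose only the final state.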
Accordingly, there is a need in the art for a cache coherency scheme in a multi-cache agent that prevents the agent from gaining access to stale data that may be stored in one or more of the agent's caches. Further, there is a need in the art for such a scheme that permits snoop state updates to be atomic.
SUMMARY OF THE INVENTION
Embodiments of the present invention provide a cache coherency scheme in which a copy of data may take one of five states including an invalid state, an exclusive state, a shared state, a modified
Bachand Derek
Breuder Paul
Kumar Harish kumar
Lince Brent E.
Merrill Quinn W.
Gossage Glenn
Intel Corporation
Kenyon & Kenyon