Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2000-02-25
2002-05-14
Kim, Hong (Department: 2185)
Electrical computers and digital processing systems: memory
Storage accessing and control
Hierarchical memories
C711S144000, C711S131000, C711S167000, C711S155000, C711S168000
Reexamination Certificate
active
06389517
ABSTRACT:
BRIEF DESCRIPTION
The present invention relates generally to snoop filtering, and particularly to an apparatus and method for snoop filtering during an atomic operation.
BACKGROUND
FIG. 1
illustrates, in block diagram form, a typical prior art multi-processor System 30. System 30 includes a number of Processors, 32a, 32b, 32c, coupled via a shared Bus 35 to Main Memory 36. Each Processor 32 has its own non-blocking Cache 34, which is N-way set associative. Each cache index includes data and a tag to identify the memory address with which the data is associated. Additionally, coherency bits are associated with each item of data in the cache to indicate the cache coherency state of the data entry. According to the MOSI cache coherency protocol, each cache data entry can be in one of four states: M, O, S, or I. The I state indicates invalid data. The owned state, O, indicates that the data associated with a cache index is valid, has been modified from the version in memory, is owned by a particular cache, and that another cache may have a shared copy of the data. The processor with a requested line in the O state responds with data upon request from other processors. The shared state, S, indicates that the data associated with a cache index is valid and that one or more other processors share a copy of the data. The modified state, M, indicates valid data that has been modified since it was read into the cache and that no other processor has a copy of the data.
Cache coherency states help determine whether a cache access request is a miss or a hit. A cache hit occurs when one of the ways of a cache index includes a tag matching that of the requested address and the cache coherency state for that way does not indicate invalid data. A cache miss occurs when none of the tags of an index set matches that of the requested address or when the way with a matching tag contains invalid data.
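The hit/miss rule just described can be sketched in C. This is a minimal illustration under stated assumptions, not the patent's implementation: the names (`way_t`, `cache_hit`) and the 4-way set are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative 4-way set-associative cache set: each way holds a tag
   and a MOSI coherency state (names here are hypothetical). */
typedef enum { STATE_M, STATE_O, STATE_S, STATE_I } mosi_t;

typedef struct {
    uint32_t tag;    /* identifies the memory address held in this way */
    mosi_t   state;  /* coherency state of the entry */
} way_t;

#define NUM_WAYS 4

/* A lookup hits only if some way's tag matches the requested address's
   tag AND that way's state is not invalid (I); otherwise it misses. */
static bool cache_hit(const way_t set[NUM_WAYS], uint32_t tag)
{
    for (int w = 0; w < NUM_WAYS; w++)
        if (set[w].tag == tag && set[w].state != STATE_I)
            return true;
    return false;
}
```

Note that a matching tag alone is not enough: a way in state I misses even when its tag matches, exactly as the two miss cases above describe.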
FIG. 2
illustrates how MOSI cache coherency states transition in response to various types of misses. The events causing transitions between MOSI states are indicated using the acronyms IST, ILD, FST, and FLD. As used herein, “ILD” indicates an Internal LoaD, i.e., a load request from the processor associated with the cache. Similarly, “IST” indicates an Internal STore. “FLD” indicates that a Foreign LoaD caused the transition, i.e., a load request to the cache coming from a processor not associated with the cache, and “FST” indicates a Foreign STore.
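Since FIG. 2 itself is not reproduced here, the following next-state function encodes the standard textbook MOSI transitions driven by the four events named above; treat the exact transitions as an assumption rather than the figure's content, and the names (`mosi_next`, `event_t`) as illustrative.

```c
/* Simplified MOSI next-state function. Events: ILD/IST are internal
   load/store, FLD/FST are foreign load/store. These are the common
   textbook transitions; the patent's FIG. 2 may differ in detail. */
typedef enum { STATE_M, STATE_O, STATE_S, STATE_I } mosi_t;
typedef enum { EV_ILD, EV_IST, EV_FLD, EV_FST } event_t;

static mosi_t mosi_next(mosi_t cur, event_t ev)
{
    switch (ev) {
    case EV_IST:                  /* internal store: line becomes modified */
        return STATE_M;
    case EV_ILD:                  /* internal load: a miss fills in shared */
        return (cur == STATE_I) ? STATE_S : cur;
    case EV_FLD:                  /* foreign load: a modified owner keeps the
                                     line but demotes it to owned */
        return (cur == STATE_M) ? STATE_O : cur;
    case EV_FST:                  /* foreign store: local copy invalidated */
        return STATE_I;
    }
    return cur;
}
```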
“Snooping” refers to the process by which a processor in a multi-processor system determines whether a foreign cache stores a desired item of data. As used herein, a snoop represents a potential, future request for an eviction, e.g., a FLD or a FST, on a particular address. Each snoop indicates the desired address and operation. Every snoop is broadcast to every Processor 32 within System 30, but only one Processor 32 responds to each snoop. The responding Processor 32 is the one associated with the Cache 34 storing the data associated with the desired address. Each Processor 32 within System 30 includes an External Interface Unit (EIU), which handles snoop responses.
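The broadcast-and-respond behavior just described can be sketched as follows. The one-cached-address-per-processor model and the names (`proc_t`, `broadcast_snoop`) are simplifying assumptions for illustration, not the patent's design.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_PROCS 3

typedef struct {
    uint32_t cached_addr;   /* single cached address per processor, for brevity */
    bool     valid;         /* stands in for "coherency state is not I" */
} proc_t;

/* Every processor's EIU sees each broadcast snoop, but only the one
   whose cache validly holds the address responds. Returns the index of
   the responding processor, or -1 on a system-wide miss. */
static int broadcast_snoop(const proc_t procs[NUM_PROCS], uint32_t addr)
{
    for (int p = 0; p < NUM_PROCS; p++)          /* broadcast to all */
        if (procs[p].valid && procs[p].cached_addr == addr)
            return p;                            /* only this one responds */
    return -1;
}
```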
FIG. 3
illustrates, in block diagram form, EIU 40 and its coupling to Bus 35 and Cache 34. EIU 40 receives snoops from Bus 35. EIU 40 forwards each snoop to Cache Controller 42, which stores the snoop in Request Queue 46 until it can be filtered. Snoop filtering involves determining whether a snoop hits or misses in Cache 34 and indicating that to EIU 40. Given the architecture of FIG. 3, the latency between receipt of a snoop by EIU 40 and a response to it can be quite long under the best of circumstances. Snoop latency usually increases from its theoretical minimum in response to other pending cache access requests, such as a pending atomic operation. An atomic operation refers to a computational task that should be completed without interruption. Processors 32 typically implement atomic operations as two sub-operations on a single address, one sub-operation on the address following the other without interruption. One atomic operation, for example, is an atomic load, which is a load followed immediately and without interruption by a store to the same address. To protect the data associated with an atomic operation during its pendency, some processors cease filtering snoops, even though most snoops are for addresses other than that associated with the pending atomic operation. Two factors necessitate this approach. First, Cache 34 includes a single data-and-tag read-write port, which, in response to a hit, permits modification of both a cache line's data and tag. Second, most processors respond to a snoop hit by immediately beginning data eviction. This is unacceptable during an atomic operation; therefore, all access to Cache 34 is halted during the pendency of the atomic operation. However, the pendency of the atomic operation may be so long that EIU 40 is forced to back-throttle snoops. Other operations may also cause a processor to cease snoop filtering without regard to the addresses to be snooped. Thus, a need exists for an improved apparatus and method for filtering snoops independent of other pending cache access requests.
SUMMARY
The apparatus of the present invention permits snoop filtering to continue while an atomic operation is being executed. The snoop filtering apparatus includes first and second request queues and a cache. The first request queue tracks cache access requests, while the second request queue tracks snoops that have yet to be filtered. The cache includes a dedicated port for each request queue. The first port is dedicated to the first request queue and is a data-and-tag port, permitting modification of cache contents. In contrast, the second port is dedicated to the second request queue and is a tag-only port. Because the second port is a tag-only port, snoop filtering can continue during an atomic operation without fear of any modification of the data associated with the atomic address.
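A rough model of the two-port arrangement described above, under stated assumptions: a one-line cache, a boolean standing in for the coherency state, and hypothetical names (`data_port_write`, `snoop_filter`). The point it demonstrates is that a tag-only lookup has no path to the data, so snoop filtering can proceed even while an atomic operation holds the data-and-tag port.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t tag;       /* identifies the cached address */
    bool     valid;     /* coherency state != I, abstracted to one bit */
    uint32_t data;
} line_t;

typedef struct {
    line_t line;            /* a single cache line, for brevity */
    bool   atomic_pending;  /* set for the duration of an atomic operation */
} cache_t;

/* Data-and-tag port: serves the first (cache-access) request queue.
   It can modify cache contents, so it is blocked while an atomic
   operation is pending; returns false when the request must wait. */
static bool data_port_write(cache_t *c, uint32_t tag, uint32_t data)
{
    if (c->atomic_pending)
        return false;
    c->line.tag = tag;
    c->line.valid = true;
    c->line.data = data;
    return true;
}

/* Tag-only port: serves the second (snoop) request queue. It reads
   only the tag and valid bit, with no ability to touch data, so it
   remains usable during an atomic operation. */
static bool snoop_filter(const cache_t *c, uint32_t tag)
{
    return c->line.valid && c->line.tag == tag;   /* snoop hit? */
}
```

Because `snoop_filter` takes a `const` pointer, the compiler itself enforces the property the summary relies on: the snoop path cannot modify the data associated with the atomic address.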
REFERENCES:
patent: 5163140 (1992-11-01), Stiles et al.
patent: 5355467 (1994-10-01), Mac Williams et al.
patent: 5428761 (1995-06-01), Herlihy et al.
patent: 5706464 (1998-01-01), Moore et al.
patent: 5923898 (1999-07-01), Genduso et al.
patent: 5966729 (1999-10-01), Phelps
patent: 6073212 (2000-06-01), Hayes et al.
patent: 6098156 (2000-08-01), Lenk
patent: 6145054 (2000-11-01), Mehrotra et al.
patent: 6182201 (2001-01-01), Arimilli et al.
patent: 6209067 (2001-03-01), Collins et al.
patent: 6237064 (2001-05-01), Kumar et al.
Kuttanna Belliappa M.
Moudgal Anuradha N.
Tzeng Allan
Kim Hong
Pennie & Edmonds LLP