Reducing probe traffic in multiprocessor systems using a...

Electrical computers and digital processing systems: memory – Storage accessing and control – Control technique

Reexamination Certificate

Details

C711S155000, C711S141000, C711S145000, C709S213000, C709S214000

active

06757793

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention is related to computer systems and, more particularly, to coherency mechanisms within computer systems.
2. Description of the Related Art
Typically, computer systems include one or more caches to reduce the latency of a processor's access to memory. Generally, a cache may store one or more blocks, each of which is a copy of data stored at a corresponding address in the memory system of the computer system.
Since a given block may be stored in one or more caches, and further since one of the cached copies may be modified with respect to the copy in the memory system, computer systems often maintain coherency between the caches and the memory system. Coherency is maintained if an update to a block is reflected by other cache copies of the block according to a predefined coherency protocol. Various coherency protocols are used. As used herein, a “block” is a set of bytes stored in contiguous memory locations which are treated as a unit for coherency purposes. In some embodiments, a block may also be the unit of allocation and deallocation in a cache. The number of bytes in a block may be varied according to design choice, and may be of any size. As an example, 32-byte and 64-byte blocks are often used.
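As an illustration of the block definition above, the following minimal sketch maps a byte address to the block that contains it. The 64-byte default, the power-of-two restriction, and the function names are assumptions made for the example; the text itself permits any block size.

```python
BLOCK_SIZE = 64  # bytes; assumed here, one of the sizes the text cites as common


def block_address(addr: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the address of the first byte of the block containing addr.

    Assumes block_size is a power of two so the offset bits can be masked off.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a positive power of two")
    return addr & ~(block_size - 1)


def same_block(a: int, b: int, block_size: int = BLOCK_SIZE) -> bool:
    """True if both byte addresses fall in the same coherency block."""
    return block_address(a, block_size) == block_address(b, block_size)
```

Two addresses that map to the same block address are treated as one unit for coherency purposes, regardless of which bytes within the block are actually touched.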
Many coherency protocols include the use of probes to communicate between various caches within the computer system. Generally speaking, a “probe” is a message passed from the coherency point in the computer system to one or more caches in the computer system to determine if the caches have a copy of a block and optionally to indicate the state into which the cache should place the block. The coherency point may transmit the probes in response to a command from a component (e.g. a processor) to read or write the block. Each probe receiver responds to the probe, and once the probe responses are received the command may proceed to completion. The coherency point is the component responsible for maintaining coherency, e.g. a memory controller for the memory system.
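The probe exchange described above can be sketched roughly as follows. This is a simplified model, not the protocol of any particular system: the class names, the three state strings, and the choice not to probe the cache that issued the command are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    name: str
    # block address -> "clean" or "modified" (hypothetical state names)
    blocks: dict = field(default_factory=dict)

    def probe(self, block: int, invalidate: bool) -> str:
        """Respond to a probe: report the state held, optionally invalidating."""
        state = self.blocks.get(block, "invalid")
        if invalidate and block in self.blocks:
            del self.blocks[block]
        return state


class CoherencyPoint:
    """Stands in for the component responsible for coherency, e.g. a memory controller."""

    def __init__(self, caches):
        self.caches = caches
        self.probes_sent = 0

    def handle_command(self, source: Cache, block: int, is_write: bool) -> dict:
        """Probe every cache other than the source and collect the responses."""
        responses = {}
        for cache in self.caches:
            if cache is source:  # assumption: the requester is not probed
                continue
            self.probes_sent += 1
            responses[cache.name] = cache.probe(block, invalidate=is_write)
        # Only once all responses are in hand may the command complete.
        return responses
```

In this model every command costs one probe per remote cache, and the command cannot finish until the slowest response arrives.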
Unfortunately, probes increase the bandwidth demands on the computer system and may increase the latency of the commands. Bandwidth demands are increased because the probes are transmitted through the interconnect of the computer system. Latency may increase because the probe responses are needed to verify that the data to be provided in response to the command is the correct copy of the block (i.e. that no cache stores an updated copy of the block). Accordingly, it is desirable to reduce the probe traffic in a computer system while still maintaining coherency.
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by a victim record table as described herein. The victim record table records victim blocks which have been returned from a cache to memory and which are not currently cached in any other caches. If a command affecting a block recorded in the victim record table is received, one or more probes corresponding to the command may be inhibited even if probes would ordinarily be transmitted for the command. Advantageously, system bandwidth which would be consumed by the probes may be conserved. Furthermore, since probes are inhibited, the latency of the command may be reduced since the command may be completed without waiting for any probe responses.
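A minimal sketch of this idea, under stated assumptions, is given below. The class and method names are hypothetical, the `cached_elsewhere` flag stands in for whatever mechanism tells the memory controller that no other cache holds the block, and removing a record once the block is read (because the reader now caches it) is an inference from the description rather than a quoted requirement.

```python
class VictimRecordTable:
    """Records blocks written back as victims and not cached anywhere else."""

    def __init__(self):
        self.records = set()

    def record_victim(self, block: int) -> None:
        self.records.add(block)

    def contains(self, block: int) -> bool:
        return block in self.records

    def invalidate(self, block: int) -> None:
        self.records.discard(block)


class MemoryController:
    def __init__(self, num_caches: int):
        self.num_caches = num_caches  # caches that would each receive a probe
        self.table = VictimRecordTable()
        self.probes_sent = 0

    def victim_block(self, block: int, cached_elsewhere: bool = False) -> None:
        """Accept a victim block; record it only if no other cache holds it."""
        if not cached_elsewhere:
            self.table.record_victim(block)

    def read(self, block: int) -> bool:
        """Handle a read command; return True if probes were issued."""
        if self.table.contains(block):
            # Block is known to be uncached: inhibit probes. The reader will
            # now cache it, so the record no longer holds (assumption).
            self.table.invalidate(block)
            return False
        self.probes_sent += self.num_caches
        return True
```

A read that hits the table completes without generating probe traffic and without waiting for probe responses; any other read follows the ordinary probe path.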
Since probes are selectively inhibited if an affected block is recorded in the victim record table, the size of the victim record table may be flexible. If a particular block is not represented in the victim record table, probes are performed when the particular block is accessed (even if the particular block could have been represented in the victim record table but is not because of a limited number of records). Thus, coherency is maintained even if not every uncached block is represented in the victim record table. Accordingly, the victim record table may be sized according to cost versus performance tradeoffs (and not according to concerns about correctly maintaining coherency).
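The sizing argument can be illustrated with a capacity-limited table. In the sketch below, the oldest-first replacement policy and all names are assumptions made for the example; the text does not specify how records are replaced. The point is only that a dropped record costs extra probes, never correctness.

```python
from collections import OrderedDict


class BoundedVictimTable:
    """Victim record table with a fixed number of records."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.records = OrderedDict()  # block address -> True, oldest first

    def record_victim(self, block: int) -> None:
        self.records.pop(block, None)  # re-recording refreshes the entry
        if len(self.records) >= self.capacity:
            self.records.popitem(last=False)  # drop the oldest record (assumed policy)
        self.records[block] = True

    def lookup_and_remove(self, block: int) -> bool:
        """True if the block was recorded; the record is consumed by the access."""
        return self.records.pop(block, None) is not None


def probes_needed(table: BoundedVictimTable, block: int) -> bool:
    """A table miss falls back to ordinary probing, so coherency still holds."""
    return not table.lookup_and_remove(block)
```

With a capacity of two, recording three victims drops the first; a later read of that first block simply issues probes as if the table did not exist.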
Broadly speaking, an apparatus is contemplated, comprising a table and a control circuit. The table is configured to store a plurality of records, wherein a first record of the plurality of records is configured to identify a first block previously received as a victim block by a memory controller. Coupled to the table, the control circuit is configured to inhibit issuance of one or more probes for a first read command responsive to the first read command accessing the first block.
Additionally, a computer system is contemplated comprising a memory, a memory controller coupled to the memory, and a source coupled to the memory controller. The memory controller includes a table configured to store a plurality of records, wherein a first record of the plurality of records is configured to identify a first block previously received by the memory controller as a victim block. The source is configured to transmit a first read command to the memory controller. The memory controller is configured to inhibit issuance of one or more probes for the first read command responsive to the first read command accessing the first block.
Still further, a method is contemplated. A table having a plurality of records is maintained, wherein each record of the plurality of records is configured to identify a respective block previously received by a memory controller as a victim block. One or more probes are selectively issued for a first read command responsive to whether or not a first block accessed by the first read command is identified by a first record of the plurality of records.

