Method and apparatus for history-based movement of shared-data in coherent cache memories

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate


Details

C711S137000, C711S144000

active

06711651

ABSTRACT:

BACKGROUND
1. Technical Field
The present invention relates generally to memories in computer processing systems and, in particular, to a method and apparatus for history-based movement of shared-data in coherent cache memories.
2. Background Description
With respect to prefetching in coherent cache memories, the prior art uses history information to help prefetch invalidated cache lines into a node/processor before the invalidated cache lines are used again. For example, the Cosmos “coherence message predictor” and the “memory sharing predictor” predict the source and type of coherence messages for a cache line in a multiprocessor (MP) computer system using complex prediction logic. The Cosmos coherence message predictor is described by Hill et al., in “Using Prediction to Accelerate Coherence Protocols”, Proceedings of the 25th Annual International Symposium on Computer Architecture (ISCA), Barcelona, Spain, Jun. 27-Jul. 2, 1998, pp. 179-90. The memory sharing predictor is described by Falsafi et al., in “Memory Sharing Predictor: The Key to a Speculative Coherent DSM”, Proceedings of the 26th International Symposium on Computer Architecture (ISCA), Atlanta, Ga., May 2-4, 1999, pp. 172-83.
The goal of both of the preceding predictors is to predict incoming messages that affect memory blocks and to execute those messages in a timely manner. Thus, the two approaches are basically similar in that both rely on the ability to predict likely sequences of messages and to execute the predicted messages speculatively. However, both approaches suffer from the long learning overhead required to accurately predict “follow-on” messages.
Accordingly, it would be desirable and highly advantageous to have a method and apparatus for moving data in coherent cache memories which does not suffer from a long learning time overhead and yet can still predict necessary data movements and move such data to other processors early enough to avoid long latency cache misses and excessive coherence traffic.
SUMMARY OF THE INVENTION
The problems stated above, as well as other related problems of the prior art, are solved by the present invention, a method and apparatus for history-based movement of shared-data in coherent cache memories.
The invention reduces data misses and access latencies in coherent cache memories in a multiprocessor (MP) computer system. The invention employs mechanisms for causing the pushing of data from one node/processor to other nodes/processors in an MP computer system based on memory sharing patterns. The concept is to monitor and keep a history of which processors in the system consume a particular datum or data (hereinafter “data”) that is produced by another node/processor, and push the produced data (e.g., cache line(s)) to the consuming nodes/processors. Cache lines may be pushed from one cache to one or more other caches at other nodes/processors. Thus, the invention can reduce the latency associated with lateral interventions.
In contrast to the prior art, the invention is not concerned with the sequence in which messages arrive, and does not affect the way messages get executed. The invention simply identifies nodes/processors that are likely to use newly produced data, and attempts to timely move copies of the data closer to the consuming nodes/processors to reduce possible cache misses that must be intervened by other nodes/processors in the system. The invention uses a simple history gathering technique, and does not require a complex prediction mechanism to predict likely messages that must follow in series. The invention employs a data-centric approach that actively and aggressively moves data closer to the consuming processor(s), and incurs only a minimal learning overhead, in contrast to the prior art.
According to a first aspect of the invention, there is provided a method for moving at least one of instructions and operand data throughout a plurality of caches included in a computer system, wherein each of the plurality of caches is included in one of a plurality of processing nodes of the system. The method includes the step of storing a plurality of entries in a table attached to each of the plurality of caches, wherein each of the entries is associated with a plurality of storage elements in one of the plurality of caches and includes information of prior usage of the plurality of storage elements by each of the plurality of processing nodes.
According to a second aspect of the invention, there is provided a method for moving at least one of instructions and operand data throughout a plurality of caches included in a computer system, wherein each of the plurality of caches is included in one of a plurality of processing nodes of the system. The method includes the step of storing a plurality of entries in a table attached to each of the plurality of caches, wherein each of the entries is associated with a plurality of storage elements in one of the plurality of caches and includes information of prior usage of the plurality of storage elements by each of the plurality of processing nodes. Upon a miss by a given processing node to a given cache included therein, any given storage elements that caused the miss are transferred to the given cache from one of main memory and another cache. A given entry that is associated with the given storage elements is created in the table.
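As a concrete illustration of the first through fourth aspects, the history table attached to a cache might be modeled in software as follows. This is a minimal sketch, not the apparatus claimed; the names `HistoryTable`, `on_miss`, and `on_displace` are illustrative, and the entries here are keyed by a single line address rather than by a group of storage elements.

```python
class HistoryTable:
    """One table per cache; maps a cache-line address to the set of
    node IDs recorded as prior consumers of that line."""

    def __init__(self):
        self.entries = {}  # line address -> set of consumer node IDs

    def on_miss(self, line_addr):
        # A miss transfers the line into the cache from main memory or
        # another cache; a fresh history entry is created for it.
        self.entries.setdefault(line_addr, set())

    def on_displace(self, line_addr):
        # When a line is evicted to make room for incoming data, the
        # entry associated with it is invalidated, if one exists.
        self.entries.pop(line_addr, None)
```

A miss followed by a later eviction would then look like `table.on_miss(0x1000)` and `table.on_displace(0x1000)`; displacing a line with no entry is a harmless no-op.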
According to a third aspect of the invention, the method further includes the step of displacing at least one existing storage element in the given cache to make room for the given storage elements.
According to a fourth aspect of the invention, the method further includes the step of invalidating an entry associated with the displaced at least one existing storage element, if such an entry exists.
According to a fifth aspect of the invention, upon a processing node performing a store operation that updates at least one storage element in a cache included in the processing node, the method further includes the step of searching the table for an entry associated with the at least one storage element. If the entry is found, the at least one storage element is requested to be transmitted to any processing nodes identified in the entry based upon the information stored in the entry.
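The fifth aspect, sketched below under simplifying assumptions: the producing node's table maps a line address to a set of consumer node IDs, and `push_line` stands in for whatever interconnect transfer the hardware would actually perform. The function name and signature are illustrative, not taken from the claims.

```python
def on_store(table, line_addr, push_line):
    """After a store updates line_addr, push copies of the line to every
    node previously recorded in the table as a consumer of that line.

    table: dict mapping line address -> set of consumer node IDs.
    push_line: callback(node_id, line_addr) that transmits the line.
    Returns the number of pushes issued."""
    consumers = table.get(line_addr)
    if consumers is None:
        return 0                    # no history entry: nothing to push
    for node in consumers:
        push_line(node, line_addr)  # move the fresh data toward consumers
    return len(consumers)
```

Because the push is driven by recorded history rather than by an incoming request, the copies can arrive before the consuming nodes miss on the line, which is the latency saving the invention targets.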
According to a sixth aspect of the invention, upon a request by a processing node for a storage element in a cache included in another processing node, the method further includes the step of searching the table for an entry corresponding to the requested storage element. If the entry is found, the information stored in the entry is updated to indicate that the requested storage element has been sent to the other processing node. If the entry is not found, a new entry is created in the table for the requested storage element.
According to a seventh aspect of the invention, the creating step includes the step of identifying a usage of the requested storage element by the other processing node.
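The sixth and seventh aspects can be sketched together: a remote request for a line either updates the existing history entry or creates one, and in either case the requester's usage of the line is recorded. As before, this is an illustrative software model, with the table simplified to a dict of consumer sets.

```python
def on_remote_request(table, line_addr, requester):
    """Record that `requester` has consumed line_addr from this cache.

    table: dict mapping line address -> set of consumer node IDs."""
    if line_addr in table:
        table[line_addr].add(requester)   # update the existing entry
    else:
        table[line_addr] = {requester}    # create a new entry and note
                                          # the requester's first usage
```

On a subsequent store to the same line, the recorded set is exactly the list of nodes to which the updated data would be pushed.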
According to an eighth aspect of the invention, upon a receipt of a storage element by a processing node from another processing node that recently updated the storage element, the method further includes the step of determining whether the received storage element is to be stored in the cache included in the processing node, based upon a current content of the cache.
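The eighth aspect leaves the acceptance decision to the receiving node. The sketch below uses one plausible policy — keep a pushed line only if its cache set has a free way, so the push never displaces resident data — which is an assumption for illustration, not the policy fixed by the claims.

```python
def accept_pushed_line(cache_set, line_addr, ways):
    """Decide whether a pushed line is kept, based on current contents.

    cache_set: list of line addresses resident in the target set.
    ways: associativity (maximum lines) of that set."""
    if line_addr in cache_set:
        return True                  # already present: refresh in place
    if len(cache_set) < ways:
        cache_set.append(line_addr)  # free way available: keep the copy
        return True
    return False                     # set full: drop rather than displace
```

A more aggressive receiver could instead evict a victim to make room; the trade-off is the usual one between the benefit of the prefetched copy and the cost of displacing data already in use.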
These and other aspects, features and advantages of the present invention will become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.


REFERENCES:
patent: 5214766 (1993-05-01), Liu
patent: 5829023 (1998-10-01), Bishop
patent: 6038644 (2000-03-01), Irie et al.
patent: 6065058 (2000-05-01), Hailpern et al.
patent: 6311187 (2001-10-01), Jeyaraman
Abdel-Shafi et al., “An Evaluation of Fine-Grain Producer Initiated Communication in Cache Coherent Multiprocessors,” Int'l Symposium on High Performance Computer Architecture (HPCA), Feb. 1-5, 1997, pp. 204-215.
Hauswirth et al., “A Component and Communication Model For Push Systems,” ACM SIGSOFT Software Engineering Notes.
