Split pending buffer with concurrent access of requests and...

Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area

Reexamination Certificate


Details

Classification: C711S156000, C711S168000, C711S149000, C711S131000
Type: Reexamination Certificate
Status: active
Patent number: 06405292

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to systems for processing memory requests in cache coherent distributed shared memory multiprocessor servers and, more specifically, to a system for achieving more efficient processing throughput and parallelism in the protocol processing of coherence operations in the coherence controllers of such servers.
2. Discussion of the Prior Art
The coherence controllers of cache coherent distributed shared memory multiprocessor systems use pending buffers to maintain the status of memory transactions in progress. They also provide a means for detecting collisions (i.e., multiple concurrent operations to the same memory address).
FIG. 1 illustrates a representation of such a multiprocessor system comprising a network 100 including one or more compute nodes 110a, . . . , 110n. Each compute node includes one or more processors with associated caches 120a, . . . , 120n, one or more shared/remote caches 130a, . . . , 130n, one or more main memory modules 140a, . . . , 140n, at least one memory directory 150a, . . . , 150n, at least one coherence controller 160a, . . . , 160n, and several I/O devices (not shown).
FIG. 2 illustrates a representation of the coherence controller 160 (also 200), including at least one network interface 210, at least one node interface 220, input operation queues 230, and one or more protocol engines 240. Each protocol engine 240 typically includes one pending buffer 250, one directory controller 260, and at least one protocol handling unit 270.
FIG. 3 illustrates a representation of a pending buffer 250 (also 300), which comprises “e” pending buffer entries 310, labeled entry 0 to entry e-1, where “e” is usually a power of 2. Each pending buffer entry 310 includes a valid bit 320 that indicates whether the contents of the entry are in use, an address 330 that indicates the memory line to which the pending buffer entry corresponds if the valid bit is set to ‘TRUE’, and other status fields 340 that indicate the status of the addressed memory line. Examples of status fields 340 include, to name a few, the current directory state, the directory state at the beginning of the transaction, the requester id, the pending buffer index of the original incoming request, identifiers (“ids”) of nodes expected to send responses, ids of nodes whose responses have been received, etc.
Pending buffers allow memory transactions to be split into request and response components that can be processed asynchronously, without preventing the coherence controllers from processing other memory coherence transactions between the requests and responses. Pending buffers are generally implemented as small fully-associative buffers that can be quickly queried to determine whether other transactions for the same memory address are in progress (a collision). If the coherence controller is multiple-issue (i.e., capable of receiving multiple operations in one cycle), the pending buffer needs to be multi-ported to match the throughput of the coherence controller and allow it to achieve its peak throughput. Unfortunately, implementing multi-ported fully-associative buffers is impractical unless the buffers are very small (smaller than the sizes usually needed for pending buffers). Thus there exists a need for a different organization of the pending buffer that enables processing of multiple operations concurrently.
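The conventional fully-associative organization described above can be sketched as a small behavioral model. This is a software illustration only, not the hardware structure; class and field names are illustrative assumptions, not taken from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class PendingEntry:
    """One pending buffer entry: valid bit, memory-line address, status fields."""
    valid: bool = False
    address: int = 0
    status: dict = field(default_factory=dict)  # e.g. directory state, requester id

class PendingBuffer:
    """Behavioral sketch of a fully-associative pending buffer of e entries."""
    def __init__(self, e=16):
        self.entries = [PendingEntry() for _ in range(e)]

    def lookup(self, address):
        """Associative lookup: return the index of a valid entry matching
        address (a collision), or None if no transaction is in progress."""
        for i, ent in enumerate(self.entries):
            if ent.valid and ent.address == address:
                return i
        return None

    def allocate(self, address, **status):
        """Allocate a free entry for a new transaction; return its index."""
        for i, ent in enumerate(self.entries):
            if not ent.valid:
                ent.valid = True
                ent.address = address
                ent.status = dict(status)
                return i
        raise RuntimeError("pending buffer full")

pb = PendingBuffer(e=8)
idx = pb.allocate(0x1000, requester=3)
assert pb.lookup(0x1000) == idx    # collision detected for the same line
assert pb.lookup(0x2000) is None   # no transaction in progress for this line
```

In hardware the `lookup` loop corresponds to comparing all entries in parallel; it is this parallel compare that is hard to multi-port, which motivates the split organization the invention proposes.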
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a split pending buffer including a first fully-associative part and a second indexed part that is multi-ported.
The present invention leverages the fact that only a part of the pending buffer needs to be accessed associatively. In almost all systems, responses to the coherence controller always correspond to a request originating from that same coherence controller. Thus, according to the invention, by including the index of the pending buffer entry associated with a request in the associated one or more outgoing request messages, and by including the same index in incoming response messages associated with that request, the original coherence controller can use the index to locate the associated pending buffer entry without need for accessing the valid bit and address fields of the pending buffer and without need for associative lookup of the pending buffer searching for a matching address in a valid entry.
Consequently, associative lookups in the pending buffer are needed only when processing incoming requests, in order to check for a collision and, in the absence of a collision, to allocate a new pending buffer entry. According to the present invention, because incoming responses include the index of the intended pending buffer entry, no associative lookup in the pending buffer is needed for them.
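The index-passing scheme can be illustrated as follows. The message field names are assumptions made for this sketch; a real protocol would encode these fields in its own message format:

```python
# Outgoing requests carry the pending buffer index; responses echo it back,
# so the originating controller can address its pending buffer directly,
# with no associative search over the valid/address fields.

def make_request(pb_index, address, op):
    return {"type": "request", "pb_index": pb_index, "address": address, "op": op}

def make_response(request, data):
    # A response always corresponds to a request from the same coherence
    # controller, so it simply echoes the index it received.
    return {"type": "response", "pb_index": request["pb_index"], "data": data}

def handle_response(pending_buffer, response):
    # Direct indexed access: no associative lookup needed.
    return pending_buffer[response["pb_index"]]

pending_buffer = [
    {"address": 0x1000, "state": "WAIT_DATA"},
    {"address": 0x2000, "state": "IDLE"},
]
req = make_request(0, 0x1000, "read")
rsp = make_response(req, data=b"\x00" * 64)
assert handle_response(pending_buffer, rsp)["address"] == 0x1000
```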
Furthermore, since associative lookup is only needed for the address and valid bit fields of the pending buffer entries, only a part of the pending buffer needs to be implemented using fully-associative memory, while the rest can be implemented using indexed multi-ported memory. Also, only incoming requests need to perform directory lookups, while responses only use the directory state as maintained in the status fields of the corresponding entry in the pending buffer.
Allowing coherence controllers to handle multiple operations concurrently dramatically increases their throughput, and increased coherence controller throughput directly improves overall system performance in distributed shared memory servers.
Accordingly, the present invention is directed to splitting the pending buffer into two components: a fully-associative part and an indexed part that can easily be made multi-ported.
The associative part, Pending Buffer Address, hereinafter labeled PBA, contains the valid bit and address fields; the indexed part, Pending Buffer Content, hereinafter labeled PBC, includes all the other status fields (i.e., the content part of the pending buffer entries). The split multi-ported pending buffer thus enables one request and one or more responses to be handled concurrently. Handling a request requires an associative lookup of the PBA, a directory lookup, a possible read of the PBC (in case of collision), and, after processing the request in a request protocol handling unit, a possible PBA update, a possible PBC update, and a possible directory update, depending upon the cache coherence protocol implemented. Handling a response requires no PBA lookup and no directory lookup, only a PBC read, and, after processing the response in a response protocol handling unit, a possible PBA update, a possible PBC update, and a possible directory update, depending upon the cache coherence protocol implemented.
For example, to allow a pending buffer to handle one request and one response concurrently, the PBC may be implemented using a memory module with one read port and one write port. For each additional response to be handled concurrently, an extra read/write port pair is needed.
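The split organization can be sketched behaviorally as follows. The PBA is modeled as a small associative structure and the PBC as an indexed array; in hardware the PBC would be a multi-ported RAM, which a sequential software model cannot express, so this sketch only mirrors the two access paths. Names and field choices are illustrative assumptions:

```python
class SplitPendingBuffer:
    """Behavioral sketch of the split pending buffer: PBA holds valid + address
    (associative part), PBC holds the remaining status fields (indexed part)."""
    def __init__(self, e=16):
        self.pba = [{"valid": False, "address": 0} for _ in range(e)]  # associative
        self.pbc = [dict() for _ in range(e)]                          # indexed

    def handle_request(self, address, **status):
        """Request path: associative PBA lookup for collision, then allocation."""
        for i, a in enumerate(self.pba):
            if a["valid"] and a["address"] == address:
                return ("collision", i, self.pbc[i])   # PBC read on collision
        for i, a in enumerate(self.pba):
            if not a["valid"]:
                a["valid"] = True
                a["address"] = address       # PBA update
                self.pbc[i] = dict(status)   # PBC update
                return ("allocated", i, self.pbc[i])
        raise RuntimeError("pending buffer full")

    def handle_response(self, pb_index, **updates):
        """Response path: no PBA lookup, no directory lookup; the echoed index
        gives direct read/update access to the PBC."""
        self.pbc[pb_index].update(updates)
        return self.pbc[pb_index]

spb = SplitPendingBuffer(e=4)
kind, idx, _ = spb.handle_request(0x40, state="WAIT_RSP", acks_expected=2)
assert kind == "allocated"
entry = spb.handle_response(idx, acks_received=2)  # indexed, no associative search
assert entry["acks_received"] == 2
kind2, idx2, _ = spb.handle_request(0x40)          # same line still pending
assert kind2 == "collision" and idx2 == idx
```

Because `handle_response` never touches the PBA, only the indexed PBC needs the extra ports, which is exactly what makes the multi-ported implementation practical.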


REFERENCES:
patent: 5701422 (1997-12-01), Kirkland, Jr. et al.
patent: 5745732 (1998-04-01), Cherukuri et al.
patent: 5752260 (1998-05-01), Liu
patent: 5761712 (1998-06-01), Tran et al.
patent: 5809536 (1998-09-01), Young et al.
patent: 5848434 (1998-12-01), Young et al.
patent: 5860120 (1999-01-01), Young et al.
patent: 5895487 (1999-04-01), Boyd et al.
patent: 5991819 (1999-11-01), Young
patent: 6078997 (2000-06-01), Young et al.
