Method and system for using high count invalidate...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C711S144000, C711S120000

Reexamination Certificate

active

06718442

ABSTRACT:

TECHNICAL FIELD OF THE INVENTION
The present invention relates in general to multi-processor computer system operation and more particularly to a method and system for using high count invalidate acknowledgements in distributed shared memory systems.
BACKGROUND OF THE INVENTION
In a shared-memory multiprocessor computer system with per-processor caches, it is desirable for hardware to keep the caches coherent. This means that when one or more processors have a copy of a block or line of memory and one of the processors performs a write to that line, the cached copies must either be updated or invalidated.
Computer systems use memory directories that are associated with main memory. In large systems, memory, and thus the directories, may be physically distributed throughout the system. The directory associated with a region of memory keeps track of which processors, if any, have copies of memory lines in that region. When a write occurs to a line, the directory is consulted, and updates or invalidates are sent to any processors that have a copy of the line.
A large computer system is conventionally implemented with a large number of processors accessed through node controllers at node locations. The node controllers include the memory directories employing coarse directory protocols. A basic problem with memory directories is that the required storage does not scale well. Coarse directory protocols provide a technique that represents each processor in the computer system by saving space in the memory directory. Space is saved by grouping node controllers and associated processors that share information in memory. When it becomes necessary to invalidate all nodes with a shared copy of a cache line, invalidate commands are sent to all of the nodes within a group that includes the node that contains the shared copy of the memory. Typically, each node processes the invalidation command and sends an acknowledgment message back to the node that originated the invalidation command. Since full operation of the computer system does not continue until all expected acknowledgment messages are received, each node in a group must be present and operational for the computer system to work effectively. However, there may be situations where certain nodes of a group may not be present, may be in a failure state or simply may not share the line.
One way to reduce directory overhead is to limit the size of the directory entries such that an entry cannot represent any arbitrary set of processors in the system. The system is then either prohibited from allowing non-representable sets of processors to cache a line concurrently (by, say, invalidating the copies of certain processors when other processors obtain a copy of the line), or, more preferably, when a non-representable set of sharing processors occurs, the directory entry is set to represent some superset of the sharing processors. Then when the line is written, an invalidation or update message is sent to the superset of processors caching the line.
A goal of a directory structure for a large multiprocessor system is to use a modest number of bits in a directory entry, yet minimize the number of “spurious” invalidation messages that must be sent when a line is written. That is, keep the superset as close to the size of the actual set of sharers as possible.
At one end of the spectrum is a full broadcast mechanism. In this scheme, as soon as a line becomes cached by any processor (or perhaps only when it is cached by more than one processor), the state of the corresponding directory is set to indicate that a broadcast is necessary. When the line is written, invalidations are sent to all processors in the system. This mechanism minimizes the number of bits needed in the directory entry, but maximizes spurious invalidations.
At the other end of the spectrum is the full bit-vector mechanism described above, in which a directory entry includes a bit for each processor in the system. This maximizes directory storage overhead, but eliminates spurious invalidations. The storage overhead for this scheme is unacceptable for large systems.
A reasonable middle ground is a “coarse-vector” directory structure like the one used in the Origin 2000 manufactured by Silicon Graphics In. of Mountain View, Calif. The directory structure in the Origin 2000 includes a bit vector of size v in the directory entries (where v=32). Each bit represents one or more processor nodes in the system. For systems with thirty-two or fewer nodes, this size bit vector acts like a full bit vector. For larger numbers of nodes, however, the vector can become “coarse”. When the set of processor nodes sharing a line is contained within an aligned block of consecutive processor nodes, then the bit vector can still be used as a full bit vector, with another small field in the directory entry specifying the block of processor nodes the vector represents. Processor nodes will typically contain one or more processors. In the Origin 2000, each processor node includes two processors.
When the set of processor nodes expands beyond an aligned block of v processor nodes, however, the meaning of the bits in the vector is changed (this is recorded in the state information in the entry). For N-processor-node systems, each bit in the vector now represents N/v processor nodes. For example, in a 512-processor node system with a 32-bit vector, each bit in the vector represents sixteen processor nodes. For every processor node caching the line, the bit representing the set of processor nodes containing that processor node would be set. When the line is written, for each bit that is set in the coarse vector, invalidation messages are sent to the corresponding set of N/v processor nodes. In most cases, this will cause invalidations to be sent to some processor nodes that do not have a copy of the line (spurious invalidates).
Another way of compacting directory entries, in contrast to the single bit vector approach discussed above, is a multi-dimensional directory structure approach which uses two or more bit vectors to track each processor node. Such an approach is described in U.S. patent application Ser. No. 08/971,184, filed Nov. 17, 1997, now pending, entitled Multi-Dimensional Cache Coherence Directory Structure, the entire disclosure of which is incorporated herein by reference. In the multi-dimensional approach, each processor node is represented by a bit in each of the two or more bit vectors.
For example, for a system with n vectors as each processor obtains a copy of a shared line, the bits corresponding to that processor are set in each of the n vectors of the associated directory entry. If n−1 of the vectors have only a single bit set, then the pointer information remains exact. If more than one bit is set in at least two of the fields, however, then the pointer information can become inexact, or approximate. As a result, in addition to invalidation messages being sent to shared processor nodes reading the messages, spurious invalidation messages are sent to alias processor nodes (i.e., processors the system initially interprets as sharing a line of memory that in reality do not share the line and indeed may not even exist) not needing the messages. As the number of alias processor nodes increases, determining the actual processors needing the invalidation messages becomes more cumbersome.
Typical computer systems merely allowed the problem of sending invalidation messages to non-existent nodes to occur or simply wasted directory space to handle the problem. Therefore, it is desirable to provide a technique to process invalidation commands for nodes with alias processors.
One method of handling invalidation requests to processors not present or operational in a computer system is described in U.S. patent application Ser. No. 09/410,139, filed Sep. 30, 1999, now pending, entitled Method and Apparatus for Handling Invalidation Requests to Processors Not Present in a Computer System, the entire disclosure of which is incorporated by reference herein. The method disclosed in U.S. a

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and system for using high count invalidate... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method and system for using high count invalidate..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and system for using high count invalidate... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3250939

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.