System for minimizing directory information in scalable...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate


Details

C711S148000, C711S151000, C711S156000, C711S159000


active

06738868

ABSTRACT:

RELATED APPLICATIONS
This application is related to, and hereby incorporates by reference, the following U.S. patent applications:
Multiprocessor Cache Coherence System And Method in Which Processor Nodes And Input/output Nodes Are Equal Participants, Ser. No. 09/878,984, filed Jun. 11, 2001;
Scalable Multiprocessor System And Cache Coherence Method, Ser. No. 09/878,982, filed Jun. 11, 2001;
System and Method for Daisy Chaining Cache Invalidation Requests in a Shared-memory Multiprocessor System, Ser. No. 09/878,985, filed Jun. 11, 2001;
Cache Coherence Protocol Engine And Method For Processing Memory Transaction in Distinct Address Subsets During Interleaved Time Periods in a Multiprocessor System, Ser. No. 09/878,983, filed Jun. 11, 2001;
System And Method For Generating Cache Coherence Directory Entries And Error Correction Codes in a Multiprocessor System, Ser. No. 09/972,477, filed Oct. 5, 2001, which claims priority on U.S. provisional patent application 60/238,330, filed Oct. 5, 2000, which is also hereby incorporated by reference in its entirety.
FIELD OF INVENTION
The present invention relates generally to the design of cache coherence protocol directories, and particularly to the minimization of directory information required in the context of logically independent input/output nodes.
BACKGROUND OF THE INVENTION
When multiple processors with separate caches share a common memory, it is necessary to keep the caches coherent by ensuring that cached copies of shared memory lines of information are invalidated when changed by another processor. This is done in either of two ways: through a directory-based system or a snooping system. In a directory-based system, sharing information is placed in a directory that maintains the coherence between caches. The directory acts as a filter through which a processor must ask permission to load an entry from a primary memory into a cache. In a snooping (i.e., snoop-based) system, each cache monitors (i.e., snoops) a bus for requests for memory lines of information broadcast on the bus, and responds if able to satisfy the request.
Additionally, the common bus-based design for most small-scale multiprocessor systems is not used for larger-scale multiprocessors because current buses do not accommodate the bandwidth requirements of the high performance processors typically included in larger-scale multiprocessor systems. Large-scale multiprocessor systems, therefore, use a more scalable interconnect that provides point-to-point connections between processors.
However, the more scalable interconnect does not include broadcast capabilities. The large-scale multiprocessors cannot, therefore, use a snoop based cache-coherence protocol.
Instead, large-scale multiprocessors typically use a directory-based cache coherence protocol. As indicated above, a directory is a cache-coherence protocol data structure that maintains information about which processors are caching one or more memory lines of information in the system. This information is used by the cache-coherence protocol to invalidate cached copies of a memory line of information when the contents of the memory line of information are modified (i.e., subject to a request for exclusive ownership). A common directory implementation is to use a full bit vector, wherein each bit indicates whether a corresponding processor is caching a copy of an associated memory line of information.
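The full-bit-vector format described above can be sketched as follows. This is an illustrative model only, not code from the patent; the class and method names are hypothetical:

```python
# Sketch of a full-bit-vector directory entry: one presence bit per node.
# On an exclusive (write) request, every node with its bit set must be
# sent an invalidation.

class FullVectorDirectoryEntry:
    """Tracks exactly which nodes cache a given memory line."""

    def __init__(self, num_nodes):
        self.presence = [False] * num_nodes  # one bit per node

    def record_share(self, node_id):
        self.presence[node_id] = True

    def sharers(self):
        return [n for n, bit in enumerate(self.presence) if bit]

    def invalidate_all(self):
        # Exclusive request: invalidate every sharer, then clear the entry.
        targets = self.sharers()
        self.presence = [False] * len(self.presence)
        return targets

entry = FullVectorDirectoryEntry(num_nodes=8)
entry.record_share(2)
entry.record_share(5)
print(entry.invalidate_all())  # [2, 5]
```

Because each bit maps to exactly one node, invalidations go only to actual sharers; the cost is one directory bit per node per memory line.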
Furthermore, large-scale multiprocessor systems typically include input/output (I/O) devices that are connected to one or more processor nodes, which manage any connected I/O devices and process requests from other processor nodes directed to any connected I/O devices.
There are two alternatives with respect to how data maintained by an I/O device is accessed by other processor nodes. In some large-scale multiprocessor systems, no distinction is made between a processor included in a processor node and an I/O device connected to the processor node. In these systems, the processor node determines whether a particular request is routed to an included processor or a connected I/O device.
In other large-scale multiprocessor systems, requests indicate whether the request is directed to a processor or an I/O device. In these systems, a directory must include information that distinguishes between processors and I/O devices.
In still other large-scale multiprocessor systems, I/O devices are connected “directly” to the network that interconnects the processor nodes of the multiprocessor system (“interconnection network”) through I/O nodes. The I/O devices connected to the I/O nodes are, therefore, accessed efficiently by all processor nodes. More specifically, the ability to access an I/O device is not limited by the ability of a processor node to process requests directed to a connected I/O device and requests directed to an included processor. These I/O nodes typically include caches to reduce the need to transfer data to and from other processor and I/O nodes and, therefore, participate in the cache-coherence protocol.
In balanced, large-scale multiprocessor systems, the number of I/O nodes is equal to, or nearly equal to, the number of processor nodes. Requiring directories to include information that distinguishes between I/O and processor nodes therefore entails a potentially large increase in the size of the directories. This is particularly true for full bit vectors, in which each bit is associated with exactly one node. In such systems, the directories include perfect sharing information (i.e., each node sharing a memory line of information is identifiable). For example, if the number of I/O nodes equals the number of processor nodes and an extra bit is required for each of the I/O nodes, the size of the directory roughly doubles.
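The doubling arithmetic can be made concrete with a small, hypothetical sizing calculation (the node and line counts below are assumptions for illustration, not figures from the patent):

```python
# Hypothetical sizing of a full-bit-vector directory before and after
# adding one presence bit per I/O node.
processor_nodes = 64
io_nodes = 64            # balanced system: as many I/O nodes as processor nodes
memory_lines = 1 << 20   # one directory entry per memory line (assumed)

bits_without_io = processor_nodes * memory_lines
bits_with_io = (processor_nodes + io_nodes) * memory_lines
print(bits_with_io / bits_without_io)  # 2.0
```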
But the addition of I/O nodes is also an issue for systems that support coarse-vector directory formats. In such systems, the issue is not additional directory bits, but rather the coarseness of the directory entries. As described more fully below, a single bit in a directory using the coarse-vector format may be associated with more than one node. Increasing the number of nodes but not the number of bits increases the number of nodes associated with each such bit. As a result, a greater number of invalidation acknowledgments is required when an exclusive request is received, even if only one of the nodes associated with a given bit actually shares the corresponding memory line of information.
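The coarse-vector effect described above can be sketched as follows. This is an assumed, simplified model (fixed-size node groups; hypothetical class and method names), not the patent's own scheme:

```python
# Sketch of a coarse-vector directory entry: each presence bit covers a
# fixed-size group of nodes, so an invalidation for a set bit must be
# sent to every node in that group, sharer or not.

class CoarseVectorDirectoryEntry:
    def __init__(self, num_nodes, num_bits):
        assert num_nodes % num_bits == 0
        self.group_size = num_nodes // num_bits
        self.bits = [False] * num_bits

    def record_share(self, node_id):
        self.bits[node_id // self.group_size] = True

    def invalidation_targets(self):
        # All nodes in every marked group, not just actual sharers.
        targets = []
        for b, is_set in enumerate(self.bits):
            if is_set:
                start = b * self.group_size
                targets.extend(range(start, start + self.group_size))
        return targets

# 16 nodes tracked with only 4 bits: one sharer drags in 3 neighbors.
entry = CoarseVectorDirectoryEntry(num_nodes=16, num_bits=4)
entry.record_share(5)                # only node 5 caches the line
print(entry.invalidation_targets())  # [4, 5, 6, 7]
```

Adding I/O nodes without adding bits grows `group_size`, so each exclusive request triggers invalidations (and acknowledgments) from ever more non-sharing nodes.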
Thus, connecting I/O devices “directly” to the interconnection network of a large-scale multiprocessor system through I/O nodes presents problems for directory structures regardless of the particular directory format used.
Another important observation is the distinction between the way in which a processor node and an I/O node access memory lines of information. I/O nodes (i.e., I/O devices) do not typically access the same data over and over, as is the case with processor nodes. Instead, I/O nodes tend to access data sequentially and use caches to exploit the spatial locality in their accesses. In other words, caches improve the performance of I/O nodes by ensuring that there is only one miss per memory line of information as the I/O nodes sequentially access data. Once an I/O node has accessed all the data in a particular memory line of information, the I/O node will typically not access the same memory line of information in the near term. The present invention exploits this aspect of I/O nodes to conserve resources allocated to manage the sharing of memory lines of information by I/O nodes without substantially impacting the performance of the I/O nodes.
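The "one miss per memory line" behavior of a sequential I/O scan can be illustrated with a toy miss counter (the 64-byte line size and function name are assumptions for illustration):

```python
# Toy model of why a cache gives a sequentially-scanning I/O node
# exactly one miss per memory line: the first byte of each line misses,
# and every later access to that line hits.

LINE_SIZE = 64  # bytes per memory line (assumed)

def count_misses(addresses):
    cached_lines = set()
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)
    return misses

# Sequential byte-by-byte scan of 4 lines: 4 misses, not 256.
print(count_misses(range(4 * LINE_SIZE)))  # 4
```

Once a line has been fully consumed it is not revisited, which is the access pattern the invention exploits to keep I/O sharing state small.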
SUMMARY OF THE INVENTION
A system of scalable shared-memory multiprocessors includes processor nodes and I/O nodes. The I/O nodes connect I/O devices directly to an interconnection network of a system of scalable shared-memory multiprocessors. Each node of the system includes an interface to a local memory subsystem, a memory cache and a protocol engine. The local memory subsystem stores memory lines of
