1999-05-28
2002-08-13
Peikari, B. James (Department: 2186)
Electrical computers and digital data processing systems: input/output
Input/output data processing
Input/output data buffering
710/52, 710/39, 711/3, 711/150
Reexamination Certificate
active
06434641
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to a system for managing processor requests made to a shared main memory system that utilizes a directory-based cache coherency scheme; and, more specifically, to a system that uses information associated with previously deferred memory requests to determine that certain subsequently received memory requests should also be temporarily deferred, so that redundant memory coherency actions are not initiated unnecessarily and memory operation is optimized.
2. Description of the Prior Art
Data processing systems are becoming increasingly complex. Some systems, such as Symmetric Multi-Processor (SMP) computer systems, couple two or more Instruction Processors (IPs) and multiple Input/Output (I/O) Modules to shared memory systems. This allows the multiple IPs to operate simultaneously on the same task, and also allows multiple tasks to be performed at the same time to increase system throughput.
As the number of units coupled to a shared memory increases, more demands are placed on the memory and memory latency increases. To address this problem, high-speed cache memory systems are often coupled to one or more of the IPs for storing data signals that are copied from main memory. These cache memories are generally capable of processing requests faster than the main memory while also serving to reduce the number of requests that the main memory must handle. This increases system throughput.
While the use of cache memories increases system throughput, it introduces other design challenges. When multiple cache memories are coupled to a single main memory for the purpose of temporarily storing data signals, some mechanism must be employed to ensure that all IPs are working from the same (most recent) copy of the data. For example, if a copy of a data item is stored, and subsequently modified, in a cache memory, another IP requesting access to the same data item must be prevented from using the older copy of the data item stored either in main memory or in the requesting IP's cache. This is referred to as maintaining cache coherency. Maintaining cache coherency becomes more difficult as more caches are added to the system and more copies of a single data item must be managed.
Many methods exist to maintain cache coherency. Some earlier systems achieved coherency by implementing memory locks. That is, if an updated copy of data existed within a local cache, other processors were prohibited from obtaining a copy of the data from main memory until the updated copy was returned to main memory, thereby releasing the lock. For complex systems, the additional hardware and/or operating time required for setting and releasing the locks within main memory cannot be justified. Furthermore, reliance on such locks directly inhibits certain types of applications such as parallel processing.
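The following is a minimal C sketch, with hypothetical names throughout, of the lock-based scheme described above: while one cache holds the only updated copy of a line, that line remains locked in main memory, and other processors' requests are refused until the copy is written back.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct line_lock {
    uint64_t addr;
    bool     locked;   /* true while a cache holds the only updated copy */
    int      owner;    /* processor that must return the data */
};

static bool try_read(const struct line_lock *l, int cpu) {
    /* A read is granted only if the line is unlocked or the requester
     * is the owner; everyone else must retry, which is the serializing
     * cost that makes this scheme hostile to parallel processing. */
    return !l->locked || l->owner == cpu;
}

static void write_back(struct line_lock *l) {
    l->locked = false;  /* updated copy is back in main memory */
}

int main(void) {
    struct line_lock l = { 0x2000, true, 1 };   /* cpu1 holds the line */
    printf("cpu2 read: %s\n", try_read(&l, 2) ? "granted" : "blocked");
    write_back(&l);                             /* cpu1 returns the copy */
    printf("cpu2 read: %s\n", try_read(&l, 2) ? "granted" : "blocked");
    return 0;
}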
Another method of maintaining cache coherency is shown in U.S. Pat. No. 4,843,542 issued to Dashiell et al., and in U.S. Pat. No. 4,755,930 issued to Wilson, Jr. et al. These patents each discuss a system wherein a processor having a local cache is coupled to a shared memory through a common memory bus. Each processor is responsible for monitoring, or “snooping”, the common bus to maintain coherency of its own cache data. These snooping protocols increase processor overhead, and are unworkable in hierarchical memory configurations that do not have a common bus structure.
A similar snooping protocol is shown in U.S. Pat. No. 5,025,365 to Mathur et al., which teaches local caches that monitor a system bus for the occurrence of memory accesses which would invalidate a local copy of data. The Mathur snooping protocol removes some of the overhead associated with snooping by invalidating data within the local caches at times when data accesses are not occurring; however, the Mathur system is still unworkable in memory systems without a common bus structure.
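As an illustration only, the following C sketch (all names hypothetical, not taken from either patent) shows the general shape of such a snooping protocol: every cache watches write traffic on the shared bus and invalidates its own copy of any line another processor modifies.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINES 256

struct snooping_cache {
    uint64_t tags[CACHE_LINES];   /* direct-mapped: full line address per slot */
    bool     valid[CACHE_LINES];
};

/* Called for every write observed on the common bus; this constant
 * monitoring is the processor overhead the text refers to, and it only
 * works at all because every cache can see the single shared bus. */
static void snoop_bus_write(struct snooping_cache *c, uint64_t addr) {
    unsigned idx = (unsigned)(addr % CACHE_LINES);
    if (c->valid[idx] && c->tags[idx] == addr)
        c->valid[idx] = false;    /* another processor modified our line */
}

int main(void) {
    static struct snooping_cache c;             /* zero-initialized */
    c.tags[0x40] = 0x40; c.valid[0x40] = true;  /* cache holds line 0x40 */
    snoop_bus_write(&c, 0x40);                  /* remote write observed */
    printf("line 0x40 valid after snoop: %d\n", c.valid[0x40]);
    return 0;
}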
Another method of maintaining cache coherency is shown in U.S. Pat. No. 5,423,016 to Tsuchiya. The method described in this patent involves providing a memory structure called a “duplicate tag” with each cache memory. The duplicate tags record which data items are stored within the associated cache. When a data item is modified by a processor, an invalidation request is routed to all of the other duplicate tags in the system. The duplicate tags are searched for the address of the referenced data item. If found, the data item is marked as invalid in the other caches. Such an approach is impractical for distributed systems having many caches interconnected in a hierarchical fashion because the time required to route the invalidation requests poses an undue overhead.
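The following is a loose C sketch of the duplicate-tag method as described above (the structure and names are illustrative, not taken from the Tsuchiya patent): an invalidation request is routed to, and searched in, every other duplicate tag in the system.

#include <stdint.h>
#include <stdio.h>

#define NUM_CACHES     4
#define TAGS_PER_CACHE 64

/* duplicate_tags[c][i] mirrors the tag array of cache c; 0 marks an
 * empty slot in this sketch. */
static uint64_t duplicate_tags[NUM_CACHES][TAGS_PER_CACHE];

/* On a write, the invalidation request must be routed to and searched
 * in every other duplicate tag -- the routing cost the text says
 * becomes prohibitive in large hierarchical systems. */
static void route_invalidation(uint64_t addr, int writer) {
    for (int c = 0; c < NUM_CACHES; c++) {
        if (c == writer) continue;
        for (int i = 0; i < TAGS_PER_CACHE; i++)
            if (duplicate_tags[c][i] == addr)
                duplicate_tags[c][i] = 0;   /* mark stale copy invalid */
    }
}

int main(void) {
    duplicate_tags[2][5] = 0x3000;   /* cache 2 holds line 0x3000 */
    route_invalidation(0x3000, 0);   /* cache 0 modifies the line */
    printf("cache 2 slot 5 after invalidation: 0x%llx\n",
           (unsigned long long)duplicate_tags[2][5]);
    return 0;
}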
For distributed systems having hierarchical memory structures, a directory-based coherency system becomes more practical. Directory-based coherency systems utilize a centralized directory to record the location and the status of data as it exists throughout the system. For example, the directory records which caches have a copy of the data, and further records if any of the caches are allowed to have an updated copy of the data. When a cache makes a request to main memory for a data item, the central directory is consulted to determine where the most recent copy of that data item resides. Based on this information, the most recent copy of the data is retrieved so it may be provided to the requesting cache. The central directory is then updated to reflect the new status for that unit of memory. A novel directory-based cache coherency system for use with multiple Instruction Processors coupled to a hierarchical cache structure is described in the co-pending application entitled “Directory-Based Cache Coherency System Supporting Multiple Instruction Processor and Input/Output Caches”, Ser. No. 09/001,598 filed Dec. 31, 1997, which is incorporated herein by reference in its entirety.
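A minimal C sketch of a directory entry of the kind described above (the fields and names are assumptions for illustration, not the co-pending application's actual format): for each unit of memory, the central directory records which caches hold a copy and whether one of them may hold an updated copy.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct directory_entry {
    uint32_t sharers;    /* bit i set => cache i holds a copy */
    bool     exclusive;  /* one cache may hold a modified copy */
    int      owner;      /* valid when exclusive: cache with latest data */
};

/* The directory tells main memory where the most recent copy lives;
 * -1 means main memory itself is current and can answer directly. */
static int locate_latest_copy(const struct directory_entry *e) {
    return e->exclusive ? e->owner : -1;
}

int main(void) {
    struct directory_entry e = { 1u << 3, true, 3 };  /* cache 3 owns line */
    int src = locate_latest_copy(&e);
    if (src >= 0)
        printf("retrieve line from cache %d before answering\n", src);
    else
        printf("main memory copy is current\n");
    return 0;
}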
As stated above, a main memory employing a directory-based coherency system is a practical way to maintain coherency within a hierarchical memory that includes multiple levels of cache. Moreover, this type of coherency system may be readily expanded to maintain coherency among a large number of cache memories. One problem with this type of coherency scheme, however, is that as the number of cache memories within the system increases, a larger percentage of the main memory bandwidth is consumed in the handling and management of various memory coherency actions. For example, a first processor may have the latest cached copy of a data item requested by a second processor. The main memory must initiate an operation to retrieve the data copy from the first processor before the request may be processed. In the meantime, a third processor may request the same data item from main memory, causing the main memory to again initiate an operation to attempt to retrieve the most recent data copy.
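The following C sketch illustrates, under loose assumptions and with purely illustrative names, both the scenario just described and the deferral idea summarized in the Field of the Invention: if the memory controller remembers which lines already have a retrieval in flight, the third processor's request is simply held behind the first rather than triggering a second, redundant retrieval.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PENDING 16

static uint64_t pending_lines[MAX_PENDING];
static int      pending_count = 0;

/* True if a coherency retrieval for this line is already in flight. */
static bool retrieval_in_flight(uint64_t line) {
    for (int i = 0; i < pending_count; i++)
        if (pending_lines[i] == line)
            return true;
    return false;
}

static void request_line(int cpu, uint64_t line) {
    if (retrieval_in_flight(line)) {
        /* Defer behind the earlier request; no redundant action. */
        printf("cpu%d -> line 0x%llx: deferred, retrieval already in flight\n",
               cpu, (unsigned long long)line);
    } else if (pending_count < MAX_PENDING) {
        pending_lines[pending_count++] = line;
        printf("cpu%d -> line 0x%llx: retrieval initiated from owning cache\n",
               cpu, (unsigned long long)line);
    }
}

int main(void) {
    request_line(2, 0x4000);  /* second processor: retrieval from cpu1 starts */
    request_line(3, 0x4000);  /* third processor: deferred, no second retrieval */
    return 0;
}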
Not only does the initiation of coherency actions consume memory cycles, it also requires the use of other system resources. The scheduling of requests for causing a processor to return data to the main memory requires the use of various queue structures within the memory control system. These requests must be processed by the memory controllers, and ultimately transferred across memory bus resources to the various cache memories. The cache memories process the requests and schedule the return of the requested data to memory. This return operation again requires the use of memory bus resources.
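The sketch below (a hypothetical structure, not the patent's) gives the flavor of the queueing the preceding paragraph describes: requests asking a cache to return data are staged in a queue, and each entry ultimately consumes controller time and bus bandwidth both to deliver the request and to carry the returned data.

#include <stdint.h>
#include <stdio.h>

#define RETURN_Q_DEPTH 32

struct return_request {
    uint64_t addr;      /* line the cache must write back */
    int      cache_id;  /* cache holding the latest copy */
};

struct return_queue {
    struct return_request slots[RETURN_Q_DEPTH];
    int head, tail, count;   /* dequeue side omitted in this sketch */
};

/* Each queued entry eventually costs controller time and bus bandwidth
 * twice: once to deliver the return request, once for the data itself. */
static int enqueue_return(struct return_queue *q, uint64_t addr, int cache) {
    if (q->count == RETURN_Q_DEPTH)
        return -1;   /* queue full: further coherency actions must wait */
    q->slots[q->tail] = (struct return_request){ addr, cache };
    q->tail = (q->tail + 1) % RETURN_Q_DEPTH;
    q->count++;
    return 0;
}

int main(void) {
    static struct return_queue q;    /* zero-initialized */
    enqueue_return(&q, 0x5000, 1);   /* ask cache 1 to return line 0x5000 */
    printf("pending return requests: %d\n", q.count);
    return 0;
}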
As can be readily appreciated from the foregoing discussion, in a hierarchical memory employing a directory-based cache coherency structure, the occurrence of coherency operations decreases the rate at which the memory can process requests. The problem is compounded when multiple processors are grouped together to work on a single task that requires the sharing of data. If multiple processors each request the use of the same data item within a short period of time, coherency actions are initiated that may significantly impact memory throughput.
The p
Bauman Mitchell A.
Haupt Michael L.
Atlass Michael B.
Johnson Charles A.
McMahon Beth L.
Unisys Corporation