System and method for providing the speculative return of...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S143000, C711S145000

Reexamination Certificate

active

06457101

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to an improved hierarchical memory system shared between multiple processors; and more particularly, relates to a memory system that performs transfers of cached data between hierarchical levels of the memory in anticipation of receiving requests to retrieve the data, the transfers being performed so that the data is more readily available to the requester when the anticipated request is received.
2. Description of the Prior Art
Data processing systems are becoming increasingly complex. Some systems, such as Symmetric Multi-Processor (SMP) computer systems, couple two or more Instruction Processors (IPs) and multiple Input/Output (I/O) Modules to shared memory. This allows the multiple IPs to operate simultaneously on the same task, and also allows multiple tasks to be performed at the same time to increase system throughput. As the number of units coupled to a shared memory increases, more demands are placed on the memory and memory latency increases. To address this problem, high-speed local memory systems, including caches and high-speed I/O buffer memories, are often coupled to one or more of the IPs for storing data signals that are copied from main memory. These memories are generally capable of processing requests faster than the main memory while also serving to reduce the number of requests that the main memory must handle. This increases system throughput.
While the use of local memories increases system throughput, it causes other design challenges. When multiple local memories are coupled to a single main memory for the purpose of temporarily storing data signals, some system must be utilized to ensure that all IPs and I/O Modules are working from the same (most recent) copy of the data. For example, if a copy of a data item is stored, and subsequently modified, in a cache memory, another IP requesting access to the same data item must be prevented from using the older copy of the data item stored either in main memory or the requesting IP's cache. This is referred to as maintaining cache coherency. Maintaining cache coherency becomes more difficult as more caches are added to the system since more copies of a single data item may have to be tracked.
Many methods exist to maintain cache coherency. Some earlier systems achieve coherency by implementing memory locks. That is, if an updated copy of data exists within a local cache or buffer memory, other processors are prohibited from obtaining a copy of the data from main memory until the updated copy is returned to main memory, thereby releasing the lock. For complex systems, the additional hardware and/or operating time required for setting and releasing the locks within main memory cannot be justified. Furthermore, reliance on such locks directly prohibits certain types of applications such as parallel processing.
Another method of maintaining cache coherency is shown in U.S. Pat. No. 4,843,542 issued to Dashiell et al., and in U.S. Pat. No. 4,755,930 issued to Wilson, Jr. et al. These patents discuss a system wherein each processor has a local cache coupled to a shared memory through a common memory bus. Each processor is responsible for monitoring, or “snooping”, the common bus to maintain currency of its own cache data. These snooping protocols increase processor overhead, and are unworkable in hierarchical memory configurations that do not have a common bus structure. A similar snooping protocol is shown in U.S. Pat. No. 5,025,365 to Mathur et al., which teaches a snooping protocol that seeks to minimize snooping overhead by invalidating data within the local caches at times when other types of cache operations are not occurring. However, the Mathur system cannot be implemented in memory systems that do not have a common bus structure.
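The bus-snooping approach described above may be illustrated by the following minimal sketch. It is not taken from any of the cited patents. The class names `CommonBus` and `SnoopingCache`, the write-through policy, and the invalidate-on-write behavior are all assumptions chosen only to show why every cache must observe every write on the common bus.

```python
class CommonBus:
    """Hypothetical shared bus that every local cache is attached to."""

    def __init__(self):
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, writer, address):
        # Every other cache must examine the write: this is the snooping overhead.
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(address)


class SnoopingCache:
    """Hypothetical per-processor cache that monitors the common bus."""

    def __init__(self, name, bus):
        self.name = name
        self.lines = {}  # address -> locally cached value
        self.bus = bus
        bus.attach(self)

    def read(self, address, memory):
        # On a miss, copy the data from main memory into the local cache.
        if address not in self.lines:
            self.lines[address] = memory[address]
        return self.lines[address]

    def write(self, address, value, memory):
        self.lines[address] = value
        memory[address] = value  # write-through, assumed for simplicity
        self.bus.broadcast_write(self, address)

    def snoop(self, address):
        # Drop any stale local copy of the written address.
        self.lines.pop(address, None)
```

In this sketch the cost of a write grows with the number of attached caches, and the scheme depends on a single bus that all caches can observe, which is the limitation noted above for hierarchical configurations.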
Another method of maintaining cache coherency is shown in U.S. Pat. No. 5,423,016 to Tsuchiya assigned to the assignee of the current invention. The method described in this patent involves providing a memory structure called a “duplicate tag” that is associated with each cache memory. Each duplicate tag records which data items are stored within the associated cache. When a data item is modified by a processor, an invalidation request is routed to all of the other duplicate tags in the system. The duplicate tags are searched for the address of the referenced data item. If found, the data item is marked as invalid in the other caches. Such an approach is impractical for distributed systems having many caches interconnected in a hierarchical fashion because the time required to route the invalidation requests poses an undue overhead.
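The duplicate-tag scheme described above may be sketched as follows. This is an illustrative sketch only and is not the implementation of the referenced patent. The names `TaggedCache` and `modify`, and the use of a set to model the duplicate tag, are assumptions.

```python
class TaggedCache:
    """Hypothetical cache paired with a duplicate tag recording resident addresses."""

    def __init__(self):
        self.data = {}              # address -> cached value
        self.duplicate_tag = set()  # addresses recorded as stored in this cache

    def fill(self, address, value):
        self.data[address] = value
        self.duplicate_tag.add(address)

    def invalidate(self, address):
        # Search the duplicate tag; if the address is found, mark the item invalid
        # (modeled here by removing it). Returns True if a copy was invalidated.
        if address in self.duplicate_tag:
            self.duplicate_tag.discard(address)
            self.data.pop(address, None)
            return True
        return False


def modify(writer, all_caches, address, value):
    """Store a modified item in `writer` and route an invalidation request
    to the duplicate tag of every other cache. Returns the number of
    caches in which a copy was invalidated."""
    writer.fill(address, value)
    invalidated = 0
    for cache in all_caches:
        if cache is not writer:
            invalidated += int(cache.invalidate(address))
    return invalidated
```

Note that `modify` visits every other cache on every modification whether or not that cache holds the item, which reflects the routing overhead identified above for systems having many caches.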
For distributed systems having hierarchical memory structures, a directory-based coherency system becomes more practical. Directory-based coherency systems utilize a centralized directory to record the location and the status of data as it exists throughout the system. For example, the directory records which caches have a copy of the data, and further records if any of the caches have an updated copy of the data. When a cache makes a request to main memory for a data item, the central directory is consulted to determine where the most recent copy of that data item resides. Based on this information, the most recent copy of the data is retrieved so that it may be provided to the requesting cache. The central directory is then updated to reflect the new status for that unit of memory. A novel directory-based cache coherency system for use with multiple Instruction Processors coupled to a hierarchical cache structure is described in the co-pending application entitled “Directory-Based Cache Coherency System Supporting Multiple Instruction Processor and Input/Output Caches” referenced above and which is incorporated herein by reference in its entirety.
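The consult-then-update behavior of a central directory described above may be sketched as follows. This is a simplified illustration and not the directory format of the co-pending application. The `Directory` class, its method names, the entry layout, and the assumption that an updated copy is written back whenever another cache reads it are all hypothetical.

```python
MAIN_MEMORY = "main memory"


class Directory:
    """Hypothetical central directory: for each address, which caches hold a
    copy and which cache, if any, holds an updated copy."""

    def __init__(self):
        self.entries = {}  # address -> {"holders": set, "owner": cache id or None}

    def _entry(self, address):
        return self.entries.setdefault(address, {"holders": set(), "owner": None})

    def locate_latest(self, address):
        # The most recent copy resides with the owner if one exists,
        # otherwise in main memory.
        owner = self._entry(address)["owner"]
        return owner if owner is not None else MAIN_MEMORY

    def record_read(self, address, requester):
        source = self.locate_latest(address)
        entry = self._entry(address)
        entry["owner"] = None  # assumption: the updated copy is written back on a read
        entry["holders"].add(requester)
        return source

    def record_write(self, address, requester):
        source = self.locate_latest(address)
        entry = self._entry(address)
        entry["holders"] = {requester}  # all other copies are treated as invalid
        entry["owner"] = requester
        return source
```

Each request consults the directory to find the source of the most recent copy and then updates the entry, so no broadcast to the other caches is required.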
The use of the aforementioned directory-based cache coherency system provides an efficient mechanism for sharing data between multiple processors that are coupled to a distributed, hierarchical memory structure. Using such a system, the memory structure may be incrementally expanded to include multiple levels of cache memory while still maintaining the coherency of the shared data. As the number of levels of hierarchy in the memory system is increased, however, some efficiency is lost when data requested by one cache memory in the system must be retrieved from another cache.
As an example of performance degradation associated with memory requests in a hierarchical cache memory system, consider a system having a main memory coupled to three hierarchical levels of cache memory. In the exemplary system, multiple third-level caches are coupled to the main memory, multiple second-level caches are coupled to each third-level cache, and at least one first-level cache is coupled to each second-level cache. This exemplary system includes a non-inclusive caching scheme. This means that data stored in a first-level cache is not necessarily stored in the interconnected second-level cache, and data stored in a second-level cache is not necessarily stored in the coupled third-level cache.
Within the above-described system, one or more processors are respectively coupled to make memory requests to an associated first-level cache. Requests for data items not resident in the first-level cache are forwarded on to the intercoupled second-level cache and, in some cases, to the third-level cache. If neither the intercoupled second-level cache nor the third-level cache stores the requested data, the request is forwarded to main memory.
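The forwarding of a request through the exemplary three-level hierarchy may be sketched as follows. This is a simplified illustration that models each cache as a dictionary and always consults the levels in order. The function name `lookup` and the returned level labels are assumptions.

```python
def lookup(address, first_level, second_level, third_level, main_memory):
    """Return (value, level_name) for the first level that holds `address`.

    Each cache is modeled as a dict. Because the scheme is non-inclusive,
    a hit at one level implies nothing about the contents of the levels below it.
    """
    levels = [
        ("first-level cache", first_level),
        ("second-level cache", second_level),
        ("third-level cache", third_level),
    ]
    for name, cache in levels:
        if address in cache:
            return cache[address], name
    # Neither the second-level nor the third-level cache stores the data.
    return main_memory[address], "main memory"
```

For example, with `first_level = {1: "a"}`, `second_level = {}`, `third_level = {2: "b"}` and `main_memory = {1: "a", 2: "b", 3: "c"}`, a request for address 3 misses at every cache level and is satisfied from main memory.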
Assume that in the current example, a processor makes a request to the intercoupled first-level cache for a read-only copy of specified data. Assume further that the requested data is not stored in this first-level cache. However, another first-level cache within the system stores a read-only copy of the data. Since the copy of the data is read-only, the request can be completed without involving the other first-level cache. That is, the request may be processed by one of the interconnected second or third-level
