System and method for performing a speculative cache fill

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S124000, C711S119000, C711S204000, C711S213000

active

06775749

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to performing a speculative cache fill in a computer system.
2. Description of the Related Art
Since main system memory is typically designed for density rather than speed, microprocessor designers have added caches to their designs to reduce the microprocessor's need to directly access main memory. A cache is a small memory that is more quickly accessible than the main memory. A processor may have a number of different levels of caches. For example, a processor may have a “level one” (L1) cache and a “level two” (L2) cache. These caches tend to be integrated on the same substrate as the microprocessor. Caches are typically constructed of fast memory cells such as static random access memories (SRAMs), which have faster access times and higher bandwidth than the memories used for the main system memory (typically dynamic random access memories (DRAMs) or synchronous dynamic random access memories (SDRAMs)). The faster SRAMs are not typically used for main system memory because of their low density and high cost.
Many other types of caches may also be present in computer systems. For example, the main system memory may act as a cache for the system's slower direct access storage devices (e.g., hard disk drives). Other devices, such as hard drives, may also include internal caches. For example, hard drives may cache recently accessed or written data in order to improve their read performance. Generally, having a cache allows a device to retrieve data from the cache more quickly than if the device had to access a larger, slower memory to retrieve the data.
When a microprocessor needs data from memory, it typically first checks its L1 cache to see if the required data has been cached. If the data is not present in the L1 cache, the L2 cache is checked (if the processor has an L2 cache). If the L2 cache is storing the data, it provides the data to the microprocessor (typically at a much higher rate than the main system memory is capable of). If the data is not cached in the L1 or L2 caches (referred to as a “cache miss”), the data is read from main system memory or some type of mass storage device (e.g., a hard disk drive). Relative to accessing the data from the L1 cache, accesses to memory take many more clock cycles. Similarly, if the data is not in the main system memory, accessing the data from a mass storage device takes even more cycles.
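The lookup order described above can be sketched in a few lines of Python. This is only an illustration: the function name `read`, the dictionary-based caches, and the cycle counts are hypothetical values chosen for the example and do not come from the patent.

```python
# Hypothetical latencies in clock cycles; illustrative only, not from the source.
L1_CYCLES, L2_CYCLES, MEMORY_CYCLES = 3, 12, 200

def read(address, l1, l2, memory):
    """Return (data, cycles) for a read that checks L1, then L2, then memory."""
    cycles = L1_CYCLES
    if address in l1:
        return l1[address], cycles
    cycles += L2_CYCLES
    if address in l2:
        l1[address] = l2[address]  # fill L1 on an L2 hit
        return l1[address], cycles
    cycles += MEMORY_CYCLES  # cache miss: fall back to main memory
    data = memory[address]
    l2[address] = data  # fill both cache levels for later reads
    l1[address] = data
    return data, cycles
```

With these assumed numbers, the first read of an address pays for all three levels, while a repeated read of the same address is served from the L1 cache alone.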
One problem that arises due to caching is that, depending on the way in which updated data in the cache is presented to the memory, a copy of a particular line of data in a cache may not be the same as the copy of that line that is currently in system memory. For example, many caches use a write-back policy to update the copy of data in system memory. Write-back systems increase write efficiency because an updated copy of the cache line is not written back to system memory until the line is evicted from the cache. However, from the time the line is updated in the cache until the time the line is written back to system memory, the cache copy may differ from the memory copy (i.e., the memory has a “stale” copy of that data). As a result, accesses to system memory may be controlled so that other devices in the computer system do not access the stale copy of the data in the system memory. Generally, this problem is one of cache coherence, or ensuring that each device in the computer system accesses the correct (i.e., most recently updated) copy of a particular item of data, regardless of which device is requesting the data or where the data is actually stored. In single processor systems, maintaining cache coherency usually involves restricting I/O devices' access to system memory and/or restricting which portions of system memory may be cached.
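A minimal sketch of the write-back behavior described above, assuming a cache modeled as a Python dictionary. The class `WriteBackCache` and its methods are hypothetical names introduced for illustration only.

```python
class WriteBackCache:
    """Toy write-back cache: memory is updated only when a dirty line is evicted."""

    def __init__(self, memory):
        self.memory = memory  # shared backing store (address -> data)
        self.lines = {}       # cached copies (address -> data)
        self.dirty = set()    # addresses modified since they were cached

    def write(self, address, data):
        # Update only the cached copy; memory now holds a stale copy.
        self.lines[address] = data
        self.dirty.add(address)

    def evict(self, address):
        # Write the line back to memory only if it was modified.
        data = self.lines.pop(address)
        if address in self.dirty:
            self.memory[address] = data
            self.dirty.discard(address)
```

Between `write` and `evict`, a device that reads `memory` directly sees the old value, which is the stale-copy window the passage describes.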
In multiprocessor systems, maintaining cache coherency may be a significant problem because the different processors may frequently attempt to access the same data. Additionally, it is desirable for all of the processors to be able to cache the data they operate on. Thus, each processor may have its own L1 and/or L2 cache, but the system memory may be shared between multiple processors. In such a system, one processor may update a copy of a particular memory location in its cache. If a write-back cache policy is being used, the system memory's copy of the modified data may no longer be consistent with the updated copy in the first processor's cache. If a second processor reads the unmodified data from the system memory, unaware of the first processor's updated copy, memory corruption may result. In order to prevent this, whenever one processor needs to perform a cache fill, it may check to make sure none of the other processors in the system have a more recent copy of the requested data in their caches.
There are several different methods of detecting whether other processors have copies of a particular item of data in their caches. One method is called snooping. Snooping is typically used in systems where all processors that share memory are also coupled to the same bus. Each processor or cache controller monitors the bus for transactions involving data that is currently in its cache. If such a transaction is detected, the particular unit of data may be evicted from the cache or updated in the cache. Another method of detecting whether other caches have copies of requested data involves a data-requesting processor sending probe commands to every other processor and/or cache controller in the system. In response to receiving a probe, a processor or cache controller may generate a response indicating whether its cache contains a copy of the requested data.
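The probe-based method can be illustrated with the following sketch. The names `LineState`, `CacheController`, and `probe_others` are hypothetical, and the three line states are a simplification assumed for the example rather than the states of any particular coherency protocol.

```python
from enum import Enum

class LineState(Enum):
    INVALID = "invalid"    # no copy in this cache
    CLEAN = "clean"        # copy matches system memory
    MODIFIED = "modified"  # copy is newer than system memory

class CacheController:
    def __init__(self, name):
        self.name = name
        self.states = {}  # address -> LineState

    def probe(self, address):
        """Respond to a probe by reporting the state of the requested line."""
        return self.states.get(address, LineState.INVALID)

def probe_others(requester, controllers, address):
    """Send a probe to every controller except the one requesting the fill."""
    return {c.name: c.probe(address) for c in controllers if c is not requester}
```

For example, if controller `b` holds a modified copy of address `0x80`, a fill requested by controller `a` would collect a `MODIFIED` response from `b` and `INVALID` responses from the remaining controllers.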
In some systems, the time required to maintain cache coherency (e.g., the time required to send probes and receive responses) may be significant. The total time taken to perform a cache fill may depend on the latency of both the cache coherency mechanism and that of the memory system. As a result, the time spent maintaining cache coherency may significantly affect performance. Accordingly, one drawback of sharing memory between devices that have caches is that cache fill performance may decrease.
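To make the dependence on both latencies concrete, the sketch below compares two simple models of cache fill time. Both models and both function names are assumptions made for illustration: one waits for every probe response before reading memory, the other lets the probes and the memory read proceed at the same time.

```python
def serialized_fill_latency(probe_latency, memory_latency):
    """Assumed model: the memory read starts only after all probes are answered."""
    return probe_latency + memory_latency

def overlapped_fill_latency(probe_latency, memory_latency):
    """Assumed model: probes and the memory read overlap; the slower one dominates."""
    return max(probe_latency, memory_latency)
```

With a hypothetical probe latency of 80 cycles and a memory latency of 120 cycles, the serialized model gives 200 cycles and the overlapped model gives 120 cycles. If the probes take 150 cycles instead, coherency becomes the limiting factor even in the overlapped model.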
SUMMARY
Various embodiments of methods and systems for performing a speculative cache fill are disclosed. In one embodiment, a computer system includes several caches that are each coupled to receive data from a shared memory. Each cache is controlled by a respective cache controller. A cache coherency mechanism, which in some embodiments may be part of a chipset, is coupled to the cache controllers and the memory. The cache coherency mechanism is configured to receive a cache fill request. In response to receiving the request, the cache coherency mechanism is configured to send a probe to some of the cache controllers (e.g., all of the cache controllers except for the one controlling the cache that is being filled by the cache fill request). Some time after sending the probe, the cache controller is configured to provide a speculative response to the requesting cache. By delaying the speculative response until some time after the probes are sent, the cache coherency mechanism may increase the likelihood that responses to the probes will be received in time to validate the speculative response.
The cache coherency mechanism may be configured to provide the speculative response if at least one of the cache controllers to which a probe was sent has not yet responded to the probe. If one of the cache controllers responds to the probe with an indication that its cache has a modified copy of the data, the cache coherency mechanism may be configured to invalidate the speculative response and provide a non-speculative response after obtaining the most recent copy of the data.
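The decision flow summarized so far can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the function `speculative_fill`, the abstract time units, and the event tuples are all hypothetical, and the sketch ignores how and when a validation signal is actually delivered.

```python
def speculative_fill(memory_data, probe_responses, delay):
    """Return the ordered events seen by the requesting cache.

    probe_responses: list of (arrival_time, modified_data) pairs, where
    modified_data is None unless that cache holds a modified copy.
    delay: time between sending the probes and responding to the requester.
    """
    modified = [data for _, data in probe_responses if data is not None]
    latest = modified[0] if modified else memory_data
    if all(arrival <= delay for arrival, _ in probe_responses):
        # Every probe has been answered, so no speculation is needed.
        return [("non-speculative", latest)]
    # At least one probe is still outstanding: respond speculatively from memory.
    events = [("speculative", memory_data)]
    if modified:
        # A cache held newer data: cancel the speculation and resend.
        events.append(("invalidate", None))
        events.append(("non-speculative", latest))
    else:
        events.append(("validate", None))
    return events
```

In this model, a longer `delay` makes it more likely that every probe response has arrived before the requester is answered, at the cost of starting the response later.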
The cache coherency mechanism may be configured to validate the speculative response by providing a validation signal to the first cache's cache controller during the speculative response. If fewer than all of the cache controllers h
