Title: Avoiding tag compares during writes in multi-level cache...
Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Type: Reexamination Certificate
Filed: 1998-01-05
Issued: 2001-11-20
Examiner: Gossage, Glenn (Department: 2187)
U.S. Classes: 711/128; 711/141; 711/144; 710/52
Status: active
Patent number: 6,321,297
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the field of processors and, more particularly, to a technique for utilizing cache memory.
2. Background of the Related Art
The use of a cache or caches with a processor (whether integrated within the processor chip or external to it) is well known in the computer art. A primary purpose of using caches is to enhance processor performance by reducing data access time. It is generally understood that memory devices closer to the processor operate faster than memory devices farther away on the data path from the processor. However, there is a cost trade-off in utilizing faster memory devices. The faster the data access, the higher the cost to store a bit of data. Accordingly, a cache memory tends to be much smaller in storage capacity than main memory, but is faster in accessing the data.
Current-generation high-performance computer systems utilize multiple caches, typically organized as a hierarchy of cache levels. A processor maintains cache coherency either by updating all the caches simultaneously or by updating the different cache levels at different times. For example, in a write-through cache system, a write operation simultaneously updates the cache and the main memory (or the next cache level in the hierarchy). If all of the caches are write-through caches, then all of the caches and the main memory can be updated simultaneously. In a write-back cache system, a write operation updates only the closest cache; the other cache level(s) and the main memory are updated at a later time, such as when a cache line is evicted (victimized). Accordingly, with a write-back cache there may not be data consistency between the main memory and the cache.
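To make the two policies concrete, here is a minimal C sketch of a single cache line under each policy; the structure and function names are illustrative, not taken from the patent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single cache line: data plus a dirty flag. */
typedef struct {
    uint32_t data;
    bool     dirty;
} line_t;

static uint32_t main_memory[1024];

/* Write-through: the cache and the next level (here, main memory)
 * are updated together, so they never disagree. */
static void write_through(line_t *line, uint32_t addr, uint32_t value)
{
    line->data        = value;
    main_memory[addr] = value;   /* simultaneous update */
}

/* Write-back: only the closest cache is updated; the dirty flag
 * defers the memory update until the line is evicted. */
static void write_back(line_t *line, uint32_t value)
{
    line->data  = value;
    line->dirty = true;          /* main memory is now stale */
}

/* On eviction (victimization), a dirty line must first be
 * written back to the next level. */
static void evict(line_t *line, uint32_t addr)
{
    if (line->dirty) {
        main_memory[addr] = line->data;
        line->dirty       = false;
    }
}
```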
The use of write-through and write-back caches is known in the art, along with the various cache line allocation and de-allocation schemes for accessing the cache memories. It is also understood that a cache hierarchy can be inclusive, partially inclusive, or exclusive with respect to data storage. Here, lower in the hierarchy means the levels closer to the processor. A cache hierarchy is inclusive if a cache at a given level is a subset of the cache at the next higher level; a data request by the processor is typically satisfied by the closest cache level that contains the data. A hierarchy is exclusive if data cached in one level does not exist in any other level, and partially inclusive if the data at a given cache level is not a full subset of a higher cache level. In practice, most cache systems implement a partially inclusive structure.
One notable aspect of cache memory is the use of address tags to identify the cache lines present in a cache. An address tag is a subset of the actual address; the number of bits in the tag is determined by the number of sets in the cache and by the cache line size. A cache line includes the tag, the information (or data), and state (or status) bit(s) that convey information about the cached line. For example, state bits identify whether the cache line is dirty (has been modified), is shared by other resources, is invalid, or is exclusive to one resource.
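As an illustration of how a tag is derived, the following C sketch splits an address into tag, set index, and line offset for an assumed geometry of 64-byte lines and 128 sets (both figures, and all names, are illustrative); it also shows one possible encoding of the state bits mentioned above.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed geometry: 64-byte lines and 128 sets, so
 * offset bits = log2(64) = 6 and index bits = log2(128) = 7;
 * the tag is whatever remains of the address. */
#define OFFSET_BITS 6
#define INDEX_BITS  7

/* One possible encoding of the state bits described above:
 * invalid, shared, exclusive, or modified (dirty). */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } line_state_t;

static uint32_t addr_offset(uint32_t a) { return a & ((1u << OFFSET_BITS) - 1); }
static uint32_t addr_index(uint32_t a)  { return (a >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1); }
static uint32_t addr_tag(uint32_t a)    { return a >> (OFFSET_BITS + INDEX_BITS); }

int main(void)
{
    uint32_t addr = 0x12345678u;
    printf("tag=%#x index=%u offset=%u\n",
           (unsigned)addr_tag(addr), (unsigned)addr_index(addr),
           (unsigned)addr_offset(addr));
    return 0;
}
```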
Since each cache line stores multiple bytes of data, the tag corresponds to the beginning address of the group of data in memory that is now stored in the cache. Accordingly, the cached data is a replica of the data stored in the main memory.
Whenever a read instruction requiring data retrieval from memory is executed, the processor generates an address for accessing that memory location. This read address is presented to the cache: a particular set is accessed, and the tags present in the different ways of the set are compared with the read address. If the compare operation is successful, the data is provided to the processor. If the compare operation is unsuccessful, the data is retrieved from external memory, loaded into the cache, and forwarded to the processor. If the same data is needed again, it is then retrieved from the cache instead of the main memory.
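The read lookup just described might be sketched as follows in C for a 4-way set-associative cache (the geometry and names are assumptions, not the patent's): the index selects a set, and the tag is compared against every way in that set.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SETS 128
#define NUM_WAYS 4

typedef struct {
    bool     valid;
    uint32_t tag;
    uint8_t  data[64];
} way_t;

static way_t cache[NUM_SETS][NUM_WAYS];

/* Read lookup: compare the tag against each way of the selected
 * set.  A match is a hit; *hit_way then records the location/way
 * information, which becomes important later. */
static bool lookup(uint32_t tag, uint32_t index, int *hit_way)
{
    for (int way = 0; way < NUM_WAYS; way++) {
        if (cache[index][way].valid && cache[index][way].tag == tag) {
            *hit_way = way;
            return true;      /* hit: data supplied from this way */
        }
    }
    return false;             /* miss: fill from external memory */
}
```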
Likewise, when a write is executed, the processor must update the cached data, either prior to or simultaneously with the updating of the main memory. In a write operation, a cache is accessed to determine if the tag for that address is present, so that the cached data can be loaded into a write buffer. If the data had not been cached, it is retrieved from the main memory and loaded into the cache(s) and the write buffer, similar to a read operation. Next, the data in the write buffer entry is updated with the store data. Subsequently, the modified data in the write buffer is written to the appropriate location(s) in the cache(s) and/or the main memory. It should be noted that the cache cannot be updated directly with the store data due to significant implementation difficulties.
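Continuing the lookup sketch above (and reusing its way_t, cache, and lookup definitions), a conventional store might look like the following; fill_from_memory is a hypothetical stand-in for the miss-handling path. Note that lookup, i.e. a tag compare, runs twice: once to fill the write buffer and once to write the merged line back.

```c
#include <string.h>

/* Hypothetical write-buffer entry. */
typedef struct {
    uint32_t tag;
    uint32_t index;
    uint8_t  data[64];
} wbuf_entry_t;

/* Stub miss handler: allocate a way and fill it from the next
 * level (replacement policy omitted; way 0 chosen for brevity). */
static int fill_from_memory(uint32_t tag, uint32_t index)
{
    cache[index][0].valid = true;
    cache[index][0].tag   = tag;
    /* ... fetch 64 bytes from the next level into data[] ... */
    return 0;
}

/* Conventional store: two tag compares per write. */
static void conventional_store(uint32_t tag, uint32_t index,
                               unsigned offset, uint8_t value,
                               wbuf_entry_t *wb)
{
    int way;

    /* Compare #1: locate the line and copy it into the buffer. */
    if (!lookup(tag, index, &way))
        way = fill_from_memory(tag, index);
    memcpy(wb->data, cache[index][way].data, 64);
    wb->tag   = tag;
    wb->index = index;

    /* Merge the store data into the buffered line. */
    wb->data[offset] = value;

    /* Compare #2: find the line again to write the result back. */
    if (lookup(wb->tag, wb->index, &way))
        memcpy(cache[wb->index][way].data, wb->data, 64);
}
```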
In order to determine whether a cache contains the address tag corresponding to the address of the data to be written, tag compares are performed at the cache levels. A tag comparison determines if there is a “hit” or a “miss” at a given cache level. A subtle but important point, illustrated by the two lookup calls in the sketch above, is that one tag compare operation is performed during a read, but two tag compare operations are performed during a write.
A tag comparison takes time to perform: typically, at least a full clock cycle is required to read the tags and compare them to the address to determine if a cache line associated with the address is present in the cache. A mechanism that avoids this tag compare when updating a cache line would save a significant amount of time, thereby improving processor performance. The present invention provides such a scheme, in which the tag compare performed during a write operation that modifies an existing cache line is avoided. That is, the practice of the present invention reduces the tag comparisons required during a write from two to one.
SUMMARY OF THE INVENTION
The present invention describes a technique for avoiding tag compares when writing to a cache. Instead of comparing the address against the tags in a cache to determine whether a cache update will hit, the present invention supplies the location/way information of the cache line to the processor's write buffer, so that the cache line can be updated directly without performing a tag compare. Avoiding the tag compare during the cache write phase saves a clock cycle, improving overall processor performance.
When data is originally loaded from the main memory and cached into the various caches of the cache hierarchy, location information identifying which “way” the particular data has been cached in is noted, transmitted down the hierarchy, and linked in the various caches. The location information is transferred to the write buffer when data is loaded into the write buffer from the cache for modification. The cache line can then be updated directly with the write buffer contents, without performing a tag compare, since the location/way information of the cache line is now stored in the write buffer.
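Extending the sketches above, the write-buffer entry can additionally record the set and way captured during the fill, so the write-back step indexes the cache array directly. This is only a minimal illustration of the idea as described, with assumed names; the patent's actual structures may differ.

```c
#include <string.h>

/* Write-buffer entry augmented with the location/way information
 * captured when the line was loaded for modification. */
typedef struct {
    uint32_t tag;
    uint32_t index;
    int      way;        /* which way the line occupies          */
    bool     way_valid;  /* cleared if the line is victimized    */
    uint8_t  data[64];
} wbuf_entry2_t;

/* Store using recorded way information: one tag compare total. */
static void store_with_way_info(uint32_t tag, uint32_t index,
                                unsigned offset, uint8_t value,
                                wbuf_entry2_t *wb)
{
    int way;

    /* The only tag compare: locate the line once and remember
     * exactly where it lives. */
    if (!lookup(tag, index, &way))
        way = fill_from_memory(tag, index);
    memcpy(wb->data, cache[index][way].data, 64);
    wb->tag       = tag;
    wb->index     = index;
    wb->way       = way;
    wb->way_valid = true;

    /* Merge the store data. */
    wb->data[offset] = value;

    /* Direct update: no second tag compare is needed because the
     * set and way are already known. */
    if (wb->way_valid)
        memcpy(cache[wb->index][wb->way].data, wb->data, 64);
}
```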
In the preferred embodiment, valid bits in the write buffer indicate that the location/way information is no longer correct if the original cache line has been victimized. This ensures that the location information in the write buffer does not update a cache location that no longer contains the original entry. The present invention also has the various cache levels independently maintain location information for caches higher in the hierarchy, so that the information is available at each cache level independent of the write buffer. This allows cache updates to be performed at a later time than the write buffer update.
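The valid-bit safeguard might then look like the following sketch (again with assumed names): when a line is victimized, any write-buffer entry still pointing at that set and way has its location marked stale, forcing the slower tag-compared path rather than corrupting whichever line now occupies the way.

```c
/* Called when the line at (index, way) is evicted.  If the write
 * buffer still holds location information for that line, the
 * information is now stale and must be invalidated. */
static void on_eviction(uint32_t index, int way, wbuf_entry2_t *wb)
{
    if (wb->way_valid && wb->index == index && wb->way == way)
        wb->way_valid = false;   /* fall back to a tag-compared update */
}
```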
Inventors: Bachand, Derek T.; Chung, Chih-Hung; Shamanna, Gururaj
Attorney, Agent or Firm: Blakely, Sokoloff, Taylor & Zafman LLP
Primary Examiner: Gossage, Glenn
Assignee: Intel Corporation