Set-associative cache-management method with parallel read...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate


C711S169000

active

06385700

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to computers and, more particularly, to a method for managing a set-associative cache. A major objective of the present invention is to reduce average cache access times.
Much of modern progress is associated with the increasing prevalence of computers. In a conventional computer architecture, a data processor manipulates data in accordance with program instructions. The data and instructions are read from, written to, and stored in the computer's “main” memory. Typically, main memory is in the form of random-access memory (RAM) modules.
A processor accesses main memory by asserting an address associated with a memory location. For example, a 32-bit address can select any one of up to 2³² address locations. In this example, each location holds eight bits, i.e., one “byte” of data; bytes are arranged in “words” of four bytes each, which are in turn arranged in “lines” of four words each. In other words, there are 2³⁰ word locations and 2²⁸ line locations.
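By way of illustration, the following C sketch (not part of the patent; the struct, field, and function names are assumptions) decodes a 32-bit byte address into the line, word, and byte fields of the example geometry above:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: decompose a 32-bit byte address under the example
 * geometry above (4-byte words, 4-word lines of 16 bytes each).         */
typedef struct {
    uint32_t line_addr;     /* most-significant 28 bits: one of 2^28 lines   */
    uint32_t word_in_line;  /* next two bits: one of 4 words within the line */
    uint32_t byte_in_word;  /* two least-significant bits: one of 4 bytes    */
} decoded_addr;

static decoded_addr decode(uint32_t addr)
{
    decoded_addr d;
    d.byte_in_word = addr & 0x3u;
    d.word_in_line = (addr >> 2) & 0x3u;
    d.line_addr    = addr >> 4;
    return d;
}

int main(void)
{
    decoded_addr d = decode(0x12345678u);
    printf("line 0x%07X, word %u, byte %u\n",
           (unsigned)d.line_addr, (unsigned)d.word_in_line, (unsigned)d.byte_in_word);
    return 0;
}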
Accessing main memory tends to be much faster than accessing disk and tape-based memories; nonetheless, main memory accesses can leave a processor idling while it waits for a request to be fulfilled. To minimize such latencies, cache memories intercept processor requests to main memory and attempt to fulfill them faster than main memory can.
To fulfill processor requests to main memory, caches must contain copies of data stored in main memory. In part to optimize access times, a cache is typically much less capacious than main memory. Accordingly, it can represent only a small fraction of main memory contents at any given time. To optimize the performance gain achievable by a cache, this small fraction must be carefully selected.
In the event of a cache “miss”, when a request cannot be fulfilled by the cache, the cache fetches the entire line of main memory including the memory location requested by the processor. The entire line is stored in the cache because a processor is relatively likely to request data from locations near one it has recently requested. Where the line is stored depends on the type of cache.
A fully associative cache can store the fetched line in any cache storage location. The fully associative cache stores not only the data in the line, but also the line address (the most-significant 28 bits of the address) as a “tag” in association with the line of data. The next time the processor asserts a main-memory address, the cache compares that address with all the tags stored in the cache. If a match is found, the requested data is provided to the processor from the cache.
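The exhaustive tag comparison can be modeled with a short C sketch (purely illustrative; the 64-entry size and all names are assumed), in which every valid entry's stored line address is compared against the request:

#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES  16
#define NUM_ENTRIES 64          /* assumed cache size, for illustration only */

typedef struct {
    bool     valid;
    uint32_t tag;               /* full 28-bit line address */
    uint8_t  data[LINE_BYTES];
} fa_entry;

static fa_entry fa_cache[NUM_ENTRIES];

/* Returns a pointer to the requested byte on a hit, NULL on a miss. */
static const uint8_t *fa_lookup(uint32_t addr)
{
    uint32_t line_addr = addr >> 4;     /* tag to match: most-significant 28 bits */
    uint32_t offset    = addr & 0xFu;   /* byte within the 16-byte line           */
    for (int i = 0; i < NUM_ENTRIES; i++) {     /* every location must be checked */
        if (fa_cache[i].valid && fa_cache[i].tag == line_addr)
            return &fa_cache[i].data[offset];
    }
    return NULL;
}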
There are two problems with a fully associative cache. The first is that the tags consume a relatively large percentage of cache capacity, which is limited to ensure high-speed accesses. The second problem is that every cache memory location must be checked to determine whether there is a tag that matches a requested address. Such an exhaustive match checking process can be time-consuming, making it hard to achieve the access speed gains desired of a cache.
In a direct-mapped cache, each cache storage location is given an index which, for example, might correspond to the least-significant line-address bits. In the 32-bit address example, a six-bit index might correspond to address bits 23-28. A restriction is imposed that a line fetched from main memory can only be stored at the cache location with an index that matches bits 23-28 of the requested address. Since those six bits are known, only the first 22 bits are needed as a tag. Thus, less cache capacity is devoted to tags. Also, when the processor asserts an address, only one cache location (the one with an index matching the corresponding bits of the address asserted by the processor) needs to be examined to determine whether or not the request can be fulfilled from the cache.
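A corresponding sketch of the direct-mapped index/tag split (illustrative names and sizes, following the six-bit-index example above) might look like this:

#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES 16
#define NUM_SLOTS  64           /* 2^6 indexed cache locations */

typedef struct {
    bool     valid;
    uint32_t tag;               /* remaining 22 bits of the line address */
    uint8_t  data[LINE_BYTES];
} dm_entry;

static dm_entry dm_cache[NUM_SLOTS];

static const uint8_t *dm_lookup(uint32_t addr)
{
    uint32_t offset = addr & 0xFu;          /* byte within the line                    */
    uint32_t index  = (addr >> 4) & 0x3Fu;  /* six least-significant line-address bits */
    uint32_t tag    = addr >> 10;           /* the other 22 line-address bits          */
    dm_entry *e = &dm_cache[index];         /* only one location needs to be examined  */
    return (e->valid && e->tag == tag) ? &e->data[offset] : NULL;
}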
The problem with a direct-mapped cache is that when a line is stored in the cache, it must overwrite any data stored at that location. If the data overwritten is data that would be likely to be called in the near term, this overwriting diminishes the effectiveness of the cache. A direct-mapped cache does not provide the flexibility to choose which data is to be overwritten to make room for new data.
In a set-associative cache, the memory is divided into two or more direct-mapped sets. Each index is associated with one memory location in each set. Thus, in a four-way set associative cache, there are four cache locations with the same index, and thus, four choices of locations to overwrite when a line is stored in the cache. This allows more optimal replacement strategies than are available for direct-mapped caches. Still, the number of locations that must be checked, e.g., one per set, to determine whether a requested location is represented in the cache is quite limited, and the number of bits that need to be compared is reduced by the length of the index. Thus, set-associative caches provide an attractive compromise that combines some of the replacement strategy flexibility of a fully associative cache with much of the speed advantage of a direct-mapped cache.
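Extending the previous sketch to the four-way case (still illustrative; the replacement policy is deliberately left out), each index names one candidate entry per set, so at most four tags are compared per access:

#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES  16
#define NUM_INDEXES 64          /* 2^6 indexes, as in the direct-mapped example */
#define NUM_SETS    4           /* four-way set associative                     */

typedef struct {
    bool     valid;
    uint32_t tag;               /* 22 bits of line address, as before */
    uint8_t  data[LINE_BYTES];
} sa_entry;

static sa_entry sa_cache[NUM_INDEXES][NUM_SETS];

static const uint8_t *sa_lookup(uint32_t addr)
{
    uint32_t offset = addr & 0xFu;
    uint32_t index  = (addr >> 4) & 0x3Fu;
    uint32_t tag    = addr >> 10;
    for (int s = 0; s < NUM_SETS; s++) {    /* one candidate location per set */
        sa_entry *e = &sa_cache[index][s];
        if (e->valid && e->tag == tag)
            return &e->data[offset];
    }
    /* Miss: a fetched line may replace any of the four entries at this index. */
    return NULL;
}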
When a set-associative cache receives an address from a processor, it determines the relevant cache locations by selecting those with an index that matches the corresponding address bits. The tags stored at the cache locations corresponding to that index are checked for a match. If there is a match, the least-significant address bits are checked for the word location (or fraction thereof) within the line stored at the match location. The contents at that location are then accessed and transmitted to the processor.
In the case of a read operation, the cache access can be hastened by starting the data access before a match is determined. While checking the relevant tags for a match, the appropriate data locations within each set having the appropriate index are accessed. By the time a match is determined, data from all four sets are ready for transmission. The match is used, e.g., as the control input to a multiplexer, to select the data actually transmitted. If there is no match, none of the data is transmitted.
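A behavioral C sketch of such a parallel tag-and-data read (illustrative only; in hardware the data arrays and tag comparators operate concurrently, and the match result drives a multiplexer) is shown below:

#include <stdbool.h>
#include <stdint.h>

#define NUM_INDEXES    64
#define NUM_SETS       4
#define WORDS_PER_LINE 4

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t words[WORDS_PER_LINE];
} rd_entry;

static rd_entry rd_cache[NUM_INDEXES][NUM_SETS];

/* Returns true on a hit and places the selected word in *out. */
static bool parallel_read(uint32_t addr, uint32_t *out)
{
    uint32_t word_sel = (addr >> 2) & 0x3u;
    uint32_t index    = (addr >> 4) & 0x3Fu;
    uint32_t tag      = addr >> 10;

    uint32_t candidate[NUM_SETS];   /* data fetched from every set before the match is known */
    bool     match[NUM_SETS];       /* tag comparison for every set, performed in parallel   */
    for (int s = 0; s < NUM_SETS; s++) {
        candidate[s] = rd_cache[index][s].words[word_sel];
        match[s]     = rd_cache[index][s].valid && rd_cache[index][s].tag == tag;
    }
    for (int s = 0; s < NUM_SETS; s++) {    /* the match result acts as the multiplexer select */
        if (match[s]) {
            *out = candidate[s];
            return true;
        }
    }
    return false;                           /* no match: none of the fetched data is transmitted */
}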
The read operation is much faster since the data is accessed at the same time as the match operation is conducted rather than after. For example, a parallel “tag-and-data” read operation might consume only one memory cycle, while a serial “tag-then-data” read operation might require two cycles. Alternatively, if the serial read operation consumed only one cycle, the parallel read operation would permit a shorter cycle, allowing for more processor operations per unit of time.
The gains of the parallel tag-and-data reads are not without some cost. The data accesses that do not provide the requested data consume additional power that can tax power sources and dissipate extra heat. The heat can fatigue, impair, and damage the incorporating integrated circuit and proximal components. Accordingly, larger batteries or power supplies and more substantial heat removal provisions may be required.
Nonetheless, the speed advantages of parallel tag-and-data read accesses are generally well worth such provisions. A comparable approach to hastening write operations is desired. Unfortunately, the parallel tag-and-data approach is not applied to write operations, since a parallel data access would involve overwriting data that should be preserved. Accordingly, in the context of a system using parallel reads, write operations have become a more salient limit to performance. What is needed is a cache-management method in which write-operation times more closely match those achieved using parallel tag-and-data reads.
SUMMARY OF THE INVENTION
The present invention provides a cache-management method using pipelined cache writes in conjunction with parallel tag-and-data reads. Thus, the cache can accept a second write request while writing the data from the previous write request. While each write operation takes place over two cycles, a series of write
