Set-associative cache memory having variable time decay...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate


Details

Classification: C711S133000, C365S049130
Status: active
Patent number: 06732238

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is directed to microprocessor architectures. More particularly, the invention is directed to TLBs and cache memories for speeding processor access to main memory in microprocessor systems. Even more particularly, the invention is directed to methods and apparatuses for implementing novel refill policies for multi-way set associative caches and TLBs.
2. Background of the Related Art
Caches and Translation Lookaside Buffers (TLBs) are ubiquitous in microprocessor design. For general information on such microprocessor structures, see J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach (1996), Chapter 5.
Generally, the speed at which a microprocessor (e.g. a CPU) operates depends on the rate at which instructions and operands are transferred between memory and the CPU. As shown in FIG. 1, a cache 110 is a relatively small random access memory (RAM) used to store a copy of memory data in anticipation of future use by the CPU 120. Typically, the cache 110 is positioned between the CPU 120 and the main memory 130, as shown in FIG. 1, to intercept calls from the CPU 120 to the main memory 130. When the data is needed, it can quickly be retrieved from the cache 110 rather than obtained from the slower main memory 130.
A cache may be implemented by one or more RAM integrated circuits. For very high speed caches, the RAM is usually an integral part of the CPU chip. The data stored in a cache can be transferred to the CPU in substantially less time than data stored in main memory.
A translation look-aside buffer (TLB) 140 is a special form of cache that is used to store portions of a page table (which may or may not be stored in main memory 130). As is known, the page table translates virtual page numbers into physical page numbers. TLB 140 is typically organized to hold only a single entry per tag (each TLB entry comprising, for example, a physical page number, permissions for access, etc.). In contrast, cache 110 is typically organized into a plurality of blocks, wherein each block has a corresponding tag and stores a copy of one or more contiguously addressable bytes of memory data.
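As an illustration of the distinction above, the following is a minimal sketch in C of the two structures; the field names and widths are assumptions chosen for illustration, not taken from the patent.

    #include <stdint.h>
    #include <stdbool.h>

    #define BLOCK_SIZE 32          /* bytes per cache block; an assumed example size */

    /* One TLB entry: a single translation held per tag. */
    struct tlb_entry {
        uint32_t tag;              /* virtual page number bits used for matching */
        uint32_t ppn;              /* physical page number */
        uint8_t  perms;            /* access permissions (e.g. read/write bits) */
        bool     valid;
    };

    /* One cache block: a tag plus a copy of contiguous memory bytes. */
    struct cache_block {
        uint32_t tag;              /* most significant address bits */
        uint8_t  data[BLOCK_SIZE]; /* contiguously addressable bytes of memory data */
        bool     valid;
    };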
In order to access data in the cache 110, the virtual memory address is broken down into a cache address as shown in FIG. 2. The portion of the cache address including the most significant bits of the memory address is called the tag 240, and the portion including the least significant bits is called the cache index 250. The cache index 250 corresponds to the address of the block storing a copy of the referenced data, and additional bits (i.e. offset 260) are usually used to address the bytes within a block, if each block has more than one byte of data. The tag 240 is used to uniquely identify blocks having different memory addresses but the same cache index 250. Therefore, the cache 110 typically includes a data store and a tag store. The data store is used for storing the blocks 270 of data. The tag store, sometimes known as the directory, is used for storing the tags 240 corresponding to each of the blocks 270 of data. Both the data store and the tag store are accessed by the cache index 250. The output of the data store is a block 270 of data, and the output of the tag store is a tag 240.
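A minimal sketch in C of this address decomposition; the bit widths are assumed example values (32-byte blocks, 256 sets), not taken from the patent.

    #include <stdint.h>

    #define OFFSET_BITS 5   /* log2(block size): selects a byte within a block (offset 260) */
    #define INDEX_BITS  8   /* log2(number of sets): selects the block (cache index 250) */

    static inline uint32_t offset_of(uint32_t addr) {
        return addr & ((1u << OFFSET_BITS) - 1);        /* least significant bits */
    }

    static inline uint32_t index_of(uint32_t addr) {
        return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    }

    static inline uint32_t tag_of(uint32_t addr) {
        return addr >> (OFFSET_BITS + INDEX_BITS);      /* most significant bits (tag 240) */
    }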
There are different types of caches, ranging from direct-mapped caches, where a block can appear in only one place in the cache 110, to fully-associative caches, where a block can appear in any place in the cache 110. In between these extremes is another type of cache called a multi-Way set-associative cache, wherein two or more concurrently addressable RAMs can cache a plurality of blocks 270 and tags 240 for a single cache index 250. That is, in a conventional N-Way set-associative cache, the single cache index 250 is used to concurrently access a plurality N of blocks 270 and tags 240 in a set of N RAMs. The number of RAMs in the set indicates the Way number of the cache. For example, if the cache index 250 is used to concurrently address data and tags 240 stored in two RAMs, the cache is a two-Way set-associative cache.
As shown in FIG. 2, during the operation of a single-index multi-Way set-associative cache, a memory access by the CPU causes each of the RAMs 1 to N to be examined at the corresponding cache index location. The tag is used to distinguish the cache blocks having the same cache index but different memory addresses. If a tag comparison indicates that the desired data are stored in a cache block of one of the RAMs, that RAM is selected and the desired access is completed. It should be noted that caches are generally indexed with a virtual address and tagged with a physical address.
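A minimal sketch in C of that N-Way probe, building on the structures and helpers sketched above; the two-Way geometry and the function name are assumptions chosen for illustration.

    #define NUM_WAYS 2                      /* a two-Way set-associative example */
    #define NUM_SETS (1u << INDEX_BITS)

    /* One RAM per way; all ways are examined concurrently at the same index. */
    static struct cache_block cache[NUM_WAYS][NUM_SETS];

    /* Returns the matching way on a hit, or -1 on a miss. */
    static int cache_lookup(uint32_t addr) {
        uint32_t idx = index_of(addr);
        uint32_t tag = tag_of(addr);
        for (int way = 0; way < NUM_WAYS; way++) {
            if (cache[way][idx].valid && cache[way][idx].tag == tag)
                return way;                 /* tag comparison selects this RAM */
        }
        return -1;                          /* miss: a victim must be chosen for refill */
    }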
A multi-Way set-associative cache provides the advantage that there are two or more possible locations for storing data in blocks having the same cache index. This arrangement reduces thrashing due to hot spots in memory and increases the operating speed of the computer system if the hot spots are uniformly distributed over the blocks of RAM.
As further shown in FIG. 2, simultaneously with an access to cache 110, an access to TLB 140 can be made to translate the virtual address into a physical address. It should be noted that, although FIG. 2 shows the virtual page number comprising the same bits as tag 240 and index 250 combined, this is not necessary, and in fact the bit ranges for the different fields may be different. It should be further noted that the page offset and the offset 260 may also comprise different bit ranges.
Although not shown in detail in FIG. 2, TLBs can also be implemented using cache types ranging from direct-mapped to fully associative. In particular, the TLBs that implement the Xtensa MMU from Tensilica, Inc. (see co-pending application Ser. No. 10/213,370; and the Xtensa ISA) are set-associative memories that cache entries from the page table. These caches are implemented with logic synthesis of standard cells and can make use of heterogeneous ways (i.e. different ways may have different sizes). As described in the co-pending application, the Xtensa MMU includes a feature called Variable Page Sizes, which is enabled by two mechanisms. First, at configuration time, each way can be configured to support several different page sizes, and hardware is generated to support all of the page sizes configured. Second, at run time, the operating system programs each way with the single page size it is translating at any given time. In one example implementation, a special runtime configuration register is provided that allows each way to be programmed by the operating system to perform translations for a certain page size.
Due to this novel feature, different access patterns arise because the ways have different numbers of indices, the ways are translating different page sizes, or both. For example, assume a way with four entries that can support either 4 kB or 4 MB pages. If it is programmed to translate 4 kB pages, the index is VirtAddr[13:12]; if it is programmed to translate 4 MB pages, the index is VirtAddr[23:22]. Now assume there are four of these ways: at any given time, some of them may be programmed to translate 4 kB pages while others are programmed to translate 4 MB pages.
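A minimal sketch in C of that per-way index selection; the structure and field names are assumptions chosen for illustration, and the patent's hardware computes this with generated logic rather than software.

    #define TLB_WAY_ENTRIES 4       /* four entries per way, as in the example */

    /* Runtime state of one heterogeneous TLB way. */
    struct tlb_way {
        unsigned page_shift;        /* programmed by the OS: 12 for 4 kB pages, 22 for 4 MB */
        struct tlb_entry entry[TLB_WAY_ENTRIES];
    };

    /* The index is taken just above the page offset of the programmed size:
     * 4 kB pages -> VirtAddr[13:12], 4 MB pages -> VirtAddr[23:22]. */
    static inline uint32_t tlb_way_index(const struct tlb_way *w, uint32_t vaddr) {
        return (vaddr >> w->page_shift) & (TLB_WAY_ENTRIES - 1);
    }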
In case of a miss (in the cache 110, the TLB 140, or both), a determination is made to select one of the blocks/entries for replacement. Methods of implementing a replacement strategy for data in a cache are known in cache design. Typically, the replacement of cache entries is done in a least recently used (LRU) manner, in which the least recently used block is replaced. A more flexible strategy is not-most-recently-used (NMRU), which chooses for replacement a block from among all those not most recently used. Blocks may also be selected at random for replacement. Other possible strategies include pseudo-LRU (an approximation of true LRU that is more easily implemented).
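A minimal sketch in C of NMRU victim selection for the two-Way cache sketched above; the random fallback and the names are assumptions chosen for illustration.

    #include <stdlib.h>

    /* Most-recently-used way for each set, updated on every hit. */
    static int mru_way[NUM_SETS];

    /* NMRU: choose a victim at random from among the ways that are not the
     * most recently used. With two ways this is simply "the other way";
     * true LRU would instead track a full recency ordering per set. */
    static int choose_victim_nmru(uint32_t idx) {
        int victim = rand() % (NUM_WAYS - 1);
        if (victim >= mru_way[idx])
            victim++;               /* skip over the MRU way */
        return victim;
    }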
