User-prioritized cache replacement

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06349365

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to microprocessors and more particularly, to cache replacement schemes within microprocessor caches.
2. Description of the Relevant Art
Superscalar microprocessors are capable of attaining performance characteristics which surpass those of conventional scalar processors by allowing the concurrent execution of multiple instructions. As a result of executing multiple instructions per clock cycle, a superscalar processor's performance is heavily impacted by the processor's ability to quickly move instructions from memory to the execution units within the processor that actually execute the instructions. Since main system memory is typically designed for density rather than speed, microprocessor designers have added caches to their designs to reduce the microprocessor's need to directly access main memory. A cache is a small memory that is more quickly accessible than the main memory. Computer systems may have a number of different levels of caches. For example, a computer system may have a “level one” cache that is internal to the microprocessor (i.e., on-chip), and a “level two” cache that is external to the microprocessor. Caches are typically constructed of fast memory cells such as static random access memories (SRAMs) which have faster access times and bandwidth than the memories used for the main system memory (typically dynamic random access memories (DRAMs) or synchronous dynamic random access memories (SDRAMs)). The faster SRAMs are not typically used for main system memory because of their lower density and corresponding higher cost. Other types of caching are also possible. For example, the main system memory may act as a cache for the system's slower direct access storage devices (e.g., hard disk drives).
When microprocessors need data from memory, they typically first check their level one cache to see the if the required data has been cached. If not, the data is requested from memory. If the second level cache is storing the data, it provided the data to the microprocessor (typically at much higher rate than the main system memory is capable of). If the data is not cached in the first or second level caches (referred to as a “cache miss”), the data is read from main system memory or some type of mass storage device (e.g., a hard disk drive). Relative to accessing the data from the level one cache, accesses to memory take many more clock cycles.
Caches typically operate on the principal of locality of reference, which states that the data most recently used (and the data in that locality) is more likely to be accessed than the rest of the data in general. This principle holds because computer software tends to be somewhat linear in execution and typically has loops and branches that cause previously executed code to be re-executed. By storing recently accessed instructions and data in a cache, system performance may be increased because the microprocessor need not wait for the instructions and data to read from main memory.
Microprocessor and computer system architects have taken this principle one step further by using techniques such as branch prediction to proactively store instructions and data in the cache before they are actually needed by the microprocessor. In addition, when an instruction or byte of data is read from memory, additional bytes following the instruction or data are read and cached. Once again, the principal of locality of reference dictates that these instruction and data bytes are more likely to be needed by the processor than the other data or instructions at large.
Since cache size is limited by a number of factors (including die size, power consumption, and cost), care must be taken when loading information into the cache. Once particular area of concern for the designer is when to overwrite or invalidate existing instructions and data in a cache to make room for new instructions and data. A common solution is to track the frequency of accesses and then replace the least recently used instructions or data with new instructions or data. Other solutions include random replacement, and first-in first-out techniques. While these techniques are all effective to a certain extent, none of them are able to take advantage of the underlying structure of the program being executed.
Given the heavy penalty associated with cache misses, a more accurate method for determining which instructions and data should be cached and which instructions and data should be overwritten is needed.
SUMMARY OF THE INVENTION
The problems outlined above may at least in part be solved by a method and apparatus that allows programmers to prioritize instructions and/or data with respect to caching and cache replacement. Advantageously, this may improve performance by reducing cache misses. Furthermore, encoding cache replacement information may also caches of a particular size to be utilized more efficiently, thereby freeing precious die space for other purposes (e.g., more functional/execution units). Depending upon the exact implementation, the method may be applied to both instruction and data caches.
A microprocessor configured to utilize cache priority information is contemplated. In one embodiment, the microprocessor comprises a cache, a predecode unit, and a cache controller. The cache may comprise a plurality of storage locations (referred to as cache lines), each storing a predetermined number of instruction and/or data bytes. The predecode unit is configured to receive/predecode instructions bytes and detect the presence of cache replacement priority prefix bytes or opcodes. This information may then be routed to the cache's control logic (i.e., the cache controller). The cache controller may then store this information along with other predecode information and the actual instruction bytes in the cache. The cache control logic is configured to utilize the stored cache replacement priority information to determine which instruction bytes stored in the cache should be overwritten. Note, as used herein the term “overwritten” may mean flushed, invalidated, or written back to memory, depending upon the type of cache being used. In addition to a microprocessor, a computer system capable of executing code that contains cache replacement information is also contemplated.
Note, as used herein, a “high priority instruction” is an instruction that will likely receive preferable treatment during the cache line replacement process. When a cache is full and new data is read from memory, the cache's control logic must determine where to store the new information (i.e., which of the old cache lines to overwrite). The present invention allows cache replacement priority (CRP) data regarding which bytes in the cache should or should not be overwritten to be stored. For example, if a programmer knows that a particular subroutine will be called repeatedly through a program, then the programmer may indicated to the compiler that the instructions forming the subroutine should be marked as high priority instructions. The compiler may do this in a number of different ways, depending upon the implementation. In one embodiment, special prefix bytes are added to the high priority code. These prefix bytes may be detected by the prefetch/predecode unit of the microprocessor executing the program. The prefetch/predecode unit may then signal the microprocessor's cache that the instructions should be given preferential treatment during the cache line replacement process (i.e., instructions marked low priority or those not marked high priority should be overwritten before the high priority instructions are overwritten).
In some embodiments, the compiler may also allow the programmer to indicate one or more points in the program after which the high priority instructions shall be converted to normal or low priority instructions. For example, in a computer graphics program, there may be one portion of the program that is high priority at the beginning of execution, e.g., a subroutine that

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

User-prioritized cache replacement does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with User-prioritized cache replacement, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and User-prioritized cache replacement will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2960413

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.