Preemptive replacement strategy for a caching dynamic...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

Classification: C711S135000, C711S125000, C712S233000

Type: Reexamination Certificate

Status: active

Patent number: 06237065

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to control of computer caches, especially instruction caches in dynamic translators.
BACKGROUND
Dynamic translators translate one sequence of instructions into another sequence of instructions, which is then executed. The second sequence consists of 'native' instructions, i.e., instructions that can be executed directly by the machine on which the translator is running (this 'machine' may be hardware, or it may be defined by software running on yet another machine with its own architecture). A dynamic translator can be designed to execute instructions for one machine architecture (i.e., one instruction set) on a machine of a different architecture (i.e., with a different instruction set). Alternatively, a dynamic translator can take instructions that are native to the machine on which it is running and operate on that instruction stream to produce an optimized instruction stream. A dynamic translator can also combine both of these functions (translation from one architecture to another, and optimization).
Caching dynamic translators attempt to identify program hot spots (frequently executed portions of the program, such as certain loops) at runtime and use a code cache to store translations of those frequently executed portions. Subsequent execution of those portions can use the cached translations, thereby reducing the overhead of executing those portions of the program.
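As a concrete illustration of the mechanism described above, the C sketch below shows the kind of dispatch loop a caching dynamic translator might use: look up the current program counter in the code cache, execute the cached translation if one exists, and otherwise interpret while counting executions until a hot-spot threshold is reached and a translation is placed in the cache. The helper names, the slot-based cache, and the threshold value are hypothetical illustrations, not details taken from the patent.

/*
 * Minimal sketch of a caching dynamic translator's dispatch loop
 * (hypothetical names and sizes; not the patented mechanism itself).
 */
#include <stdio.h>

#define CACHE_SLOTS    16  /* toy code cache capacity                  */
#define HOT_THRESHOLD  3   /* interpretations before a block is "hot"  */
#define PROGRAM_BLOCKS 4   /* toy guest program: blocks 0..3 in a loop */

typedef struct {
    int start_pc;          /* guest address the translation begins at */
    int valid;
} Translation;

static Translation code_cache[CACHE_SLOTS];
static int exec_count[PROGRAM_BLOCKS];   /* per-block interpretation counters */

static Translation *lookup_translation(int pc) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (code_cache[i].valid && code_cache[i].start_pc == pc)
            return &code_cache[i];
    return NULL;
}

static Translation *translate_block(int pc) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (!code_cache[i].valid) {
            code_cache[i].start_pc = pc;
            code_cache[i].valid = 1;
            return &code_cache[i];
        }
    return NULL;   /* out of space: this is where a replacement policy is needed */
}

int main(void) {
    int pc = 0;
    for (int step = 0; step < 20; step++) {
        if (lookup_translation(pc)) {
            printf("step %2d: pc %d executed from the code cache\n", step, pc);
        } else if (++exec_count[pc] >= HOT_THRESHOLD) {
            translate_block(pc);   /* hot spot found: cache a translation */
            printf("step %2d: pc %d translated into the code cache\n", step, pc);
        } else {
            printf("step %2d: pc %d interpreted (count %d)\n", step, pc, exec_count[pc]);
        }
        pc = (pc + 1) % PROGRAM_BLOCKS;   /* toy program loops over its blocks */
    }
    return 0;
}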
Code cache replacement is required when the code cache does not have enough space for the placement of a translation (termed an allocation request). A conventional replacement policy has been to evict (also termed flush or replace) translations in the entire code cache or a large sub-portion of the code cache and then allocate new translations. A large number of translations are flushed because the cost of replacement at a finer granularity, i.e., one translation at a time, is prohibitive. The implicit logic behind such a replacement strategy is reactive: replace only in reaction to a space shortage.
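The reactive policy just described can be pictured as a simple bump allocator that empties the entire cache only when a request cannot be satisfied; the C sketch below illustrates that behavior with hypothetical names and sizes.

/*
 * Sketch of the conventional reactive policy: translations are
 * bump-allocated out of a fixed-size cache, and the whole cache is
 * flushed only when an allocation request no longer fits.
 */
#include <stdio.h>
#include <stddef.h>
#include <string.h>

#define CACHE_BYTES 4096

static unsigned char code_cache[CACHE_BYTES];
static size_t next_free = 0;

static void *allocate_translation(size_t size) {
    if (size > CACHE_BYTES)
        return NULL;                               /* can never fit */
    if (next_free + size > CACHE_BYTES) {          /* out of space: react */
        memset(code_cache, 0, sizeof code_cache);  /* evict everything; a real
                                                      translator also invalidates
                                                      its lookup tables */
        next_free = 0;
    }
    void *slot = &code_cache[next_free];
    next_free += size;
    return slot;
}

int main(void) {
    for (int i = 0; i < 100; i++)                  /* 100 requests of 100 bytes     */
        if (allocate_translation(100) == NULL)     /* forces a few reactive flushes */
            return 1;
    printf("all allocations satisfied; flushes happened only on out-of-space\n");
    return 0;
}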
A problem with such an approach is the timing of the replacement. For example, the translator could be in the middle of constructing a working set when the code cache runs out of space and is forced to flush. The translator will then recommence creating the working set and will probably have to recreate some of the just-evicted translations.
Because a code cache flush has been viewed as a costly operation, code caches are often sized large enough so that replacement is not needed. In contrast, well-timed preemptive replacement can considerably reduce the cost of code cache flushing, reduce memory requirements, and have other desirable side effects.
SUMMARY OF THE INVENTION
The present inventive cache control mechanism operates according to a preemptive replacement strategy.
The cache control mechanism works with cache fill logic that determines what new entries are to be added to the cache and with logic that determines the accuracy of the system's branch predictions.
A parameter that is indicative of a change in the working set of the program being translated (such as the translation rate) or a change in the paths being exercised within the current working set (such as branch behavior) is measured or estimated. For example, to use the translation rate, the extent of execution in the cache and the rate of addition to the cache can be measured; to use branch behavior, the rate of branch prediction misses can be measured.
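As an illustration of how such a parameter could be measured, the C sketch below estimates a translation rate by counting, over a fixed sampling window, how many translations were added per execution out of the cache; the window size, hook names, and counters are hypothetical rather than taken from the patent.

/*
 * One possible way to estimate the "translation rate" parameter: over a
 * sampling window, relate the number of new translations added to the
 * cache to the amount of execution taking place in the cache.
 */
#include <stdio.h>

#define WINDOW 1000                     /* cache executions per sampling window */

static long cache_executions = 0;       /* executions out of the code cache   */
static long new_translations = 0;       /* translations added in this window  */

/* Hook called whenever a new translation is placed in the code cache. */
static void on_new_translation(void) { new_translations++; }

/* Hook called whenever a cached translation is entered.  Returns the
 * translation rate at the end of each window, and -1.0 otherwise. */
static double on_cache_execution(void) {
    if (++cache_executions < WINDOW)
        return -1.0;
    double rate = (double)new_translations / (double)cache_executions;
    cache_executions = 0;               /* start the next sampling window */
    new_translations = 0;
    return rate;                        /* additions per cache execution */
}

int main(void) {
    /* Simulate a stable phase (few additions) and a changing phase (many). */
    for (int i = 0; i < 4000; i++) {
        int changing_phase = i >= 2000;
        if (i % (changing_phase ? 10 : 500) == 0)
            on_new_translation();
        double rate = on_cache_execution();
        if (rate >= 0.0)
            printf("window ending at %d: rate %.4f\n", i, rate);
    }
    return 0;
}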
Low-high and high-low transitions in the measured parameter are detected. For example, the parameter can be compared to a threshold value: exceeding the threshold indicates a low-high transition; falling below the threshold indicates a high-low transition.
Detection of such transitions is used to trigger preemptive cache replacement (which may flush the entire cache or only a portion of the cache), because the transitions are presumed to indicate a change in the working set of the program or a change in the paths being exercised within the working set. This preemptive flushing then sets the stage for the cache to be subsequently repopulated with translations.
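A minimal sketch of this threshold-based transition detection, with a hypothetical threshold value and function names, might look as follows; the flush itself is stubbed out.

/*
 * Sketch of the transition detection described above: the measured
 * parameter is compared against a threshold, and a low-high or high-low
 * crossing (rather than the absolute value) triggers the preemptive flush.
 */
#include <stdio.h>

#define THRESHOLD 0.05      /* e.g., new translations per cache execution */

static int was_above = 0;   /* did the previous sample exceed the threshold? */

static void preemptive_flush(void) {
    /* A real translator would evict all (or a portion) of the cached
     * translations here and let the cache repopulate afterwards. */
    printf("  -> preemptive flush of the code cache\n");
}

/* Feed one sample of the measured parameter; flush on a transition. */
static void observe(double value) {
    int is_above = value > THRESHOLD;
    if (is_above != was_above) {        /* low-high or high-low crossing */
        printf("transition detected at value %.3f\n", value);
        preemptive_flush();
    }
    was_above = is_above;
}

int main(void) {
    /* Rising then falling translation-rate samples. */
    double samples[] = { 0.01, 0.02, 0.20, 0.30, 0.25, 0.03, 0.02 };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        observe(samples[i]);
    return 0;
}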
This replacement is preemptive—it is performed independently of an out-of-space condition and, depending on the code cache size, can eliminate reactive replacement altogether.
In the case of a change in the working set, flushing preemptively can reduce the extent to which previously cached and flushed items must be restored, as compared with reactive control mechanisms that are susceptible to wasteful, ill-timed replacement: a reactive system may run out of space late in the formation of a working set, flush, and then need to recreate the portion of the working set that had been stored at the time of the out-of-space condition. In the case of a change in paths within the current working set, flushing preemptively can increase performance because it gives the dynamic translator the opportunity to reform translations along more profitable execution paths.
In the situation where the code cache is sized large enough that reactive flushing is never required, preemptive flushing can reduce the memory requirements. A code cache with preemptive flushing need only be large enough to hold the largest working set; in contrast, in the absence of preemptive flushing, the code cache would have to be sized large enough to hold the sum of all of the working set sizes in order to avoid reactive flushing.
There are beneficial side effects of code cache flushing which are enhanced by well-timed flushes. These effects are present when any sort of flushing is used but can be more beneficial with the preemptive technique:
1. If any translations from the prior working set are also part of the current working set, these translations are recreated and reallocated in the code cache. They will then be located physically closer to the other translations in the current working set. This can reduce instruction cache conflict misses and virtual memory pressure. Also, since the working set's translations are located closer together, the number of instructions executed to transition between translations is reduced; more translations can be reached with a single PC-relative branch instruction instead of an instruction sequence that loads an address into a register and branches through the register.
2. If the translator performs any sort of trace formation, i.e., uses a likely sequence of basic blocks for better code layout within the code cache, a recreated translation may be formed using a better sequence of blocks than was used earlier. Whether a better sequence of blocks is used is not dictated by the replacement strategy. However, preemptive replacement enables recreation at phase shift points, which intuitively is a good time to reform a translation.
These positive effects apply when either a change in working set is detected or a change in the paths being executed within the current working set is detected.


REFERENCES:
patent: 5848433 (1998-12-01), Tran et al.
patent: 6108775 (2000-08-01), Shiell et al.
patent: 6115791 (2000-09-01), Collins et al.
patent: 6148378 (2000-11-01), Bordaz et al.
patent: 6164841 (2000-12-01), Mattson, Jr. et al.
patent: 6167473 (2000-12-01), Kho
Muller, Keith, and Joseph Pasquale, "A High Performance Multi-Structured File System Design," ACM 0-89791-447-3/91/0009/0056, 1991.*
"The Java Hotspot Performance Engine Architecture," A White Paper About Sun's Second Generation Performance Technology, Apr. 1999, [retrieved from the Internet on Jun. 6-9, 1999], [http://java.sun.com/products/hotspot/whitepaper.html], Sun Microsystems, Inc.
Witchel, Emmett, et al., "Embra: Fast and Flexible Machine Simulation," SIGMETRICS 1996.
Cmelik, Robert F., et al., “Shade: A Fast Instruction-Set Simulator for Execution Profiling,” Technical Report UWCSE Jun. 06, 1993, University of Washington and Sun Microsystems, Inc., 1993.
Cmelik, Robert, F., et al., “Shade: A Fast Instruction-Set Simulator For Execution Profiling,” ACM Sigmetrics Conference on Measur
