Method for selecting active code traces for translation in a...

Data processing: software development – installation – and managem – Software program development tool – Translation of code

Reexamination Certificate

Details

C717S152000, C717S152000, C712S209000, C712S227000

active

06351844

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to the selection of code regions for caching in a caching dynamic translator.
2. Description of the Related Art
Dynamic translators are used for directly executing non-native program binaries on a processor. Dynamic translators operate by translating each word of the non-native code into a corresponding word or words of native code for execution by the processor. Dynamic translators are related to binary translators, just-in-time compilers and runtime translators. Translation is generally performed at the time the non-native code word is executed. However, there is a significant level of overhead in performing the translation from non-native to native code. Similar methods can be applied to the optimization of native binary code where optimized native code is generated and executed in place of non-optimized native code.
In order to improve performance, native code translations of frequently executed regions are typically kept in a translated code cache. Subsequent execution references to the non-native code words of these translated regions then execute in the corresponding region of the translated code cache, thus avoiding the overhead of emulation.
FIG. 1 illustrates an architecture for an embodiment of a caching dynamic translator 10. A memory image 22 of the non-native code is stored in memory 20. During execution, each word of the non-native code 22 is read out of memory 20 by an interpreter 30 which emulates the non-native binary code on a native processor for execution. Alternatively, the interpreter may read several words from memory, translate them into native code, and then output the translated code to a native processor for execution. In FIGS. 1 and 2, the arrow indicating “native binary” going “to native processor” represents a combination of the execution of the interpreter program with execution of translated code.
The translated native binary code is also typically stored into a translated code cache 50. When control section 32 of interpreter 30 detects a cache hit for an instruction in translated code cache 50, the translated version of the interpreted native binary is output from the cache for execution by the native processor.
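The cache-hit path described above can be sketched roughly as follows. This is a minimal illustration in Python under stated assumptions, not the implementation disclosed in the patent: the names `run`, `interpret_word`, and `TranslatedRegion` are hypothetical, the non-native image is modeled as a list of integers, and "native code" is modeled as a Python callable that returns the next program counter.

```python
from typing import Callable, Dict, List

# Hypothetical model: a translated region is a callable standing in for
# native code. It receives the execution log and returns the next pc.
TranslatedRegion = Callable[[List[int]], int]


def interpret_word(word: int, log: List[int]) -> None:
    """Emulate a single non-native word (here: just record it)."""
    log.append(word)


def run(image: List[int], cache: Dict[int, TranslatedRegion]) -> Dict[str, int]:
    """Execute `image`, preferring cached translations over interpretation."""
    log: List[int] = []
    stats = {"hits": 0, "interpreted": 0}
    pc = 0
    while pc < len(image):
        region = cache.get(pc)
        if region is not None:
            # Cache hit: run the translated region instead of emulating.
            pc = region(log)
            stats["hits"] += 1
        else:
            # Cache miss: fall back to word-by-word emulation.
            interpret_word(image[pc], log)
            stats["interpreted"] += 1
            pc += 1
    return stats
```

In this sketch a hit at address 1 that jumps to address 3 skips the emulation of two words, which is the overhead the translated code cache is meant to avoid.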
A region selector 40 is often included which manages the content of the code cache 50 and determines which segments of translated code remain in the code cache 50. Subsequent references to the non-native code image 22 will execute the corresponding native code in code cache 50 provided that the corresponding native code has not been replaced.
The region selector 40 typically receives runtime profile data from the interpreter 30 which the region selector uses in selecting regions of translated code that are maintained in the translated code cache. Judicious region selection can improve the hit rate in the translated code cache, but at the cost of higher overhead. The tradeoff between hit rate and selection overhead is a critical part of dynamic translator design.
Existing implementations of dynamic translators use either runtime profile data, such as statistical PC sampling or branch profiling, or call invocation counting in order to identify frequently executed regions of the non-native code. The problem with such methods is that it is hard to trigger an action based on execution rate (i.e. how often a region is executed within a certain time interval); it can only be triggered based on execution count (how many times a region has executed thus far). Another problem is that it is difficult to dynamically adjust the degree of profiling done on different program regions because heavy profiling of a very hot region can hurt performance due to the overhead associated with profiling, whereas it may be inconsequential on a cold region.
For example, the SELF system (described by U. Holzle in “Adaptive Optimization for SELF: Reconciling High Performance with Exploratory Programming”, PhD Thesis, Stanford University Dept. of Computer Science, August 1994) generates unoptimized native code for a procedure upon first invocation of the procedure, with the procedure prologue containing instrumentation to count the number of invocations. If a counter exceeds a threshold, the corresponding routine is flagged as hot (i.e. it has reached an activity threshold) and, in the case of the SELF system, the hot routine is dynamically re-optimized along with other routines in the call chain.
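As a rough illustration of invocation counting with a hotness threshold, consider the sketch below. It is not taken from the SELF system: the class `InvocationProfiler`, the method `on_invoke`, and the threshold value are all assumptions made for the example.

```python
from collections import defaultdict
from typing import DefaultDict, Set

HOT_THRESHOLD = 3  # assumed value, for illustration only


class InvocationProfiler:
    """Counts routine invocations, as a prologue probe might."""

    def __init__(self, threshold: int = HOT_THRESHOLD) -> None:
        self.threshold = threshold
        self.counts: DefaultDict[str, int] = defaultdict(int)
        self.hot: Set[str] = set()

    def on_invoke(self, routine: str) -> bool:
        """Count one invocation; return True when the routine first turns hot."""
        self.counts[routine] += 1
        if self.counts[routine] > self.threshold and routine not in self.hot:
            # The counter has exceeded the threshold: flag the routine once.
            self.hot.add(routine)
            return True
        return False
```

With a threshold of 3, the fourth call to `on_invoke` for a given routine is the one that returns `True`. A real system would trigger re-optimization at that point.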
In the SELF system, an exponential decay technique for region selection is used, wherein the system is periodically interrupted and all the counters corresponding to the cached routines are halved. This attempts to convert the counters into measures of invocation rates rather than invocation counts.
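The effect of periodic halving can be seen in a small simulation. The sketch below assumes one decay step per profiling interval and uses integer halving. The function names `decay_counters` and `simulate` are hypothetical and do not come from the SELF system.

```python
from typing import Dict, List


def decay_counters(counters: Dict[str, int]) -> None:
    """Halve every counter in place, as a periodic interrupt handler might."""
    for routine in counters:
        counters[routine] //= 2


def simulate(schedule: List[Dict[str, int]]) -> Dict[str, int]:
    """Add each interval's invocation counts, then decay once per interval."""
    counters: Dict[str, int] = {}
    for interval in schedule:
        for routine, n in interval.items():
            counters[routine] = counters.get(routine, 0) + n
        decay_counters(counters)
    return counters


# "steady" is invoked 8 times in every interval; "burst" only in the first.
final = simulate([{"steady": 8, "burst": 8}, {"steady": 8}, {"steady": 8}])
```

After three intervals `final` holds 7 for `steady` and 1 for `burst`. The steadily invoked routine's counter stays near its per-interval rate, while the counter of a routine that stopped executing decays toward zero, so the counters approximate rates rather than lifetime counts.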
The runtime profile of a program is used in dynamic translators to focus analysis on those parts of the executing program where greater performance benefit is likely. A runtime profile is a collection of information indicating the control flow path of a program, i.e. which instructions executed and where branches in the execution took place. Program profiling typically counts the occurrences of an event during a program's execution. The measured event is typically a local portion of a program, such as a routine, line of code or branch. Profile information for a program can consist of simple execution counts or more elaborate metrics gathered from hardware counters within the computer executing the program.
One conventional approach to profiling is to instrument the program code by adding profiling probes to the code. Profiling probes are additional instructions which are used to log the execution of a basic block of code containing the probe.
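A profiling probe can be pictured as an extra call placed at the start of each basic block. The following sketch instruments a small function by hand. The names `probe`, `block_counts`, and `instrumented_abs_sum` are assumptions for illustration, and a real instrumenter would insert the probes automatically.

```python
from collections import Counter

# Execution count per basic block, keyed by a hypothetical block label.
block_counts: Counter = Counter()


def probe(block_id: str) -> None:
    """Profiling probe: log one execution of the enclosing basic block."""
    block_counts[block_id] += 1


def instrumented_abs_sum(values):
    """Sum of absolute values, with a probe added to each basic block."""
    probe("entry")
    total = 0
    for v in values:
        probe("loop_body")
        if v < 0:
            probe("negate")
            v = -v
        total += v
    probe("exit")
    return total
```

Calling `instrumented_abs_sum([1, -2, 3, -4])` returns 10 and leaves `loop_body` with a count of 4 and `negate` with a count of 2. Every block now carries one additional call, which illustrates the code growth and slowdown attributed to instrumentation.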
Instrumentation based methods for gathering profile data tend to be complex and time consuming. Instrumentation of the code can result in a code size explosion due to the added instructions. The additional probe instructions also slow execution of the code and a profiled, or instrumented, version of a program can run substantially slower than the original version. Thus, profiling can represent a significant level of overhead in the execution of a program.
Therefore, the need remains for a method of selecting regions for dynamic translation into a code cache which has limited overhead and increases the time spent executing from the code cache.
SUMMARY OF THE INVENTION
It is, therefore, an object of the invention to provide a method for selecting active code segments in an executing program having low overhead.
Another object of the invention is to enable dynamic optimization of the code while the code is executing.
An embodiment of a method for selecting active code segments in an executing program, according to the present invention, involves creating a branch history entry for a series of executed code segments, wherein each branch history entry includes a start address and branch history value of one of the segments, storing each branch history entry in a trace buffer, and incrementing a counter corresponding to the start address for each branch history entry in the trace buffer responsive to a selection processing signal. The method then calls for identifying as a hot trace each branch history entry having a start address value with a corresponding counter value which exceeds a threshold, translating the program code segment corresponding to each hot trace into a translated code segment, and storing the translated code segment into a translated code cache.
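The steps of this method can be sketched as follows. This is only an illustration under assumptions, not the claimed implementation: the names `BranchHistoryEntry`, `TraceSelector`, `record`, `on_selection_signal`, and `translate` are hypothetical, translation is replaced by a placeholder string, and clearing the trace buffer after each selection signal is an assumption the text above does not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BranchHistoryEntry:
    start_address: int
    branch_history: int  # e.g. bits recording taken / not-taken outcomes


def translate(entry: BranchHistoryEntry) -> str:
    """Placeholder for real translation of the trace into native code."""
    return f"native<{entry.start_address:#x}:{entry.branch_history:b}>"


class TraceSelector:
    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.trace_buffer: List[BranchHistoryEntry] = []
        self.counters: Dict[int, int] = {}
        self.code_cache: Dict[Tuple[int, int], str] = {}

    def record(self, start_address: int, branch_history: int) -> None:
        """Store a branch history entry for an executed segment."""
        self.trace_buffer.append(BranchHistoryEntry(start_address, branch_history))

    def on_selection_signal(self) -> List[BranchHistoryEntry]:
        """Update counters, then translate and cache newly hot traces."""
        for entry in self.trace_buffer:
            addr = entry.start_address
            self.counters[addr] = self.counters.get(addr, 0) + 1
        hot: List[BranchHistoryEntry] = []
        for entry in self.trace_buffer:
            key = (entry.start_address, entry.branch_history)
            if self.counters[entry.start_address] > self.threshold \
                    and key not in self.code_cache:
                self.code_cache[key] = translate(entry)
                hot.append(entry)
        self.trace_buffer.clear()  # assumption: buffer is drained each signal
        return hot
```

Note that counters are touched only when the selection signal arrives, not on every executed branch. That deferral is what keeps the per-execution profiling cost low in this sketch.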
An embodiment of a dynamic translator for executing a non-native program, the translator, according to the present invention, includes an interpreter configured to receive non-native code words from a non-native code image of the non-native program and interpret the non-native code words by executing native code words. The interpreter is also configured to generate branch history data including a start address and a branch history value for each of a series of traces during execution of the non-native program. The interpreter includes a control section conf
