Dynamic code motion optimization and path tracing

Data processing: software development – installation – and managem – Software program development tool – Translation of code

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C717S135000, C717S159000, C717S161000, C712S214000, C712S219000, C712S233000

Reexamination Certificate

active

06487715

ABSTRACT:

BACKGROUND OF THE INVENTION
This invention relates generally to computer software. More specifically, it relates to a method of scheduling instructions for efficient execution on a particular processor.
Generally, computer programmers write computer code in high-level programming languages to simplify various programming tasks. Compilers translate the high-level programs into a sequence of machine-readable instructions. The machine-readable instructions are collectively known as an instruction trace. The instruction trace is typically directed toward a particular processor. In the past, compilers generated the instructions for the instruction trace in the same order that the programmer specified them in the high-level program.
To improve the speed and efficiency of the processors, some modern processors have multiple pipelined execution units. Each pipelined execution unit has one or more stages, each stage performing a specific function that can be completed in a single clock cycle. The pipelined execution unit receives instructions at a first stage (i.e., stage one) and the instruction passes from stage one through each stage of the pipeline. At the end of the pipeline, execution of the instruction is complete. By this method, the efficiency of the processor is increased, because an instruction can be fed into the pipelined execution unit on each cycle, rather than waiting until the previous instruction is complete.
Pipelining is most efficient when the pipeline is kept full. If execution of an instruction is not begun on a particular clock cycle, the execution unit stalls. When an execution unit stalls, the efficiency of the processor goes down, since the pipelined execution unit has resources that are available, but not being used.
Execution unit stalls sometimes occur because of data dependancies. That is, an instruction may be dependent on the results of an instruction that has not yet completed. Modern compilers attempt to reduce execution unit stalls by executing instructions out of sequence. That is, instructions that are ready to be executed are placed in front of instructions that are not yet ready.
Another way that processor performance is increased is by speculative execution. Sometimes, the order of execution is not known until runtime. For example, many branch instructions are dependent on the results of previous calculations. The hardware makes predictions on how the branch instruction will be resolved and executes instructions speculatively based on the prediction. If the prediction was correct, the processor is ahead of where it would have been had it waited for the branch to be resolved. If it is not correct, then the system reverts back to where it would have been without the speculative execution.
Code Motion (also referred to as trace rescheduling) is one method used in optimizing programs for execution. A compiler reorders the instructions to decrease execution unit stalls. However, a limitation of currently available systems in executing instructions out-of-order is the compiler has limited knowledge of the effect of moving instructions ahead in the sequence. Sometimes, executing instruction speculatively is counterproductive since they cause additional overhead. For example, if instructions are moved ahead of a branch instruction and executed, and the prediction turns out to be wrong, the result is that unnecessary work was done and must be undone.
As users put more and more demands on gaining the most efficient use of their processors, it is important to find ways of compiling software for efficient execution by avoiding pipeline stalls. Consequently, there is a need for new and better ways of compiling instructions to a processor to allow for efficient operation.
SUMMARY OF THE INVENTION
The present invention provides a method of improving compiler use of code motion. The method uses a superscaler processor simulator to reorder instructions according to criteria established by the user. It generates statistics showing the effectiveness of particular reordering criteria. A user or compiler may use the statistics to determine the best reordering technique for a particular processor and software.
The method simulates a processor running a program and determines which instructions cause the processor to stall due to unavailability of resources or operands. It moves up (“hoists”) execution of other instructions that are not stalled, so that they may begin execution during the processor stall. Barrier instructions are determined above which the instructions are not hoisted. Barrier instructions include branch instructions, store instruction (if load past store is disallowed), and instructions which will cause the number of registers needed to exceed a predetermined number. By not hoisting instructions above the barrier instructions, the method finds an efficient ordering of the instructions.
To easily correlate the reordered instruction trace to the source code, paths are identified in a unique and easily identifiable way. The paths are ranked according to different criteria such as the number of hoisted instructions or the number of path encounters. This produces useful examples of how paths can and should be optimized by a compilers code generator.


REFERENCES:
patent: 5119498 (1992-06-01), King
patent: 5133072 (1992-07-01), Buzbee
patent: 5450588 (1995-09-01), Hoxey
patent: 5712791 (1998-01-01), Lauterbach
patent: 5764942 (1998-06-01), Kahle et al.
patent: 5857097 (1999-01-01), Henzinger et al.
patent: 5884061 (1999-03-01), Hesson et al.
patent: 5933622 (1999-08-01), Buzbee et al.
patent: 5999736 (1999-12-01), Gupta et al.
patent: 6026240 (2000-02-01), Subramanian
patent: 6044221 (2000-03-01), Gupta et al.
patent: 6247115 (2001-06-01), Janik et al.
patent: 6263489 (2001-07-01), Olsen et al.
patent: 0 442 623 (1991-08-01), None
patent: 0 793 172 (1997-09-01), None
patent: 0 810 523 (1997-12-01), None
Luk & Mowry, “Cooperative Prefetching: Compiler and Hardware Support for Effective Instruction Prefetching in Modern Processors,” Proceedings of the 31st annual ACM/IEEE Int'l Symbosium on Microarchitecture, Dallas, Texas USA, 1998, pp. 182-194.*
Rajiv Gupta, “Code Optimization as a Side Effect of Instruction Scheduling,” IEEE 1997, pp. 370-377.*
Kennedy & Roth, “Context Optimization for SIMD Execution,” IEEE Aug. 1994, pp. 445-453.*
T. Ball et al., “Efficient Path Profiling,”Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture—MICRO-29, Dec. 2-4, 1996, Paris France, pp. 46-57 (Dec. 1996).
G. Ammons et al., “Exploiting Hardware Performance Counters with Flow and Context Sensitive Profiling,”Proceedings of the 1997 ACM SIGPLAN, Conference on Programming, Language Design and Implementation(PLDI), vol. 32, No. 5, pp. 85-96 (Jun. 1997).

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Dynamic code motion optimization and path tracing does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Dynamic code motion optimization and path tracing, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Dynamic code motion optimization and path tracing will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2980016

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.