Program optimization method, and compiler using the same

Data processing: software development – installation – and managem – Software program development tool – Translation of code

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C717S127000, C717S128000, C717S130000, C717S136000, C717S139000, C717S140000, C712S233000, C712S234000, C712S235000, C712S239000, C718S106000

Reexamination Certificate

active

06817013

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a computer program optimization method, and in particular, to a method for editing a program optimization area in order to efficiently perform optimization by rewriting a conditional branching portion of a program to obtain a portion for parallel execution.
2. Description of the Related Art
Generally, in the process during which the source code of a program written in a programming language is compiled, optimization of the program is performed in order to improve the speed at which the program can be executed by a computer.
A variety of optimization methods are in use today. For example, processors, such as CPUs produced by Intel and Hewlett Packard, which are compatible with IA-64 architecture and capable of performing parallel processing while using VLIWs (Very Long Instruction Words), can execute predicated instructions and perform parallel processing at the instruction level. These processors can delete a branching instruction and execute, in parallel, a branching instruction group in a complementary predicated manner (e.g., a conversion of an instruction group hereinafter referred to as an IF conversion). As a result of an IF conversion, the number of instructions can be reduced and a branching prediction failure can be avoided and thus program execution efficiency can be increased.
An IF conversion may also deteriorate execution efficiency, depending on how the instructions for the conversion are selected. The causes of a deterioration in efficiency may be an excessive parallel level extending beyond the hardware limit, a rise in the register pressure, an inferior balance of the critical path length of the branching instruction, or the insertion of a rarely executed instruction.
In order to determine the program execution efficiency level while taking these factors into account, a code schedule must be executed for each branching instruction, both for the case where an IF conversion is to be performed and for the case where an IF conversion is not performed, and the number of actual instruction cycles estimated for each of the two cases and compared.
However, when the number of instruction cycles is estimated and compared for all the branching instructions in a program, the number of combinations becomes enormous, and termination of the estimation process within a realistic calculation time is not possible. Thus, an area for the execution of an optimization (e.g., hereinafter referred to as a hyperblock and shown using a quadrilateral) must be appropriately selected.
Therefore, conventionally, two methods are employed in order to determine within a realistic time whether an IF conversion should be performed. In a first method, an IF conversion is performed, as needed, only for branching instructions along an execution path (e.g., a main trace) for which it is predicted that the execution probability is the highest. In a second method, an IF conversion is performed, as needed, for temporarily optimizing all the branching instructions, and for performing, as needed, a conversion in the direction opposite to that of an IF conversion (e.g., a reverse IF-conversion) during the list scheduling and the reproduction of the branching instructions,
The conventional technique for performing an IF conversion, as needed, only for branching instructions in the main trace (e.g., the first method) is disclosed in “Effective Compiler Support For Predicated Execution Using The Hyperblock” (S. A. Mahlke, R. E. Hank, R. A. Bringmann, in Proceedings of the 25th International Symposium on Microarchitecture, pp.45-54, December 1992).
The conventional technique of the first method provides a solution using a discovery method to determine for which area should parallel execution and an IF conversion be performed, in order. According to the article, first, the main trace is specified, and an IF conversion is unconditionally performed for this path. Then, a check is performed to determine whether each path other than the main trace (e.g., a sub trace) should be included in the same parallel execution area, and the area for which an IF conversion is performed is gradually increased.
Whether an IF conversion should be performed for a specific branching instruction is determined considering four conditions including whether in the sub trace there is an instruction that will disturb a pipe line, the probability that relative to the main trace a sub trace will be executed, the ratio of the number of instructions in the machine language of the sub trace to the number in the main trace, and the limits of the parallel processing hardware capabilities.
According to this method, when the number of branching instructions in the main trace is denoted by n, only the calculation amount proportional to n is required to terminate the determination performed to ascertain whether an IF conversion should be performed.
The conventional technique for temporarily optimizing all the branching instructions by performing an IF conversion and for performing a reverse IF-conversion (e.g., the second method), as needed, during the list scheduling and the reproduction of the branching instructions is disclosed in “A Framework for Balancing Control Flow And Prediction” (D. I. August, W. W. Hwu, S. A. Mahlke, in Proceedings of the 30th International Symposium On Microarchitecture, December 1997).
According to this article, first, the overall program is defined as a single parallel execution area, and an IF conversion is performed for all the branching instructions. Then, various optimization processes are performed for the obtained program, and a reverse IF-conversion is selectively performed. As a result, a state is provided where an IF conversion has been selectively performed for the branching instructions.
According to the second method, in coordination with the code scheduler, the number of execution cycles for each branching instruction is obtained both for a case where a reverse IF-conversion has been performed and a case where a reverse IF-conversion has not been performed. Then, the execution performances are compared to determine whether a reverse IF-conversion should be performed.
However, when this method is employed for all the branching instructions in a function, as well as during a determination performed to ascertain whether an IF conversion should be performed, the number of target instruction combinations for a reverse IF-conversion becomes enormous. Thus, a reverse IF-conversion is performed only when the critical path is to be scheduled in coordination with the list scheduler, so that an increase in the number of calculations is suppressed.
In the above-mentioned article by August et al., a method is proposed where scheduling is performed approximately 2n times, where n denotes the number of branching instructions. The critical path is the longest instruction string, in a specific range within a program, of the series of instruction strings that can not be executed in parallel.
However, while the first method, for performing an IF conversion as needed only for branching instructions in the main trace, can improve the execution efficiency for the main trace, it does not especially affect execution efficiency relative to the sub trace. Therefore, overall execution efficiency for a program cannot always be improved.
Further, where there is no particular path whose execution probability is high and for which a main trace can be strictly specified, it is difficult to determine a path along which an IF conversion should be performed. Additionally, even when an IF conversion is performed for a path based on a specific reference, the execution probabilities for other paths are also high. Thus, the program execution efficiency can not satisfactorily be increased.
Furthermore, according to the second method for temporarily optimizing all the branching instructions through an IF conversion, and for performing reverse IF-conversions as needed during the list scheduling and the reproduction of the branching instructions, the target p

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Program optimization method, and compiler using the same does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Program optimization method, and compiler using the same, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Program optimization method, and compiler using the same will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3324699

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.