Data processing: software development – installation – and managem – Software program development tool – Translation of code
Reexamination Certificate
2000-12-07
2004-06-22
Chaki, Kakali (Department: 2124)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
Reexamination Certificate
active
06754893
ABSTRACT:
BACKGROUND OF INVENTION
1. Field of the Invention
This invention relates to the field of Optimizing Compilers for computer systems; specifically, it relates to a method for collapsing the prolog and epilog of software pipelined loops.
2. Description of the Related Art
Software pipelining is key to achieving good performance software architectures that support Instruction Level Parallelism (ILP architectures), which are generally architectures that are capable of issuing parallel instructions. Instructions may be considered to be “parallel operations” if their operations overlap in time, such as when an instruction has delay slots.
Software pipelining improves loop performance by exposing and exploiting instruction-level parallelism between consecutive loop iterations. For example, a loop body may consist of three instructions (ins1, ins2, and ins3), a decrement, and a conditional branch back to the beginning. In the absence of software pipelining and assuming dependence constraints are met, a possible “schedule” for this code on a VLIW processor might be as follows:
loop:
ins1
ins2
||
dec n
; n = n−1
ins3
||
[n] br loop
; branch to loop iff n>0
(Note: The || operator denotes instructions that execute in parallel).
In this schedule, very little parallelism has been exploited because instructions “ins1,” “ins2,” and “ins3” must execute in order within a given loop iteration.
Software pipelining overlaps multiple consecutive iterations of the loop to improve throughput, and therefore performance. For instance, assuming that all dependence constraints are met, a possible pipelined version of the loop in the example above might look as follows:
loop:
sub n, 2, n
; exec. kernel n-2 times
ins1
; prolog stage 1
ins2 || ins1 || dec n
; prolog stage 2
;---------------------------------------------
kernel:
ins3 || ins2 || ins1 || [n] dec n || [n] br kernel
;---------------------------------------------
|| ins3
|| ins2
; epilog stage 1
|| ins3
; epilog stage 2
In the pipelined code above, the 3 cycle loop becomes a 1 cycle loop by parallelizing 3 consecutive iterations of the loop. The kernel of the loop acts as a pipeline, processing one “stage” of each of the iterations in parallel. The pipeline is primed and drained through the “prolog” and “epilog” code that surrounds the kernel.
In general, all prolog, epilog, and kernel stages consist of II cycles, where II is the “initiation interval.” In the example above, II=1. In other cases, II might be greater than 1.
In some cases, each stage may consist of multiple cycles. For instance, a kernel may be more than one cycle in length. For example, this may be due to hardware restrictions, such as the need to perform three multiplication operations when there are only two multipliers available. To accomplish this, two multiplications would be performed in parallel in one cycle of the kernel, and the third multiplication would be performed in the other cycle.
The kernel size may also be increased because of loop carried data dependencies in the loop being software pipelined. Future loop iterations cannot start until the current iteration completes the computation of a result required by the future iteration.
SUMMARY OF THE INVENTION
Therefore, a need has arisen for a system and method for collapsing the prolog and the epilog of software pipelined loops.
In accordance with one embodiment of the present invention, software-based techniques which reduce code expansion by rolling some or all of the prolog and/or epilog back into kernel are applied. This may be accomplished via “prolog collapsing” and “epilog collapsing.”
According to one embodiment of the present invention, a method for reducing the code size of a software pipelined loop having a prolog, a kernel, and an epilog is disclosed. This method involves the collapsing of an epilog and/or a prolog. Stages are processed inside-out—that is, starting with the stage closest to the kernel, and working out from the kernel. A stage can be collapsed (i.e., rolled into the kernel) if instructions that are present in either a previous stage, or in the kernel, can be either speculated or predicated. If a stage is encountered that cannot be completely collapsed, the process is complete.
According to another embodiment of the present invention, a method for reducing a code size of a software pipelined loop having a kernel, an epilog, and optionally, a prolog includes the following steps. The stages of the epilog may be evaluated inside-out. Instructions that are present in a reference stage, which may be the kernel or a previously evaluated stage of the epilog, but not in the selected stage, are evaluated. If the identified instructions can be speculated, they are noted as capable of being speculated. If the instructions are not capable of being speculated, it is determined if the instructions can be predicated. If the instructions can be predicated, they are marked as capable of being predicated. If instructions cannot be speculated or predicated, the stage cannot be collapsed. The method is repeated for all stages of the epilog until the epilog cannot be collapsed.
According to another embodiment of the present invention, a method for reducing a code size of a software pipelined loop having a prolog, a kernel, and, optionally, an epilog ,includes the following steps. The stages of the prolog may be evaluated inside-out. Instructions that are present in a reference stage, which may be the kernel or a previously evaluated stage of the prolog, but not in the selected stage, are evaluated. If the identified instructions can be speculated, they are noted as capable of being speculated. If the instructions are not capable of being speculated, it is determined if the instructions can be predicated. If the instructions can be predicated, they are marked as capable of being predicated. If instructions cannot be speculated or predicated, the stage cannot be collapsed. The method is repeated for all stages of the prolog until the prolog cannot be collapsed.
According to another embodiment of the present invention, a method for reducing a code size of a software pipelined loop having a prolog and a kernel, and, optionally, an epilog, the kernel having a plurality of cycles, includes the following steps. Stages may be processed from the inside-out. The innermost unprocessed cycle of a candidate stage is identified, and instructions that are present in a reference stage, which may be the kernel or a previously evaluated stage of the prolog, but not in the identified stage, are evaluated. If the identified instructions can be speculated, they are noted as capable of being speculated. If the instructions are not capable of being speculated, it is determined if the instructions can be predicated. If the instructions can be predicated, they are marked as capable of being predicated. If instructions cannot be speculated or predicated, the stage cannot be completely collapsed. The method is repeated for all cycles of all stages of the prolog until the prolog cannot be completely collapsed.
Consider the cycle on which the process got stuck. This becomes the current cycle. If this is not the innermost cycle of the current stage, it is determined whether a branch can be inserted. If so, the candidate stage is partially collapsed. If not, the next innermost cycle becomes the current cycle. The process is repeated until a branch can be inserted and a stage can be partially collapsed or an innermost cycle of a current stage is encountered.
A technical advantage of the present invention is that a method for collapsing the prolog and epilog of software pipelined loops is disclosed. Another technical advantage of the present invention is that code size is reduced. Another technical advantage of the present invention is that the method of the present invention makes code more efficient.
REFERENCES:
patent
Granston Elana D.
Stotzer Eric J.
Ward Alan S.
Zbiciak Joseph
Chaki Kakali
Vu Tuan A
LandOfFree
Method for collapsing the prolog and epilog of software... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method for collapsing the prolog and epilog of software..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for collapsing the prolog and epilog of software... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3359935