Compare speculation in software-pipelined loops

Data processing: software development – installation – and managem – Software program development tool – Translation of code

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C717S153000, C717S154000, C717S156000, C717S159000, C717S161000

Reexamination Certificate

active

06615403

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates to mechanisms for optimizing computer code, and in particular, to mechanisms for improving the performance of software-pipelined loops.
2. Background Art
Software pipelining is a method for scheduling non-dependent instructions from different logical iterations of a program loop to execute concurrently. Overlapping instructions from different logical iterations of the loop increases the amount of instruction level parallelism (ILP) in the program code. Code having high levels of ILP uses the execution resources available on modern, superscalar processors more effectively.
A loop is software-pipelined by organizing the instructions of the loop body into stages of one or more instructions each. These stages form a software-pipeline having a pipeline depth equal to the number of stages (the “stage count” or “SC”) of the loop body. The instructions for a given loop iteration enter the software-pipeline stage by stage, on successive initiation intervals (II), and new loop iterations begin on successive initiation intervals until all iterations of the loop have been started. Each loop iteration is thus processed in stages through the software-pipeline in much the same way that an instruction is processed in stages through a processor pipeline. When the software-pipeline is full, stages from SC sequential loop iterations are in process concurrently, and one loop iteration completes every initiation interval. Various methods for implementing software-pipelined loops are discussed, for example, in B. R. Rau, M. S. Schlansker, P. P. Tirumalai,
Code Geiteration Schema for Modulo Scheduled Loops
IEEE MICRO Conference 1992 (Portland, Oreg.) and in, B. R. Rau, M. Lee, P. P. Tirumalai, M. S. Schlansker,
Register Allocation for Software
-
pipelined Loops
, Proceedings of the SIGPLAN '92 Conference on Programming Language Design and Implementation, (San Francisco, 1992).
The initiation interval (II) represents the number of processor clock cycles (“cycles”) between the start of successive iterations in a software-pipelined loop. The minimum II for a loop is the larger of a resource II (RSII) and a recurrence II (RCII) for the loop. The RSII is determined by the availability of execution units for the different instructions of the loop. For example, a loop that includes three integer instructions has a RSII of at least two cycles on a processor that provides only two integer execution units. The RCII reflects cross-iteration or loop-carried dependencies among the instructions of the loop and their execution latencies. If the three integer instructions of the above-example have one cycle latencies and depend on each other as follows, inst
1
→inst
2
→inst
3
→inst
1
, the RCII is at least three cycles.
RSII and RCII are illustrated for the following code segment, which includes instructions from the IA64™ instruction set architecture (ISA) of Intel® Corporation of Santa Clara, Calif.:
blk6:
(1)
addi
V7
=1, V7
(2)
addi
V8
=1, V8
(3)
addi
V9
=1, V9
(4)
ld1
V13
=[V9]
(I)
(5)
cmp4.eq
p0, V19
=V13, r0
(CMP)
(6)
(V19)
ld1
V14
=[V7]
(7)
(V19)
sxt1
V10
=V14
(8)
(V19)
sxt1
V11
=V13
(9)
(V19)
cmp4.eq
V17, p0
=V10, V11
(CCMP)
(10)
(V17)
br
blk6
Here, (V
19
) and (V
17
) operate as predicates to gate the instructions that follow on and off.
Code segment (I) has an RSII of 3 cycles and an RCII of 5 cycles on an Itanium™ processor of Intel® Corporation. The RCII is determined by the chain of dependence edges (
9
)→(
5
)→(
6
)→(
7
)→(
9
), assuming a 2 cycle latency for instruction (
6
) (ld
1
) and a one cycle latency for the remaining instructions. The RSII is determined by the execution resources provided by the Itanium™ processor.
A software-pipelined loop has its maximum ILP when its RCII is less than or equal to its RSII. This is difficult to achieve for loops that include control flow operations within the loop body. Control flow operations are often implemented through predicates that are evaluated by compare instructions (CMPs), and available compilers do not allow these CMPs to be speculated. An instruction is speculated when it is executed before the processor determines that the instruction needs to be executed. In software-pipelined loops, instructions from multiple loop iterations execute in parallel, and instructions from later iterations may be executed unnecessarily if the loop terminates at an earlier iteration. Speculating a CMP within a software-pipelined loop entails significant overhead to ensure that any non-speculative operations gated by a speculated CMP are canceled if the iteration containing the speculated CMP is not reached.
In code segment (I), for example, instruction (
5
) (CMP) determines a predicate value, V
19
, which activates/deactivates instructions (
6
) through (
9
), and instruction (
9
) (CCMP) determines whether the loop repeats or terminates. A conventional compiler includes loop-carried dependence edge (
9
)→(
5
) in the data dependence graph (DDG) for code segment (I). The loop-carried edge ensures that when code segment (I) is modulo-scheduled, CMP for the n
th
loop iteration does not execute until CCMP for the (n−1)
st
iteration determines that the n
th
iteration is reached. This strategy simplifies bookkeeping for software-pipelined loops, but it may also lead to unnecessarily large RCIIs for the loops, which can reduce performance.
The present invention addresses these and other problems associated with software-pipelined loops.


REFERENCES:
patent: 5450585 (1995-09-01), Johnson
patent: 5491823 (1996-02-01), Ruttenberg
patent: 6202204 (2001-03-01), Wu et al.
Lam, Monica, “Software Pipelining: An Effective Scheduling Technique for VLIW Machines,” Jun. 1988, ACM Press, ACM SIGPLAN '88.*
Rogers, Anne et al., “Software Support for Speculative Loads,” Oct. 1992, ACM Press, ASPLOS V, p. 42.*
Allan, Vicki H. et al., “Software Pipelining,” Sep. 1995, ACM Press, ACM Computing Surveys vol. 27, No. 3, pp. 383-384.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Compare speculation in software-pipelined loops does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Compare speculation in software-pipelined loops, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Compare speculation in software-pipelined loops will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3111387

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.