Method of constructing and unrolling speculatively counted...

Data processing: software development – installation – and managem – Software program development tool – Translation of code

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C717S151000, C717S160000, C712S241000, C712S233000, C712S239000

Reexamination Certificate

active

06539541

ABSTRACT:

FIELD OF THE INVENTION
This invention relates to the field of computer software optimization. More particularly, the present invention relates to a method of constructing and unrolling speculatively counted loops. BACKGROUND OF THE INVENTION
Computer programs are generally created as source code using high-level languages such as C, C++, Java, FORTRAN, or PASCAL. However, computers are not able to directly understand such languages and the computer programs have to be translated or compiled into a machine language that a computer can understand. The step of translating or compiling source code into object code process is performed by a compiler. Optimizations are mechanisms that provide the compiler with equivalent ways of generating code. Even though optimizations are not necessary in order. for a compiler to generate code correctly, object code may be optimized to execute faster than code generated by straight forward compiling algorithm if code improving transformations are used during code compilation. Loop unrolling is one such optimization that can be used in a compiler.
For the purpose of loop unrolling, loops are categorized as follows. A loop is counted if the number of iterations that the loop will execute is determined once execution reaches the loop. Counted loops are also referred to as “for” loops. Conversely, a loop has data dependent exit, loosely called a “while” loop, if the number of iterations is determined during the execution of the loop. Counted loops are further classified. If the compiler can determine the number of iterations that the loop will execute at compile time, then the number of iterations is a compile time constant. Otherwise, the number of loop iterations is variable.
A compiler can optimize counted loops better than “while” loops. Applying the counted loop optimization to a “while” loop will cause the compiler to generate incorrect code. Therefore, in order to ensure the generation of correct code, the compiler's default assumption must be that all loops are “while” loops. Then the compiler may later try to prove that a loop is a counted loop so that more optimizations become possible. Similarly, the compiler can optimize a compile time constant counted loop to execute more efficiently than a variable counted loop. Furthermore, applying compile time constant loop optimizations to a variable counted loop will generate incorrect code. The compiler's default assumption has to be that all loops are variable, and only if the compiler succeeds in proving that a counted loop is a compile time constant counted loop, can the compiler proceed to apply further optimizations.
For example, here are two possible optimizations that a compiler can apply only to compile time constant counted loops. In one possible optimization, the compiler may unroll the loop entirely. Typically, compilers will unroll a loop entirely if the trip count is determined to be small, e.g. eight or less. A second optimization that a compiler may apply to loops with large compile time constant trip counts is to chose an unrolling factor that divide the trip count evenly. If the loop is variable or if such an optimal factor can not be found (e.g. if the trip count is a large prime number), then a cleanup loop must be generated after the unrolled loop to execute the remainder of the iterations.
Compilers can also optimize counted loop to execute more efficiently than “while” loops. In the context of loop unrolling, when the compiler unrolls a “while” loop, the compiler has to simply copy the whole loop as many times as given by the unrolling factor chosen. This copy step includes the loop overhead. To illustrate, consider the following scheme:
LOOP:
BODY(I)
I=SOME_NEW_VALUE(I)
If (CONDITION(I))
GOTO LOOP
ELSE
GOTO LOOP_END
LOOP_END:
BODY(I) is the useful part of the loop that does the real work in an iterative way. CONDITION is some test statement involving a variable “I” that changes in at least some of the loop iterations and that determines whether the loop terminates or continues execution. Unrolling this “while” loop by an unrolling factor of three yields the following construct:
LOOP:
BODY(I)
I=SOME_NEW_VALUE(I)
If (NOT CONDITION(I))
GOTO LOOP_EXIT
BODY(I)
I=SOME_NEW_VALUE(I)
If (NOT CONDITION(I))
GOTO LOOP_EXIT
BODY(I)
I=SOME_NEW_VALUE(I)
IF (CONDITION(I))
GOTO LOOP
ELSE
GOTO LOOP_EXIT
LOOP_EXIT:
When a compiler unrolls a counted loop, the compiler can save the loop overhead. The compiler can generate loop overhead code only once in each new iteration that corresponds to several original iterations. Consider the following counted loop construct:
I=0;
N=some_unknown_value;
LOOP:
BODY(I)
I=I+1
If (I<N)
GOTO LOOP
ELSE
GOTO LOOP_EXIT
LOOP_EXIT
Assume that the compiler decided to unroll this loop by an unrolling factor of three. The compiler has to generate code that will verify, at execution time, that the loop is about to execute at least three iterations. Also, the upper bound in the unrolled loop must now be reduced to N−2, and a cleanup loop must be generated to execute the remainder of the iterations. The resulting code will look like:
I=0
N=some_unknown_value
If (N<3) GOTO IN_BETWEEN
LOOP:
BODY(I)
BODY(I+1)
BODY(I+2)
If (I<N−2) GOTO LOOP
ELSE GOTO IN_BETWEEN
IN_BETWEEN:
IF (I>=N) GOTO LOOP_EXIT
CLEANUP:
BODY(I)
I=I+1
IF(I<N)
GOTO CLEANUP
ELSE
GOTO LOOP_EXIT
LOOP_EXIT
If the value of ‘N’ is large enough, most of the execution time will be spent in the unrolled loop. The added control around the loop has a negligible effect on performance. Significant performance is gained from not having to execute the loop overhead. Hence the compiler's ability to prove that a given loop is counted is a key in achieving this performance gain.
SUMMARY OF THE INVENTION
A method of constructing and unrolling speculatively counted loops is described. The method of the present invention first locates a memory load instruction within the loop body of a loop. An advance load instruction is inserted into the preheader of the: loop. The memory load instruction is replaced with an advanced load check instruction. The loop body is unrolled. A cleanup block is generated for said loop.
Other features and advantages of the present invention will be apparent from the accompanying drawings and from the detailed description that follow below.


REFERENCES:
patent: 5361354 (1994-11-01), Greyzck
patent: 5526499 (1996-06-01), Bernstein et al.
patent: 5537620 (1996-07-01), Breternitz, Jr.
patent: 5613117 (1997-03-01), Davidson et al.
patent: 5664193 (1997-09-01), Tirumalai
patent: 5724536 (1998-03-01), Abramson et al.
patent: 5778210 (1998-07-01), Henstrom et al.
patent: 5797013 (1998-08-01), Mahadevan et al.
patent: 5802337 (1998-09-01), Fielden
patent: 5809308 (1998-09-01), Tirumalai
patent: 5835776 (1998-11-01), Tirumalai et al.
patent: 5842022 (1998-11-01), Nakahira et al.
patent: 5854933 (1998-12-01), Chang
patent: 5854934 (1998-12-01), Hsu et al.
patent: 5862384 (1999-01-01), Hirai
patent: 6035125 (2000-03-01), Nguyen et al.
patent: 6145076 (2000-11-01), Gabzdyl et al.
patent: 6192515 (2001-02-01), Doshi et al.
patent: 6247173 (2001-06-01), Subrahmanyam
patent: 6263427 (2001-07-01), Cummins et al.
patent: 6263489 (2001-07-01), Olsen et al.
patent: 6269440 (2001-07-01), Fernando et al.
patent: 6289443 (2001-09-01), Scales et al.
patent: 6327704 (2001-12-01), Mattson, Jr. et al.
patent: 6343375 (2002-01-01), Gupta et al.
patent: 6367071 (2002-04-01), Cao et al.
patent: 6401196 (2002-06-01), Lee et al.
TITLE: Improving Instruction Level Parallelism by loop unrolling dynamic memory disambiguation, ACM, Davidson et al, Dec. 1995.*
TITLE: Combining Loop Transformation considering caching and scheduling, ACM, author: Wolf et al, 1996.*
TITLE: Unrolling Loops with indeterminate loop counts in system level pipelines, Guo et al, IEEE, 1998.*
TITLE: Symbolic range propagation, author: Blume et al, IEEE, 1995.*
“Advanced Compiler Design & Implementation” by Steven S. M

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method of constructing and unrolling speculatively counted... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method of constructing and unrolling speculatively counted..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method of constructing and unrolling speculatively counted... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3073109

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.