Data processing: software development – installation – and managem – Software program development tool – Translation of code
Reexamination Certificate
1997-03-20
2003-05-20
Dam, Tuan Q. (Department: 2124)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
C717S154000, C717S161000
Reexamination Certificate
active
06567976
ABSTRACT:
FIELD OF THE INVENTION
The present invention pertains to a method for optimizing source code by a compiler so that the resulting compiled code can be run much faster. More particularly, the present invention relates to a method for unrolling two-deep loops with convex bounds and imperfectly nested code, and for unrolling arbitrarily deep nests with constant bounds and imperfectly nested code.
BACKGROUND OF THE INVENTION
Computers are being used today to perform a wide variety of tasks. Many different areas of business, industry, government, education, entertainment, and most recently, the home, are tapping into the enormous and rapidly growing list of applications developed for today's increasingly powerful computer devices. Computers have also become a key technology for communicating ideas, data, and trends between and among business professionals. These devices have become so useful and ubiquitous, it would be hard to imagine today's society functioning without them.
Computers operate by executing programs, or a series of instructions stored in its memory. These programs, and their series of instructions, are collectively referred to as software. Software is key to utility of computers. Software is what makes the computer devices function and perform useful tasks. Efficient software makes for effective machines, whereas inefficient software makes for difficult to use, less effective, slower machines. Thus, the utility of the computer device often hinges upon the quality of the software written for the device.
Initially, software is crafted by professionals referred to as programmers or software engineers. As programs have become larger and more complex, the task of writing software has become correspondingly more difficult. As a result, programmers typically code in “high level languages” to improve productivity. The use of high level language makes the task of writing extremely long and complex programs more manageable. The completed program, however, must eventually be translated into machine executable language in order to run on a computer. The process of translating the program written in high level language into a program written in machine language is referred to as compiling. The actual translation is performed by a software program referred to as a compiler. Programmers rely upon compilers to translate their programs which are written in high level language (e.g., Basic, Fortran, C++, Pascal, etc.) into a program comprised of machine executable code, known as “machine language.”
Basically, the compiler operates on the program written in high level language. The high level language program is referred to as source code. The compiler translates the source code into machine executable code. Ultimately, it is the machine executable code which will run on the computer. The speed and reliability of the executable code depends upon the sophistication of the compiler, i.e. how it “optimizes”. If the compiler is unsophisticated, the size of the executable code will be larger than necessary. Worse, execution speed and reliability, may also be affected. Hence, it is critical to the speed and efficiency of the program that the compiler thoroughly optimize the executable code during the translation process.
There are several different methods and techniques used to optimize source code for scientific applications, loop interchange, cache tiling or blocking, etc. The present invention pertains to an improved variation for performing an optimization technique commonly referred to as “unroll and jam.” With the prior art technique, perfectly nested loop nests with constant loop bounds in source code can be compiled more efficiently. A “loop” refers to a sequence of steps which is repeated over and over again until a condition is satisfied. Often, source code contains nested loops whereby an inner loop is executed as part of the sequence of steps within an outer loop. Nested loops, under certain circumstances, may be “unrolled and jammed” such that it produces an equivalent result, yet the total number of load or store operations is reduced. Load and store operations are costly because it takes a relatively long period of time to load and store data into the appropriate registers. Consequently, reducing the number of load and store operations increases the speed at which a compiled code can run.
An example is now offered to demonstrate the prior art unroll and jam operation. The following source code shows a two-deep perfectly nested loop:
DO i=1, N1
DO j=1, N2
a(i)=a(i) + b(j)
ENDDO
ENDDO
Each time through the inner “DO” loop, the same a(i) variable is used, but a different b(j) must be loaded. Hence, a separate load operation must be performed each time the statement “a(i)=a(i)+b(j)” is executed. The outer loop can be unrolled as follows:
DO i=1, N1−1, 2
DO j=1 to N2
a(i)=a(i) + b(j)
ENDDO
DO j=1, N2
a(i+1)=a(i+1) + b(j)
ENDDO
ENDDO
IF (i.eq.N1) THEN
DO j=1, N2
a(i)=a(i)+b(j)
ENDDO
ENDIF
By unrolling the “i” loop once, the same result is achieved. The difference is that the “i” loop is now stepping by two. Note that the unrolling process described above does not result in any changes to the statements. All statements are executed in the same order as before: (1,1), (1,2), . . . (1,N), (2,1), (2,2), . . . (2,N). The purpose for unrolling the source code is to prepare it for the jamming process. It should be noted that stepping by two poses a couple of problems. First, it must be ensured that the unrolled code does not overshoot and perform an extra step. Hence, one is subtracted from the upper bounds (N
1
−i). Second, there must be a mechanism for handling the case if N
1
is odd. This is accomplished by implementing a wind-down loop. The wind-down loop executes the last step in case N
1
is odd.
After unrolling, the source code is optimized by applying a “jamming” procedure. Jamming as applied to the above example produces the following code:
DO i=1, N1−1, 2
DO j=1, N2
a(i)=a(i) + b(j)
a(i+1)=a(i+1) + b(j)
ENDDO
ENDDO
IF (i.eq.N1) THEN
DO j=1, N2
a(i)=a(i)+b(j)
ENDDO
ENDIF
By jamming the two “j” loops into a single loop, the computation has been changed. Instead of performing (1,1), (1,2), . . . , (1,N), (2,1), (2,2), . . . (2,N); the new process runs (1,1), (2,1), (1,2), (2,2), (3,1), (3,2), etc. Now, b(1) which was used by both (1,1) and (2,1), no longer has to wait for the N iterations of “j” to happen. It can be loaded for (1,1) and used without having to be reloaded for (2,1). This is due to the fact that (1,1) and (2,1) occur one right after the other. Hence, jamming effectively reduces the total number of loads. In effect, the jamming procedure combines the unrolled loops. As a result, the unrolled and jammed version runs much faster than the original source code.
However, prior art unroll and jam procedures are limited to code having both constant loop bounds and whose loops also happen to be perfectly nested as well. Both of these are very strong restrictions. First non-constant loop boundaries (e.g. in the above example, if the loop bounds for the j loop had depended on i) are frequently found in source code. Second, it is very common for these to be statements nested in the outer loop, but not loops further in.
Thus, there is a need in the prior art for a compiler that can unroll and jam the more general case of source code having variable loop boundaries and non-perfectly nested statements. The present invention provides an elegant solution to these problems in important cases by implementing an “outer loop unrolling” process. With the present inven
Dam Tuan Q.
Silicon Graphics Inc.
Sterne Kessler Goldstein & Fox P.L.L.C.
LandOfFree
Method for unrolling two-deep loops with convex bounds and... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method for unrolling two-deep loops with convex bounds and..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for unrolling two-deep loops with convex bounds and... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3068394