Data processing: software development – installation – and managem – Software program development tool – Translation of code
Reexamination Certificate
1999-11-29
2003-03-25
Morse, Gregory (Department: 2122)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
C717S159000, C717S160000, C717S151000
Reexamination Certificate
active
06539543
ABSTRACT:
BACKGROUND
The present invention relates to the field of computer software engineering. More specifically, it relates to a method and apparatus for compiling source code by exposing parallelism, thereby enabling the generation of more efficient compiled code, allowing more parallel instructions to be created and leading to better utilization of processor resources.
Compiling a computer program involves converting a high level language code (source code) into instructions which are intelligible to the processor of a computer (object code). One of the goals of a compiler is to generate an efficient implementation of the source code. In the case of processors which can execute more than one instruction in parallel, it is an objective of the compiler to compile the code such that as many instructions as possible can be executed in parallel.
It is an object of the invention to provide a method and apparatus for efficiently compiling code by exposing parallelism in that code. The present invention is applicable to any computer architecture having multiple functional units, for example, Very Long Instruction Word (“VLIW”) architectures for implementing real-time implementations of Digital Signal Processing (“DSP”) applications.
It is a further and more specific object of the invention to determine and represent dependencies for each of the memory accesses in a source program. This exposes the execution paths in the program so that the compiler can exploit the parallelism to the maximum extent. In the case of VLIW architectures, the failure to detect parallelism accurately can lead to fewer instructions being packed into a VLIW instruction word and hence the processor performing below its capabilities.
Index expressions present special problems for compilers. Consider the following piece of pseudo-code:
1 void f(
2 double A[
4
]
3 )
4 {
5 double a=A[
0
];
6 double b=A[
1
];
7 double c=a+b;
8 A[
0
]=c;
9
10 double d=A[
2
];
11 double e=A[
3
]
12 double f=d−e;
13 A[
2
]=f;
14 }
The block code starting with “double a=A[0]” can be executed in parallel with the code starting with “double d=A[2]” since they do not access the same memory locations in array A.
Now consider the following example of more complex parallelism:
1 double f(
2 double in
3 )
4 {
5 double A[
16
];
6 A[0]=in;
7 for (int i=0; i<16; i++) {
8 A[i+1]=A[i]*2.0
9 }
10 return A[
15
];
}
The indices for array A are now linear index expressions of the loop iterators (the linear expressions are i, i+1 and loop iterator i). The term “linear” as applied to an index expression means that the index expression is the result of a combination of a constant times the iterator plus a constant. This information can be used to determine if multiple iterations of a loop can be run in parallel—referred to as “array privatization.” In this example, the compiler can tell that the array value is produced in the previous iteration. It therefore cannot postpone the write operation until after the read of the next iteration.
Now consider the following example involving an induction variable (which is behaviorally equivalent to the previous example):
1 double f(
2 double in
3 )
4 {
5 double A[
16
];
6 A[
0
]=in;
7 int idx
1
=1;
8 int idx
2
=0;
9 for (int i=0; i<16; i++) {
10 A[idx
1
]=A[idx
2
]*2.0;
11 idx
1
=idx
1
+1;
12 idx
2
=idx
2
+1;
13 }
14 return A[
15
];
15
In this expression, a variable is initialized before a loop. The value used and updated inside the loop is called an induction variable. In some cases the induction variable can be converted into a linear index expression and the parallelism in the code thus exposed. This is the case here because the code is completely equivalent to that of the previous example.
Known methods of compiler optimization only consider linear index expressions—assuming a worst case of non-parallelism for non-linear expressions. Induction variables are handled by transformation into linear index expressions. Most systems of the prior art use heuristic approaches to obtain memory access information from such expressions. Prior art algorithms formulate the linear index expressions and a specific data flow analysis question as the proof of the existence of a solution of an Integer Linear Programming problem. Another known method for simplifying the Integer Linear Programming problem is Fourier-Motzkin variable elimination, which reduces the number of dimensions of the problem.
One benefit of the present invention is the ability to perform inter-procedural data flow analysis over multiple function calls. In the prior art, this was done by combining the results of intra- and inter-procedural data flow analysis in a post-processing step. In contrast, the present invention performs operations which flatten the hierarchy of function calls in order to determine data dependencies for memory accesses. Here, the flattening operation is performed prior to the analysis stage.
These and other advantages of the invention will be apparent from the following description and claims.
SUMMARY OF THE INVENTION
The present invention relates to a method of optimizing the compilation of a computer program. The method includes the steps of identifying an index path by noting the operations of the program which involve one or more index expressions. A non-hierarchical representation of the index path is created. The memory accesses in the non-hierarchical representation of the index path are examined to identify program steps that could be executed in parallel.
Where the program includes function calls, the step of identifying an index path involves noting operations in the function calls, and the non-hierarchical representation of the index path includes information relating to operations in the function calls.
In a further embodiment of the invention, the method includes the steps of merging the information relating to operations in the function calls back into a hierarchical representation of the index path.
In yet a further embodiment, a data structure of the non-hierarchical representation is created and interrogated with questions relating to memory accesses. The results are stored in or back annotated to a question data structure.
The invention also encompasses an apparatus for compiling computer code including a module which creates a signal flow data structure for index expressions used by the computer code. That module includes a means for identifying steps in the index path of the computer code, identifying parts of memory accessed by index expressions. A module removes the hierarchy from the computer code, creating a non-hierarchical representation of the computer code.
REFERENCES:
patent: 5437034 (1995-07-01), Tanaka et al.
patent: 5491823 (1996-02-01), Ruttenberg
patent: 5537620 (1996-07-01), Breternitz, Jr.
patent: 5842022 (1998-11-01), Nakahira et al.
patent: 6101488 (2000-08-01), Hayashi et al.
patent: 6148439 (2000-11-01), Nishiyama
patent: 6351845 (2002-02-01), Hinker et al.
patent: 6374403 (2002-04-01), Darte et al.
patent: 6393423 (2002-05-01), Goedken
patent: 6438747 (2002-08-01), Schreiber et al.
Gupta. Loop Displacement: An Approach for Transforming and Scheduling Loops for Parallel Execution. IEEE. 1990. pp. 388-397.*
Huang et al. Efficient Run-Time Scheduling For Parallelizing Partially Loop. IEEE. 1997. pp. 397-403.*
Manjikian et al. Fusion of Loops for Parallelism and Locality. IEEE. 1997. pp. 193-209.*
Rajiv Gupta,“Loop Displacement: An Approach for Transforming and Scheduling Loops for Parallel executions” University of Pittsburgh.
Tsung-Choan, Huang et al. “Efficient Run-Time Scheduling for Parallelizing Partially Parallel Loop” National Sun Yat-Sen University 1997.
Naraig Manjik
Du Pont Kurt
Guffens Jan
Adelante Technologies NV
Michael Schwarz & Associates P.C.
Morse Gregory
Zhen Wei
LandOfFree
Method and apparatus for compiling source code by flattening... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and apparatus for compiling source code by flattening..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for compiling source code by flattening... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3084509