Electrical computers and digital processing systems: processing – Dynamic instruction dependency checking – monitoring or...
Reexamination Certificate
2001-01-05
2004-11-02
Coleman, Eric (Department: 2183)
Electrical computers and digital processing systems: processing
Dynamic instruction dependency checking, monitoring or...
Reexamination Certificate
active
06813705
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to memory traffic optimization, and more particularly to a system and method for removing a partially redundant load arising across a sequence of blocks.
BACKGROUND OF THE INVENTION
With the increase in the gap between memory access time and CPU performance comes an increased interest in optimizations that reduce memory traffic. The removal of redundant load instructions is an example of a memory traffic optimization. A load to a register is redundant if it is preceded by a store that writes to the same address and the stored value has not been overwritten by another store prior to the load. As shown in
FIG. 1A
, a load instruction
20
loads a value into a register gr
5
from a memory location whose address is computed as the sum of the value held in a base register gr
2
and a displacement value, i.e.,
16
(gr
2
). The load
20
is redundant because it is preceded by a store
10
that writes the value in register gr
1
to the same address, i.e.,
16
(gr
2
), and the stored value is not overwritten by another store prior to the load
20
.
The redundant load can be removed by replacing it with an appropriate register copy instruction. As shown in
FIG. 1B
, the load
20
is replaced by a register copy
30
, which copies the value of register gr
1
into register gr
5
. This replacement results in a vast improvement in memory cycle time. Whereas a load instruction may take as many twenty to thirty memory cycles to complete, a register copy instruction may take only one or two.
Often a load instruction is redundant in some but not all instances of the load. There exists a set of program paths along which the load is redundant and a set of paths along which the load is not redundant. Such a load is called a “partially redundant” load.
FIG. 2A
shows an example of a load that is redundant only along one path. As shown in
FIG. 2A
, a joint point
25
funnels both the store instruction
10
and an instruction
40
to the load instruction
20
. Although the store instruction
10
and the load instruction
20
are redundant, the instruction
40
and the load instruction
20
may not be. As a result, the load instruction
20
cannot necessarily be replaced with a copy instruction
30
.
Previous approaches for removing partially redundant loads require that the set of paths along which the load is redundant can be statically determined. A number of compile-time strategies have been developed to remove partially redundant instructions by sometime expensive code transformations. The idea is to isolate the execution paths along which the load is redundant via code duplication and/or code motion. Enough copies of the load are generated such that along each path the load is either fully redundant (in which case it can be eliminated) or not redundant at all.
Previous techniques to remove partially redundant loads are based on two important assumptions that limit their effectiveness. First, the techniques are compile-time approaches that operate on some intermediate form. Second, all previous techniques require that the set of instances in which a partially redundant load is truly redundant can be statically determined.
The first assumption implies that all redundant loads that are created during register allocation and code generation (that is, after compile-time optimizations have been applied) will be missed. While it is conceptually possible to apply the compile-time techniques to object code, the code transformations that are required are too substantial to be implemented after code generation since they might end up requiring a complete re-generation of the code.
The second assumption is more severe. Even if only a single path is considered, it may not be possible to determine whether a load is redundant or not, due to the inability to disambiguate memory references statically. Consider the example in FIG.
2
B. Even after isolating the path of instructions
10
,
50
and
20
, it cannot be determined whether the load
20
is redundant due to the intervening store
50
. While path transformations are useful to isolate single paths, they do not help in exposing redundancies of the form of FIG.
2
B. The cause of the problem is that these redundancies involve an intervening memory reference that cannot be statically disambiguated.
SUMMARY OF THE INVENTION
Briefly, a method for optimizing instructions in a program consistent with the present invention identifies a series of instructions in which a first instruction either stores a value in a first register into a first memory location or loads a value from the first memory location into the first register, a second instruction subsequent to the first instruction stores a value in a second register into a second memory location, and a third instruction subsequent to the second instruction loads a value from the first memory location into a third register. A fourth instruction is inserted between the second and third instruction which copies the value in the first register into the third register. It is determined if the first memory location and the second memory location are the same, and the third instruction is conditionally executed depending on a result of the determination.
In another aspect of the invention, the third instruction is executed if the first and second memory locations are the same and is nullified if the first and second memory locations are different. In yet another aspect of the invention, the insertion of the fourth instruction is done during compile-time or during run-time. The determination may be made during run-time.
In another aspect of the invention, a fifth instruction is inserted between the fourth instruction and the third instruction which compares the first memory location and the second memory location, wherein the fifth instruction nullifies the third instruction if the first memory location and the second memory location are different.
In yet another aspect of the invention, a method for optimizing instructions in a program identifies a series of instructions in which a first instruction either stores a value in a first register into a first memory location or loads a value from the first memory location into the first register, a second instruction subsequent to the first instruction stores a value in a second register into a second memory location, and a third instruction subsequent to the second instruction loads a value from the first memory location into a third register. The third instruction is replaced with a fourth instruction copying the value in the first register into the third register if the first memory location and the second memory location are different or if the third instruction is a redundant load.
REFERENCES:
patent: 5537620 (1996-07-01), Breternitz, Jr.
patent: 5542075 (1996-07-01), Ebcioglu et al.
patent: 5854933 (1998-12-01), Chang
patent: 6202204 (2001-03-01), Wu et al.
patent: 6658559 (2003-12-01), Arora et al.
Bob Cmelik, et al.: “Shade: A Fast Instruction-Set Simulator for Execution Profiling”; SIGMETRICS 94-5/94; pp 128-137.
Thomas Ball, et al.; “Branch Prediction for Free”, ACM-SIGPLAN-PLDI-6/93; pp 300-313.
Thomas Ball, et al.; “Efficient Path Profiling”; 1072-4451/96 IEEE; pp 46-57.
Bala Vasanth
Banerjia Sanjeev
Duesterwald Evelyn
Coleman Eric
Hewlett--Packard Development Company, L.P.
LandOfFree
Memory disambiguation scheme for partially redundant load... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Memory disambiguation scheme for partially redundant load..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Memory disambiguation scheme for partially redundant load... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3343056