Method and apparatus for optimizing load memory accesses

Electrical computers and digital processing systems: processing – Dynamic instruction dependency checking – monitoring or... – Scoreboarding – reservation station – or aliasing

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06772317

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to executing load instructions within a processor. More particularly, the present invention relates to a method and apparatus for optimizing load memory accesses using load reuse operations.
BACKGROUND OF THE INVENTION
Most instructions in a computer instruction set operate on several source operands to generate results. The instructions name, either explicitly or through an indirection, the source and destination locations where values are read from or written to. A name may be either a logical, or architectural, register or a location in memory.
Instructions involving register operands are faster than those involving memory operands. For some microprocessor architectures, instructions naming memory operands are translated, or decoded, into microinstructions that transfer operand values from memory to logical registers and then perform the decoded computations. The number of logical registers, however, often is limited, and, as a result, compilers should efficiently utilize logical registers to generate efficient code.
The number of physical registers available in a microprocessor typically exceeds the number of logical registers, so that register renaming may be utilized to increase performance. In particular, for out-of-order processors, register renaming allows instructions to be executed out of their original program order. Thus, for many out-of-order processors, an instruction is renamed so that logical registers named in the original instruction are renamed to physical registers.
Renaming a logical register involves mapping a logical register to a physical register. These mappings are stored in a Register Alias Table (“RAT”). A RAT maintains the latest mapping for each logical register. A RAT is indexed by logical registers, and provides mappings to corresponding physical registers. This activity may be called dependency tracking.
FIG. 1
depicts a register renaming and dependency tracking scheme involving three structures: RAT
110
, active list
102
, and free list
104
. For each logical register specified by a renamed instruction, an unused physical register from free list
104
is allocated. RAT
110
is updated with this new allocation. Physical registers are free to be used again, or reclaimed, once they cannot be referenced by instructions in the current instruction window.
Based upon the data structures depicted in
FIG. 1
, one method for register reclaiming is to reclaim a physical register when the instruction that evicted it from RAT
110
retires. Thus, the instruction that created the new allocation to the physical register is retired. As a result, whenever a new allocation updates RAT
110
, the evicted old allocation is pushed into active list
102
. An active list
102
entry is associated with each instruction in the instruction window. When an instruction retires, the physical register of the old allocation recorded in active list
102
, if any, is reclaimed and pushed into free list
104
. The cycle is depicted in FIG.
1
.
A scheme known as “result reuse” may be used to optimize the above-discussed process. Result reuse transforms the internal representation of the data-flow graph to significantly increase the level of instruction-level parallelism. Prior to renaming, whenever the result of an instruction is recognized to match the result of another instruction, the same physical register is used for both instructions. This scheme redirects all dependencies on both instructions towards the instruction that dynamically executes first. Result reuse relies on value-identity detectors. The detector outcome can be either safe or speculative. An example of a safe detector outcome is one directed to move instructions. Using value-identity detection, a move instruction can be completely eliminated from the execution stream. In such a case, it is safe to reallocate the physical register holding the source value because, by definition, the source and destination values are identical. An example of a speculative detector outcome is one directed to memory bypassing. Load instructions often collide with older store instructions in the instruction window of a processor. In such cases, the result of the load instruction is identical to the result that was stored in memory by the colliding store instruction. Predicting such value-identities for load instructions makes it possible to bypass memory accesses completely.
For any incoming instruction, the value-identity prediction structures may predict the location in the instruction window, or anywhere in the physical register space, of another instruction that produces the same result. In this case, the physical register allocated to this older instruction is retrieved from the instruction window, and reallocated for the incoming instruction.
The value identity predictor includes three parts. The first part establishes a potential identity relation between a pair of instructions. The second and third parts record and retrieve this value identity relation into/from the prediction structures. While general methods and structures exist for implementing the second and third parts, the first part typically is still done by an assortment of ad hoc methods for establishing the value identity.
Load instructions access memory locations to load data to the physical registers. The loaded data is, in turn, used in computations with the logical registers. The load instruction execution time is greater than accessing the physical registers. As a result, the number of load instructions may be significant in programs written for processor architectures. Therefore, it may be useful to provide for the efficient execution of load instructions with optimized register renaming and reclaiming schemes.


REFERENCES:
patent: 4992938 (1991-02-01), Cocke et al.
patent: 5430866 (1995-07-01), Lawrence et al.
patent: 5644746 (1997-07-01), Holt et al.
patent: 5694574 (1997-12-01), Abramson et al.
patent: 5721855 (1998-02-01), Hinton et al.
patent: 5881262 (1999-03-01), Abramson et al.
patent: 6061777 (2000-05-01), Cheong et al.
patent: 6266763 (2001-07-01), Witt et al.
patent: 6393546 (2002-05-01), Witt et al.
patent: 6463523 (2002-10-01), Kessler et al.
patent: 6505293 (2003-01-01), Jourdan et al.
patent: 6665792 (2003-12-01), Merchant et al.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and apparatus for optimizing load memory accesses does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method and apparatus for optimizing load memory accesses, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for optimizing load memory accesses will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3336124

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.