Method and apparatus for delaying the execution of dependent...

Electrical computers and digital processing systems: processing – Dynamic instruction dependency checking – monitoring or...

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C712S202000

Reexamination Certificate

active

06463523

ABSTRACT:

BACKGROUND OF THE INVENTION
Many modern microprocessors allow instructions to execute out-of-order. This improves performance because it allows more instructions to complete in the same amount of time by efficiently distributing instructions among the computing resources of the microprocessor.
Problems may occur, however, when executing load and store instructions out-of-order. When a load instruction issues before an older store instruction referencing the same address, the load may retrieve an incorrect value because the store data the load should use is not yet present at the address. When hardware detects this condition, it performs a squash in which results from the load instruction and subsequent instructions are ignored and the instructions must be re-executed (replayed). Such hardware recovery degrades processor performance.
Memory reference tagging stores have been proposed in which information regarding the squash is used to predict subsequent memory instruction collisions, and consequently to prevent reordering of certain instructions as appropriate. This general method is described in U.S. Pat. No. 5,619,662 (Steely), “Memory Reference Tagging”, dated Apr. 8, 1997 (hereinafter '662 Patent), and incorporated herein by this reference in the entirety.
The '662 Patent describes a technique of detecting dependent out-of-order load and store instructions using a write buffer, which keeps track of executed load and store instructions until it is determined they have executed in their proper order. Four different approaches to handling such an out-of-order detection are described:
In the first approach, part of the referenced (or target) memory address is used as a tag to be associated with each of the out-of-order instructions. If these instructions later appear again in the instruction queue, the fact that they have identical tags will cause them to be issued in order.
The second approach uses an assigned “problem number” as a tag. Two instructions referencing the same memory address and with the same problem number are not re-ordered.
The third approach simply associates a tag bit with an instruction to indicate that other memory reference instructions should not be re-ordered around the tagged instruction.
Finally, the fourth approach turns off reordering for some number of instructions when entering a subroutine.
U.S. Pat. No. 5,615,350 (Hesson), “Apparatus to Dynamically Control the Out-of-Order Execution of Load-Store Instructions in a Processor Capable of Dispatching, Issuing and Executing Multiple Instructions in a Single Processor Cycle,” dated Mar. 25, 1997 describes a “store barrier” cache that is used to predict, upon fetching of a store instruction, whether execution of the store instruction will lead to an out-of-order violation. Upon such a prediction, execution of all load instructions is delayed until the store instruction executes.
Another solution to the problem of reordering dependent load instructions is to remember recently squashed loads, and to force those loads to wait for all prior stores the next time they are fetched.
Each of the above approaches suffers a performance degradation because the load is delayed for many more stores then is necessary. Squashes are expensive in that they consume valuable time re-executing code and therefore slow the net instruction execution in a processor. The techniques described above attempt to predict out-of-order load/store violations to avoid such costly out-of-ordering, but these techniques over-compensate by forcing a load to wait unnecessarily.
SUMMARY OF THE INVENTION
The present invention attempts to detect load/store dependencies early, and avoid even a first squash in many cases.
Accordingly, a method of reducing load/store execution order violations in an out-of-order processor comprises determining whether a source address of a load instruction is the same as a destination address of a store instruction on which execution the load instruction depends. If they are the same, then execution of the load instruction is delayed until execution of the store instruction. This determination can be made, for example, during a fetch cycle, by further determining whether any intervening instructions modify the contents of a register referenced by both the load instruction and the store instruction, where source and destination addresses are determined by offsets to a value contained within the register.
In a preferred embodiment, named or virtual registers are mapped to physical registers, in the mapping stage of an instruction pipeline. Each unmodified instance of a virtual register is mapped to a common physical register. Determining whether a source address of a load instruction is the same as a destination address of a store instruction is reduced to determining whether the physical registers used by the store and load instructions are the same, and whether the offsets used in the instructions are the same. This determination is performed during the mapping stage.
A table is provided, which has entries corresponding at least to instructions on which the load instruction may depend, and preferably corresponding to entries in the instruction queue. In each table entry corresponding to a store instruction, the store instruction's destination address offset and physical register reference are saved. Determining whether physical registers used by the store and load instructions are the same comprises comparing the load instruction's source address offset and physical reference with each of the table entries corresponding to store instructions.
Furthermore, a matrix is provided which has a row corresponding at least to each instruction on which the load instruction may depend, preferably to each entry in the instruction queue. In addition, each entry has an indicator for each instruction having an entry in the table. Thus, for an n-entry instruction queue, the matrix forms a n×n matrix, each row corresponding to an instruction in the instruction queue which may depend upon another instruction, and each column corresponding to an instruction in the instruction queue on which other instructions may depend. Upon determining that a load instruction is dependent upon an unexecuted store instruction, the indicator corresponding to the store instruction in the entry corresponding to the load instruction is “marked,” while an indicator is “unmarked” when the corresponding store instruction issues. Execution of any load instruction is delayed while any indicator in the load instruction's corresponding entry is marked. When a store instruction executes, all indicators in the column corresponding to the store instruction are unmarked.


REFERENCES:
patent: 5615350 (1997-03-01), Hesson et al.
patent: 5619662 (1997-04-01), Steely, Jr. et al.
patent: 5903739 (1999-05-01), Dice
patent: 5948095 (1999-09-01), Arora et al.
patent: 6070238 (2000-05-01), Feiste
patent: 6088774 (2000-07-01), Gillingham
patent: 6108770 (2000-08-01), Chrysos
patent: 6112019 (2000-08-01), Chamdani
patent: 6279100 (2001-08-01), Tremblay

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and apparatus for delaying the execution of dependent... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method and apparatus for delaying the execution of dependent..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for delaying the execution of dependent... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2998164

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.