Electrical computers and digital processing systems: processing – Processing control – Branching
Reexamination Certificate (active)
Patent number: 06425076
Filed: 2000-04-06
Issued: 2002-07-23
Examiner: Treat, William M. (Department: 2183)
U.S. Class: 712/240 (C712S240000)
ABSTRACT:
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
MICROFICHE APPENDIX
A microfiche appendix containing one (1) sheet and thirty-one (31) frames is included as an appendix to this application and is hereby incorporated by reference in its entirety for all purposes. The microfiche appendix is directed to code listings containing an embodiment of the invention.
BACKGROUND OF THE INVENTION
The present invention relates generally to the field of computer-instruction prediction and, in particular, to instruction prediction based on filtering.
Branch prediction, a particular type of instruction prediction, has become critical to the performance of modern pipeline microprocessors. As pipelines grow in length, instruction fetch (performed in one stage of a pipeline) moves farther away from instruction execution (performed in another stage of the pipeline). Conditional branches (also referred to as conditional jumps) are one of the few operations where instruction execution affects instruction fetch. If instruction fetch must wait for execution of a conditional branch before proceeding, considerable performance is lost due to the number of pipeline stages between the two. As a result, conditional branches are typically predicted in an instruction fetch unit as taken or not-taken with a mechanism independent of instruction execution. Based on this prediction, subsequent instructions are speculatively fetched.
However, branch prediction is often wrong. In many cases, therefore, speculative instructions predictively fetched must be “killed” and instructions from the correct path subsequently fetched as replacements. Thus, the misprediction rate of a branch predictor is a critical parameter for performance. (Another important parameter is the cost of a misprediction, which is usually related to the number of pipeline stages between fetch and execution.)
FIG. 1 illustrates the general interface between a conventional branch predictor 102 and a conventional microprocessor or any other computer system in which predictor 102 may reside (referred to herein as a “host processor” 103). Typically, branch predictor 102 resides within a host processor. However, for ease of discussion, FIG. 1 shows predictor 102 coupled to host processor 103. Standard control signals between predictor 102 and processor 103, well known to those having ordinary skill in the art, are omitted for clarity of discussion.
Through the use of a program counter (not shown), host processor 103 supplies a conditional branch-instruction address or portion thereof (i.e., “BranchPC” 104), and the predictor responds with a prediction (also referred to as a “prediction value”) 106 and some state information; i.e., StateOut 108. This state information is associated with a particular BranchPC and includes information necessary to update predictor 102 after an associated conditional branch instruction is executed.
More specifically, upon execution of the associated conditional branch instruction (i.e., when the subject condition becomes known), processor 103 generates an actual outcome value 110 (e.g., a single bit indicating whether the branch is taken or not-taken) and returns this to predictor 102 along with StateIn 108′ through a feedback loop 105. StateIn 108′ is the same information provided as StateOut 108 for the particular BranchPC 104; this information has been maintained within processor 103 until the associated conditional branch instruction has been executed and outcome value 110 is available. Predictor 102 will use StateIn 108′ for updating purposes if necessary. For example, StateIn 108′ and StateOut 108 (i.e., state information) may include an address for a memory (i.e., table) within predictor 102 that is associated with the subject conditional branch instruction, and is used to store the associated outcome value 110 within the memory. An example of a branch predictor disposed within a processor is the MIPS R10000 microprocessor created by Silicon Graphics, Inc., of Mountain View, Calif.
Methods for branch prediction are evolving rapidly because the penalty for misprediction and performance requirements for processors are both increasing. Early branch prediction simply observed that branches usually go one way or the other, and therefore predicted the current direction (i.e., taken or not-taken) of a conditional branch to be the same as its previous direction; so-called “last-direction prediction.” This method requires only one bit of storage per branch.
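A minimal last-direction predictor along these lines might look like the following C sketch; the table size and the simple modulo hash are illustrative assumptions, not details from the patent.

    /* Last-direction predictor: one bit of state per table entry. */
    #include <stdint.h>
    #include <stdbool.h>

    #define LD_TABLE_ENTRIES 4096                    /* table size is an assumption */
    static bool last_direction[LD_TABLE_ENTRIES];    /* 1 = taken, 0 = not-taken */

    static uint32_t ld_index(uint32_t branch_pc) {
        return (branch_pc >> 2) % LD_TABLE_ENTRIES;  /* drop byte offset, fold into the table */
    }

    /* Predict: repeat whatever this (hashed) branch did last time. */
    bool ld_predict(uint32_t branch_pc) {
        return last_direction[ld_index(branch_pc)];
    }

    /* Update: remember the actual outcome for the next encounter. */
    void ld_update(uint32_t branch_pc, bool taken) {
        last_direction[ld_index(branch_pc)] = taken;
    }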
On a sample benchmark (i.e., the 126.gcc program of SPECint95 available from the Standard Performance Evaluation Corporation) simulating a predictor with a 4 KB table (i.e., a memory disposed within the predictor for holding predictions associated with particular conditional branch instructions), such last-direction prediction had a 15.6% misprediction rate per branch.
A simple improvement to last-direction prediction is based on the recognition that branches used to facilitate instruction loops typically operate in a predictable pattern. Such branches are typically taken many times in a row for repeated execution of the loop. Upon reaching the last iteration of the loop, however, such branch is then not-taken only once. When the loop is re-executed, this cycle is repeated. Last-direction prediction mispredicts such branches twice per loop: once at the last iteration when the branch is subsequently not-taken, and again on the first branch of the next loop, when the branch is predicted as not-taken but is in fact taken.
Such double misprediction can be prevented, however, by using two bits to encode the history for each branch. This may be carried out with a state machine that does not change the predicted direction until two branches are consecutively encountered in the other direction. On the sample benchmark, this enhancement lowered the simulated misprediction rate to 12.1%. This predictor is sometimes called “bimodal” in the literature.
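One common realization of such a state machine is a two-bit saturating counter per table entry (a standard approximation of the two-in-a-row rule), as in the illustrative C sketch below. Counter values 0 and 1 predict not-taken, 2 and 3 predict taken; the table size and hashing are again assumptions rather than details of the patent.

    /* Bimodal predictor: one two-bit saturating counter per table entry. */
    #include <stdint.h>
    #include <stdbool.h>

    #define BM_TABLE_ENTRIES 2048                 /* table size is an assumption */
    static uint8_t bm_counters[BM_TABLE_ENTRIES]; /* each entry holds a value 0..3 */

    static uint32_t bm_index(uint32_t branch_pc) {
        return (branch_pc >> 2) % BM_TABLE_ENTRIES;
    }

    /* Values 0 and 1 predict not-taken; 2 and 3 predict taken. */
    bool bm_predict(uint32_t branch_pc) {
        return bm_counters[bm_index(branch_pc)] >= 2;
    }

    /* Move one step toward the observed direction, saturating at 0 and 3, so a
     * strongly biased branch must misbehave twice before the prediction flips. */
    void bm_update(uint32_t branch_pc, bool taken) {
        uint8_t *c = &bm_counters[bm_index(branch_pc)];
        if (taken && *c < 3)
            (*c)++;
        else if (!taken && *c > 0)
            (*c)--;
    }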
Additional improvements to branch prediction include the use of global and/or local “branch history” to pick up correlations between branches. Branch history is typically represented as a finite-length shift register, with one bit for each taken or not-taken outcome shifted into the register each time a branch is executed. Local history uses a shift register per branch and exploits patterns in the same to make predictions. For example, given the pattern 10101010 (in order of execution from left to right) it seems appropriate to predict that the next branch will be taken (represented by a logic one). Global history, on the other hand, uses a single shift register for all branches and is thus a superset of local history.
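The history registers described above amount to simple shift-and-mask updates, as in the following illustrative C sketch; the history width and the number of local registers are assumptions made for the example.

    /* Branch-history shift registers: one bit shifted in per executed branch. */
    #include <stdint.h>
    #include <stdbool.h>

    #define HISTORY_BITS   8                         /* register width is an assumption */
    #define HISTORY_MASK   ((1u << HISTORY_BITS) - 1u)
    #define LOCAL_ENTRIES  1024                      /* number of local registers is an assumption */

    static uint32_t global_history;                  /* one register shared by all branches */
    static uint32_t local_history[LOCAL_ENTRIES];    /* one register per (hashed) branch */

    /* After each branch executes, shift its taken (1) / not-taken (0) outcome
     * into both the shared global register and that branch's local register. */
    void record_outcome(uint32_t branch_pc, bool taken) {
        uint32_t bit = taken ? 1u : 0u;
        global_history = ((global_history << 1) | bit) & HISTORY_MASK;
        uint32_t i = (branch_pc >> 2) % LOCAL_ENTRIES;
        local_history[i] = ((local_history[i] << 1) | bit) & HISTORY_MASK;
    }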
A variety of methods have been suggested for utilizing history in branch prediction. Two representative methods for local and global history are called “PAG” and “GSHARE,” respectively. These methods are further described in one or more of the following: Yeh, et al., “A Comparison of Dynamic Branch Predictors That Use Two Levels of Branch History,” The 20th Annual International Symposium on Computer Architecture, pp. 257-266, IEEE Computer Society Press (May 16-19, 1993); Yeh, et al., “Alternative Implementations of Two-Level Adaptive Branch Prediction,” The 19th Annual International Symposium on Computer Architecture, pp. 124-134, Association for Computing Machinery (May 19-21, 1992); and S. McFarling, “Combining Branch Predictors,” WRL Technical Note TN-36, Digital Equipment Corp. (1993) (“McFarling”), each of which is hereby incorporated by reference in its entirety for all purposes.
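As a rough illustration of the GSHARE scheme described in McFarling, the branch address is XORed with the global history register to index a table of two-bit counters; PAG instead uses a per-branch (local) history register to index a pattern table. The C sketch below is a simplified rendering under assumed sizes, not the patent's own implementation.

    /* GSHARE-style predictor: global history XORed with the branch address. */
    #include <stdint.h>
    #include <stdbool.h>

    #define GS_BITS     12                            /* history width / table size: assumptions */
    #define GS_ENTRIES  (1u << GS_BITS)
    #define GS_MASK     (GS_ENTRIES - 1u)

    static uint8_t  gs_counters[GS_ENTRIES];          /* two-bit counters, values 0..3 */
    static uint32_t gs_history;                       /* global history register */

    /* GSHARE index: XOR of the (shifted) branch address and the global history. */
    static uint32_t gs_index(uint32_t branch_pc) {
        return ((branch_pc >> 2) ^ gs_history) & GS_MASK;
    }

    bool gs_predict(uint32_t branch_pc) {
        return gs_counters[gs_index(branch_pc)] >= 2;
    }

    void gs_update(uint32_t branch_pc, bool taken) {
        uint8_t *c = &gs_counters[gs_index(branch_pc)];
        if (taken && *c < 3)
            (*c)++;
        else if (!taken && *c > 0)
            (*c)--;
        gs_history = ((gs_history << 1) | (taken ? 1u : 0u)) & GS_MASK;
    }

In a pipelined machine the fetch-time index would normally be carried forward as the state information described with respect to FIG. 1; this simplified sketch recomputes it at update time instead.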
On the sample benchmark, PAG and GSHARE lowered the simulated misprediction rate to 10.3% and 8.6%, respectively.
Assignee: MIPS Technologies Inc.
Law Firm: Townsend and Townsend and Crew LLP