Electrical computers and digital processing systems: processing – Processing architecture – Superscalar
Reexamination Certificate
1995-12-21
2001-02-06
Donaghue, Larry D. (Department: 2783)
C712S218000, C712S244000, C712S235000
active
06185668
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to a method and apparatus for the speculative execution of instructions in a computer central processing unit (CPU). More specifically, the present invention provides for the management of exceptions caused by resequenced CPU instructions.
As CPU designs have developed over the years, CPU designers have added more functional units to CPU architectures. For instance, modern superscalar CPUs execute multiple integer, floating point, and memory operations in one cycle. CPU efficiency increases when the program being executed utilizes a higher percentage of these units at any one time. Many modern computing systems thus benefit from their ability to execute more than one instruction at a time. Very long instruction word (VLIW) and superscalar CPUs represent two of the more popular architectures. However, taking full advantage of the computing power they offer can prove difficult. To fully exploit the computing power available, these CPUs force the programmer to either hand-code routines, use routines hand-coded by others, or use advanced program compilers. The first two methods are labor intensive and expensive, and are therefore often impractical. The preferred method uses a compiler written to take advantage of the given CPU's capabilities. While many features and advantages are offered by VLIW and superscalar techniques, one feature employed by both is speculative execution of instructions (or simply “speculative execution”).
Speculative execution is the term used to describe the execution of instructions prior to or during the evaluation of the branch controlling their execution, and is an important technique for enhancing instruction-level parallelism (the execution of more than one instruction at a time, also known as “ILP”) in compiled software. In superscalar and VLIW CPUs it is advantageous to maximize CPU utilization by identifying instructions which may be grouped together and executed simultaneously on the CPU's various execution units. Furthermore, it is advantageous to resequence instructions whose execution depends on the results of a branching instruction to facilitate this grouping. This resequencing is known as “speculative code motion.” “Speculative” refers to the fact that the results of the executed instructions may never be used, and “code motion” to the moving of the instructions to a position before the branching instruction. The resequenced instructions (known as “speculated instructions”) may be sequential with the branching instruction (termed the “fall-through stream”) or may be the branching instruction's target (termed the “target stream”).
Instruction-level parallelism, and with it CPU efficiency, may be increased by using idle execution units to execute these instruction sequences prior to and during the branch's evaluation. At a minimum, the instructions executed are those in the path most likely taken by the branch. This path is determined by a prediction method which selects the instruction stream most likely to be executed. This is known as “partial speculation,” as only one of the possible instruction streams is speculatively executed. More desirable is a CPU with the ability to execute instructions in both the fall-through stream and the target stream (known as “full speculation”). Given the overhead involved in evaluating a branch, speculative execution is gaining in popularity.
Speculative execution thus involves the execution of one or more instructions before the evaluation of the preceding branch has been completed. The CPU executes instructions in advance, using otherwise idle instruction processing units. If the branch is taken in the predicted direction, parallelism is increased by the early execution of the speculated instructions. If the branch is not taken in the predicted direction, the results of the speculated instructions are simply discarded. Compiler control of such speculative code motion is known as “static scheduling” because the execution order is determined by the compiler prior to program execution. This is in contrast to “dynamic scheduling” in which path prediction and execution order are determined by the processor during program execution (e.g., the prediction is made during runtime and the selected instruction stream is speculatively executed).
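The effect of compiler-controlled speculative code motion can be illustrated with a minimal Python sketch. The function names `original` and `speculated` and the temporary `t` are hypothetical and do not come from the patent; the sketch only shows that hoisting an instruction above its controlling branch, and discarding its result when the branch goes the other way, leaves the program's result unchanged.

```python
def original(a, b, cond):
    """Original order: the branch is evaluated first, then the add runs only if taken."""
    if cond:
        return a + b
    return a


def speculated(a, b, cond):
    """Statically scheduled order: the add is hoisted above its controlling branch."""
    t = a + b      # speculated instruction; result held in a temporary (assumed name)
    if cond:
        return t   # branch went the predicted way: the early result is used
    return a       # branch went the other way: the speculated result is discarded


# Both orderings agree on every input tried here.
for a, b in [(1, 2), (0, 0), (-5, 7)]:
    for cond in (True, False):
        assert original(a, b, cond) == speculated(a, b, cond)
```

The addition cannot fault here, which is why this sketch needs no exception handling; a speculated instruction that can cause an exception needs the delayed handling the text goes on to describe.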
Some currently available compilers are capable of scheduling the simultaneous execution of instructions on various execution units within a CPU. When such a compiler is scheduling instructions, the scope of scheduling is limited to basic blocks, that is, blocks of code containing no control flow instructions (branches). As branches are a common feature throughout most software, the basic blocks scheduled by compilers tend to be small. A typical basic block contains about five instructions. Speculative execution addresses this constraint by permitting the compiler to position speculated instructions before their controlling branch and so promote larger basic block sizes, and thus greater ILP and computational efficiency. Using full speculation, this is achieved by speculative code motion from both the fall-through and target streams to a point above the controlling branch. Currently, no commercially available CPUs implement full speculation.
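As a rough illustration of why branches limit the scheduling scope, the following Python sketch splits an instruction list into basic blocks using the definition given above (runs of instructions containing no branch). The opcode names, the `BRANCH_OPS` set, and the function `split_basic_blocks` are assumptions made for this example. A production compiler would also start a new block at every branch target, which this simplified sketch omits.

```python
BRANCH_OPS = {"beq", "bne", "jmp"}  # assumed branch opcodes, for illustration only


def split_basic_blocks(instructions):
    """Split a linear instruction list into branch-free runs (basic blocks)."""
    blocks, current = [], []
    for ins in instructions:
        if ins in BRANCH_OPS:
            # A branch ends the current block; the branch itself is a separator.
            if current:
                blocks.append(current)
            current = []
        else:
            current.append(ins)
    if current:
        blocks.append(current)
    return blocks


program = ["add", "load", "beq", "mul", "store", "sub", "bne", "add"]
blocks = split_basic_blocks(program)
sizes = [len(b) for b in blocks]
```

With frequent branches, each block holds only a handful of instructions, so a scheduler confined to one block has few candidates to pair on idle execution units.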
Speculative execution must not change a program's behavior. To be a viable alternative, an architecture supporting full speculation must properly handle exceptions caused by speculated instructions. If a speculated instruction's execution will cause an exception, the exception must be postponed until the time when that instruction would have originally executed. Of course, if the instruction would not have executed due to the direction taken by the preceding branch, the CPU may ignore the exception. This delayed exception processing is now explained in greater detail.
To support exception handling with speculative execution, an architecture must provide speculative bits associated with the CPU's general purpose registers. Each speculative bit is simply a one-bit field associated with one general purpose register. In order to clearly explain exception handling in CPUs supporting speculative execution, the terms “generating” and “signaling” (of an exception) must be understood. Generation is the detection and logging of an exception condition resulting from an instruction's execution. A generated exception causes an exception signal when it is known that the instruction would have executed in the original (non-speculative) code sequence. Exception signaling causes the CPU to handle an exception condition by invoking exception processing, which may result in abnormal program termination, the invocation of an exception handler, or other special actions being taken.
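The distinction between generating and signaling an exception can be modeled with a small Python sketch. The class `RegisterFile`, its `divide` and `signal` methods, and the `SpeculativeException` type are hypothetical names chosen for illustration; the patent does not define such an interface. The sketch assumes that a faulting speculated instruction logs the exception by setting the speculative bit of its destination register, and that signaling happens later, once the instruction is known to have been needed.

```python
class SpeculativeException(Exception):
    """Raised when a previously generated exception is finally signaled."""


class RegisterFile:
    def __init__(self, names):
        self.value = {n: 0 for n in names}
        self.spec_bit = {n: False for n in names}  # one speculative bit per register

    def divide(self, dest, a, b, speculative):
        """Integer divide; a speculated fault is logged instead of raised."""
        try:
            self.value[dest] = self.value[a] // self.value[b]
            self.spec_bit[dest] = False
        except ZeroDivisionError:
            if not speculative:
                raise                    # no code motion: generation and signaling coincide
            self.spec_bit[dest] = True   # generation only: log the condition

    def signal(self, reg):
        """Signal a logged exception once the instruction is known to be needed."""
        if self.spec_bit[reg]:
            raise SpeculativeException(f"deferred exception on {reg}")


rf = RegisterFile(["r1", "r2", "r3"])
rf.value["r1"] = 8                                # r2 stays 0, so the divide faults
rf.divide("r3", "r1", "r2", speculative=True)     # generated, not yet signaled
```

If the controlling branch later goes the other way, `signal` is never called and the logged condition is simply ignored.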
Exception generation and signaling are simultaneous for instruction streams on which the compiler has performed no speculative code motion. No change occurs in the program's structure. In contrast, speculative code motion may separate the generation of an exception from its signaling. This separation is accomplished through the use of a place-holder instruction (referred to as a “check_exception instruction” or “sentinel”).
When an instruction is speculatively moved above its controlling branch, the compiler determines whether the instruction could cause an exception. If the speculated instruction's results (i.e., registers) are used only by that speculated instruction, a check_exception instruction will be placed in the speculated instruction's old position to signal any exceptions caused by the speculated instruction. If the results (registers) will be used by another speculated instruction, a single check_exception instruction may be used to signal exceptions caused by either speculated instruction.
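The placement of a check_exception instruction can be sketched as a simple compiler transformation in Python. The tuple encoding `(opcode, destination, sources)`, the `.s` suffix marking a speculated instruction, the register operand on `check_exception`, and the function name `hoist` are all assumptions made for this example, not details taken from the patent. The sketch also assumes the hoisted instruction is one that could cause an exception.

```python
def hoist(code, index, branch_index):
    """Move code[index] above the branch at branch_index, leaving a sentinel behind."""
    if not branch_index < index:
        raise ValueError("instruction must originally follow its controlling branch")
    op, dest, srcs = code[index]
    moved = (op + ".s", dest, srcs)            # speculated copy (assumed '.s' marker)
    sentinel = ("check_exception", dest, ())   # signals any exception logged for dest
    out = list(code)
    out[index] = sentinel                      # sentinel takes the old position
    out.insert(branch_index, moved)            # speculated instruction precedes the branch
    return out


code = [
    ("cmp", "p1", ("r1", "r4")),
    ("br", "", ("p1",)),
    ("load", "r2", ("r3",)),
]
scheduled = hoist(code, 2, 1)
```

After the transformation the load runs before the branch, and the sentinel occupies the position where the load would originally have executed.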
This method may be applied recursively, so that only one check_exception instruction is needed to signal an exception by any group of speculated instructions which each use a given result (register). However, the subsequent use of that result (register) by o
Intergraph Corporation
Townsend and Townsend / and Crew LLP
Method and apparatus for speculative execution of instructions