Electrical computers and digital processing systems: processing – Processing architecture – Superscalar
Reexamination Certificate
1999-03-30
2001-08-28
Maung, Zarni (Department: 2758)
Electrical computers and digital processing systems: processing
Processing architecture
Superscalar
C712S024000
Reexamination Certificate
active
06282629
ABSTRACT:
BACKGROUND OF THE INVENTION
This invention relates generally to computer systems and more particularly to processing of instructions in pipelined multiple execution processors.
As it is known in the art, computer systems have become ubiquitous. In particularly, one type of computer system widely employed is the so-called pipelined processor. In a pipelined processor instructions are decomposed into assembly like stages. Illustratively, a pipeline includes an instruction fetch stage in which instructions are fetched in one or several cycles from a cache memory; An instruction decode stage in which an instructions op code, that is the portion of the code which determines the function of the instruction, is examined to ascertain the function of the instruction and thus the resources needed by the instruction. Illustrative resources may include general purpose registers within the CPU access to internal buses as well as external buses and functional units such as I/O units and arithmetic logic units and so forth. A third stage is typically the instruction issue stage in which resource availability is checked for each instruction and resources are reserved for particular instructions. The fourth stage of a typical parallel pipeline is the execution stage in which instructions are executed in one or several execution stages, writing results into the general purpose registers during the last execution stage.
In an ideal pipelined processor, time is measured in CPU clock periods. In theory, the clock period for a P-stage pipeline would be
1
/P the clock period for a non-pipelined equivalent since the non-pipeline equivalent would have P-
1
less stages of execution for the instruction. Thus, with the pipelined approach there is the potential for a P times improvement in throughput or performance over a conventional non-pipelined architecture.
There are several practical limitations on pipeline performance however which prevents a pipelined processor from achieving the P times throughput improvement. One particular limitation on practical performance is instruction dependencies. Instruction dependencies may be viewed as instructions which depend on the results of previous instructions and may therefore have to wait for the previous instructions to complete before they can proceed through the pipeline. Two types of instruction dependencies may be identified.
The first one is the so-called data dependency which occurs when instructions use the same input and or output operands as for example when an instruction uses the result of a proceeding instruction as an input operand. A data dependency may cause an instruction to wait in the pipeline for a preceding instruction to complete. A control dependency on the other hand, occurs when a control decisions such as for example a conditional branch decision must be made before subsequent instructions can be executed.
When an instruction dependency occurs, all instructions following the instruction being executed are blocked from executing and typically the instruction in the pipeline is stalled in a pipeline stage and does not proceed further until the instruction ahead of it proceeds, or at the issue stage until all resources and data dependencies for the particular instruction are satisfied. When a large number of instruction dependencies occur in a program executed by a pipeline processor, the performance improvement implicit in a pipeline processor are reduced accordingly.
One technique known in the art to overcome the occurrence of instruction dependencies is so-called instruction scheduling. An important characteristic of instruction scheduling in pipelined processors is that by using equivalent but reordered code sequences, the pipeline processor can provide an improved performance by eliminating many of the so-called instruction dependencies. This is often done in an instruction scheduler in which the instructions are reordered by some algorithm to eliminate register conflicts that appear to the hardware as dependencies and to reduce the occurrence of data dependencies by executing instructions in a rescheduled order.
With pipelined computer systems in general, there is a delay until actions are taken while the various pipelines within the computer system are filled. Thus, it is desirable to keep the pipeline full so that each stage is continuously processing instructions.
To maintain the pipeline full, the pipeline processor often requires circuits to provide predictions such as which sequence of instructions to follow when processing a branch type of instruction. If the prediction concerning the instructions sequence is correct executions of the instruction follows without incident. However if a wrong path of the instruction sequence is followed after a conditional branch instruction for example then the pipeline must be backed up to the conditional branch instruction and the instructions of the correct path are required to be loaded into the pipeline. For execution, it is required that all of the operands that were previously used in the pipeline at the time prior to the execution of the of the conditional branch must be restored to the pipeline.
One of the problems that arises with the backing up of the pipeline is that the values of the registers used in the execution of the instructions in general will change between the time the conditional branch instruction is performed and the time the error condition which indicated that the conditional branch taken was incorrect is recognized.
In modern computers, registers are used to store values of operands. A typical instruction processed by a pipeline will add the values stored in two different registers and place the resulting value in one of the two registers or a third register. For example, an instruction may be “add R
7
, R
8
>R
9
”. This instruction is interpreted to mean that the contents of register
7
and the contents of register of
8
are added together and the result is placed in the register
9
.
The problem with pipelined systems, conditional branches, and the use of registers is that often a subsequent-instruction called for use of the register that was changed since the early instruction passed through the pipeline. Thus, as an example, assuming a latter or subsequent instruction following the instruction “add R
7
, R
8
>R
9
” is an instruction add R
1
, R
2
>R
7
. By executing the second subsequent instruction the contents of register
7
have been changed. If it is necessary to back up the instruction stage to the first instruction add R
7
, R
8
>R
9
as a result of the conditional branch or other condition in the central processing unit, the contents of R
7
which were changed by the second subsequent instruction add R
1
, R
2
>R
7
will provide an incorrect result. Thus, in general it would not be possible to back up the instruction pipeline to the instruction add R
7
, R
8
>R
9
and provide a valid result in register R
9
. That is, if the first instruction in which the contents of register
7
and register
8
are added were to be performed after execution of the second instruction due to a back up of the pipelined processor the resulting value stored in R
9
would be different than the valve which should have been stored in R
9
.
As it is also known in the art, in pipeline computer systems many instructions can be processed through the pipeline before a misprediction or other trapped condition is recognized by control logic in the pipeline computer. The value contained in any one of the registers which may be used by-the set of instructions may have thus changed may times before the trapped condition is recognized. Therefore there is a need to save results generated at certain pipeline stages until the complete pipeline has run its course in the event that a trap or other condition occurs which requires the pipeline to be backed up to a previous known state to execute a previous prior executed instruction.
One approach for saving the results is “register renaming.” In any processor execution box there are a plurality of registers for example 32
Compaq Computer Corporation
Hamilton Brook Smith & Reynolds P.C.
Maung Zarni
LandOfFree
Pipelined processor for performing parallel instruction... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Pipelined processor for performing parallel instruction..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Pipelined processor for performing parallel instruction... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2522859