Electrical computers and digital processing systems: processing – Architecture based instruction processing – Data flow based system
Reexamination Certificate
1994-08-24
2001-03-20
Beausoliel, Jr., Robert W. (Department: 2781)
C712S200000, C712S206000
active
06205538
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to computer systems, and more particularly, to a microprocessor having a counterflow pipeline and a renaming table.
BACKGROUND OF THE INVENTION
Microprocessors run user-defined computer programs to accomplish specific tasks. All programs, regardless of the task, are composed of a series of instructions. State-of-the-art microprocessors execute instructions using a multi-stage pipeline. Instructions enter at one end of the pipeline, are processed through the stages, and the results of the instructions exit at the opposite end of the pipeline. Typically, a pipelined processor includes an instruction fetch stage, an instruction decode and register fetch stage, an execution stage, a memory access stage and a write-back stage. The pipeline increases the number of instructions being executed simultaneously, and thus improves the overall processor throughput. A superscalar processor is a processor that includes several pipelines arranged to execute several instructions in parallel.
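The throughput benefit of pipelining can be illustrated with a small calculation. The following is a minimal sketch, assuming an idealized five-stage pipeline in which every stage takes one cycle and no stalls occur; the function names are hypothetical and are not taken from the patent.

```python
# Stage names follow the typical pipeline described above.
STAGES = ["fetch", "decode", "execute", "memory", "write-back"]


def pipelined_cycles(instruction_count, stage_count=len(STAGES)):
    """Cycles needed when a new instruction enters the pipeline every cycle.

    Assumes an ideal pipeline: one cycle per stage and no hazards or stalls.
    """
    if instruction_count == 0:
        return 0
    # The first instruction needs stage_count cycles; each later one adds one.
    return stage_count + instruction_count - 1


def unpipelined_cycles(instruction_count, stage_count=len(STAGES)):
    """Cycles needed when each instruction finishes before the next starts."""
    return stage_count * instruction_count


ideal = pipelined_cycles(10)
serial = unpipelined_cycles(10)
```

Under these assumptions, ten instructions need 14 cycles when pipelined and 50 cycles when executed one at a time. Real pipelines fall short of the ideal figure whenever hazards force stalls.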
Control and data hazards are a problem with superscalar pipelined processors. Control and data hazards occur when instructions are dependent upon one another. Consider a first pipeline executing a first instruction that specifies a destination register (X). A second instruction, to be executed by a second pipeline, is said to be dependent if it needs the contents of register (X). If the second pipeline were to use the contents of register (X) prior to the completion of the first instruction, an incorrect outcome may be obtained because the data in register (X) stored in a register file may be out-of-date, i.e., stale. Several approaches to avoiding the data hazard problem are described in the above-identified patent applications, which describe, respectively, a pipeline design for a counterflow pipelined processor, a microprocessor architecture based on the same, and a scoreboard for the same.
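The data hazard on register (X) can be shown with a toy example. This is only an illustrative sketch: the register file is modeled as a dictionary, and the two instructions and their arithmetic are hypothetical.

```python
# Toy register file; the initial contents are arbitrary illustration values.
register_file = {"X": 1, "Y": 0}


def first_instruction():
    """Produces the new value for destination register X."""
    return register_file["X"] + 41


def second_instruction(x_value):
    """Dependent instruction: computes Y from the contents of X."""
    return x_value * 2


# Hazard: the second pipeline reads X before the first instruction completes,
# so it sees the stale value still held in the register file.
stale_y = second_instruction(register_file["X"])

# Correct order: the first instruction writes X back, then the dependent
# instruction reads the up-to-date value.
register_file["X"] = first_instruction()
fresh_y = second_instruction(register_file["X"])
```

Here `stale_y` is computed from the old contents of X and differs from `fresh_y`, which is the outcome the program order requires.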
The counterflow pipeline processor (CFPP) of the above-identified applications departs from traditional pipeline designs in that information flow is bidirectional. Instructions are stored in an instruction cache. These instructions enter the counterflow pipeline in program order at a launch stage and proceed to a decoder for a determination of the instruction class, e.g., branch, load, add and multiply. Next, the instructions proceed to a register exam stage where the source and destination operand registers, if any, are identified and a retrieval of necessary source operand value(s) from a register file is initiated. These source operand value(s) are retrieved in one of several ways from the register file and inserted into the top of the result pipe. Alternatively, the operand values can be transferred directly into the instructions.
Next, the instructions in the form of instruction packages enter and advance up the instruction pipe to be executed. Subsequently, register values are generated for the destination register operands of the instruction packages. These register values are inserted laterally into the respective result stages of the result pipe in the form of result packages which counterflow down the result pipe. As a younger (later in program order) instruction package meets a result package that is needed by that instruction package, that register value is copied. This copying process, which is referred to as “garnering”, reduces the stall problem common with scalar pipeline processors of the prior art. Hence, instruction packages flow up an instruction pipe of the counterflow pipeline while the register values from previous instruction packages flow down the result pipe of the same counterflow pipeline.
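Garnering can be sketched in a few lines. The following is a simplified model, not the patented hardware: the class names, the field layout, and the use of `None` to mark a missing operand are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResultPackage:
    register: str  # register the value belongs to
    value: int


@dataclass
class InstructionPackage:
    name: str
    # Maps each source register to its value, or None while still missing.
    sources: dict = field(default_factory=dict)

    def garner(self, result):
        """Copy the register value if this instruction is still waiting on it."""
        if result.register in self.sources and self.sources[result.register] is None:
            self.sources[result.register] = result.value
            return True
        return False

    def ready(self):
        """True once every source operand value has been obtained."""
        return all(value is not None for value in self.sources.values())


# An add instruction that already holds r2 but still needs r1.
add = InstructionPackage("add", {"r1": None, "r2": 7})
garnered_unrelated = add.garner(ResultPackage("r9", 5))  # not a source: ignored
garnered_needed = add.garner(ResultPackage("r1", 3))     # needed: copied
```

After meeting the result package for `r1`, the instruction package holds all of its operands and can execute without waiting for a register file write-back.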
Variations of the counterflow pipeline are possible. For example, the instruction pipe and the result pipe, which together form an execution pipe, can be implemented to interoperate asynchronously. One drawback of such an asynchronous design is the requirement of complex arbitration and comparing logic coupled between each instruction stage and the corresponding result stage to guarantee that register values do not overtake any younger instruction packages requiring them. The advance of instruction packages up the instruction pipe and the counterflow of result packages down the result pipe must be properly arbitrated by this logic for two important reasons.
First, at every stage of the execution pipe, the arbitration and comparing logic ensures that a targeted result package does not overtake any younger instruction package requiring the corresponding targeted register value for one of its source operands. This is accomplished by ensuring that each required source register operand of an instruction package in an instruction stage is checked against any result package in the preceding result stage before the instruction package and the compared result package are allowed to pass each other in the execution pipe. Arbitration at every stage of the execution pipe is time-consuming and disrupts the concurrency between instruction package flow and result package flow in the execution pipe.
Second, there is a need to prevent younger instructions from garnering stale (expired) result packages. Stale result packages are those result packages with register values that have been superseded by new register values produced by younger instruction packages. Hence, upon a subsequent write to a destination operand register, younger instruction packages have the task of “killing”, i.e., invalidating, any stale result packages as the younger instruction packages advance up the instruction pipe.
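The killing step can be illustrated as follows. This is a rough sketch under simplifying assumptions: the result pipe is modeled as a list of dictionaries with a hypothetical `valid` flag, and the younger instruction is assumed to meet every package in the list as it advances.

```python
def kill_stale(result_pipe, destination_register):
    """Invalidate result packages for a register that a younger instruction overwrites.

    Returns the number of packages invalidated.
    """
    killed = 0
    for package in result_pipe:
        if package["valid"] and package["register"] == destination_register:
            package["valid"] = False  # stale: must not be garnered any more
            killed += 1
    return killed


# Result packages from older instructions, counterflowing down the result pipe.
result_pipe = [
    {"register": "r1", "value": 10, "valid": True},
    {"register": "r2", "value": 20, "valid": True},
]

# A younger instruction that writes r1 kills the older r1 value it meets.
killed = kill_stale(result_pipe, "r1")
```

Only the `r1` package is invalidated; the `r2` package stays valid and can still be garnered by instructions that need it.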
The above-described arbitration of the counterflow pipeline ensures that instruction packages and their respective result packages counterflow in an orderly manner. However, a typical execution pipe may be ten or more stages deep, and the time penalty for arbitration can be substantial. Hence, there is a need for a more efficient counterflow pipeline architecture in which the instruction and result packages can flow more concurrently, by eliminating the need for arbitration for “killing” of stale register values.
SUMMARY OF THE INVENTION
The present invention provides an efficient streamlined pipeline for a counterflow pipeline processor with a renaming table. The counterflow pipeline includes an execution pipe having multiple instruction stages forming an instruction pipe, a plurality of result stages forming a result pipe, and a corresponding plurality of comparator/inserters. Each comparator/inserter couples an instruction stage to a corresponding result stage. The counterflow pipeline also includes a register examination (exam) stage with the renaming table. The renaming table has assignment entries for associating register values of instructions with unique register identifiers, e.g., renamed register numbers (RRNs). As a result, the register values are distinguishable from each other, thereby minimizing the need for complex arbitration and housekeeping (killing of stale register values), as younger (later in program order) instructions and their targeted register values counterflow in the streamlined counterflow pipeline. A counter, such as a modulo counter, is coupled to the renaming table and provides unique register identifiers for new assignments.
In one embodiment, an instruction advances up the counterflow pipeline until it reaches the register exam stage. For each source operand register that is not already represented by an entry in the renaming table, the RRN counter assigns a unique RRN to the operand register. Conversely, if the instruction includes a destination operand register, a new RRN is assigned to the operand register. The RRN assignments are recorded as entries in the renaming table. These RRNs are also recorded in the respective source register RRN field(s) and/or destination register RRN field of a corresponding instruction package.
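The RRN assignment at the register exam stage can be sketched as below. This is a minimal model under stated assumptions: the class name `RenamingTable`, the method `rename`, and the default modulus are hypothetical, and the sketch does not model how RRNs are reclaimed before the modulo counter wraps around.

```python
class RenamingTable:
    """Maps operand registers to renamed register numbers (RRNs)."""

    def __init__(self, modulus=64):
        self.modulus = modulus  # assumed counter range, for illustration only
        self.next_rrn = 0       # state of the modulo counter
        self.entries = {}       # operand register -> current RRN

    def _new_rrn(self):
        """Take the next identifier from the modulo counter."""
        rrn = self.next_rrn
        self.next_rrn = (self.next_rrn + 1) % self.modulus
        return rrn

    def rename(self, sources, destination=None):
        """Return (source RRNs, destination RRN) for one instruction."""
        source_rrns = {}
        for register in sources:
            # A source gets a new RRN only if it has no entry yet.
            if register not in self.entries:
                self.entries[register] = self._new_rrn()
            source_rrns[register] = self.entries[register]
        destination_rrn = None
        if destination is not None:
            # A destination always gets a new RRN, replacing any older entry.
            destination_rrn = self._new_rrn()
            self.entries[destination] = destination_rrn
        return source_rrns, destination_rrn


table = RenamingTable(modulus=8)
first = table.rename(["r1", "r2"], "r3")  # r3 = r1 op r2
second = table.rename(["r3"], "r3")       # r3 = op r3
```

In this sketch the second instruction reads `r3` under the RRN produced by the first instruction and writes `r3` under a new RRN, so the older and newer values of `r3` carry different identifiers and cannot be confused with each other.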
Next, the instruction, in the form of the instruction package, enters the execution pipe and is processed by an instruction stage capable of execut
Albert Philip H.
Beausoliel, Jr. Robert W.
Phan Raymond N
Sun Microsystems Inc.
Instruction result labeling in a counterflow pipeline processor