Electrical computers and digital processing systems: processing – Dynamic instruction dependency checking – monitoring or... – Reducing an impact of a stall or pipeline bubble
Reexamination Certificate
1999-01-14
2001-03-20
Coleman, Eric (Department: 2783)
C712S023000
active
06205542
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to pipelined processors, and more particularly, to a replay mechanism for a processor pipeline.
2. Description of the Related Art
Computers and many other types of machines are engineered around a “processor” that executes programmed instructions stored in the machine's memory. One may categorize computers and processors by the complexity of their instruction sets, such as reduced instruction set computers (“RISC”) and complex instruction set computers (“CISC”). An architecture is a categorization defining the interface between the processor's hardware and the processor's instruction set.
A first aspect of a processor's architecture is whether it executes instructions sequentially or out of order. Historically, processors executed one instruction at a time, in the same sequential order in which the code for the instructions was presented to the processor. This architecture is the “sequential programming model.” An out of order architecture executes instructions in an order different from the order in which the code is presented to the processor, i.e., non-sequentially.
The sequential nature of software code creates “data dependencies” and “control dependencies.” A data dependency occurs when a later instruction manipulates an operand x, and the data at x is a result from an earlier instruction. The later instruction has a data dependency on the operand of the earlier instruction. A control dependency occurs when an instruction can generate two alternative branches of instructions only one of which will be executed. Typically, the branch choice depends on a condition. The various architectures respect these data and control dependencies.
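As an illustration of the data dependency described above, the following minimal sketch checks whether a later instruction reads a register that an earlier instruction writes. The `Instr` type, the register names, and the helper `has_data_dependency` are hypothetical and exist only for this example; they are not taken from the patent.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this illustration."""
    dest: str              # register written by the instruction
    srcs: Tuple[str, ...]  # registers read by the instruction


def has_data_dependency(earlier: Instr, later: Instr) -> bool:
    """Return True when `later` reads the register that `earlier` writes."""
    return earlier.dest in later.srcs


add = Instr(dest="r3", srcs=("r1", "r2"))  # r3 = r1 + r2
mul = Instr(dest="r5", srcs=("r3", "r4"))  # r5 = r3 * r4, needs the result in r3
sub = Instr(dest="r6", srcs=("r1", "r4"))  # r6 = r1 - r4, independent of add
```

Here `mul` depends on `add` because it consumes `r3`, while `sub` does not, so `sub` could in principle execute without waiting for `add`.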
A second aspect of a processor's architecture is whether instruction processing is “pipelined.” In pipelined processing, the processor fetches instructions from memory and feeds them into one end of the pipeline. The pipeline has several “stages,” each stage performing some function necessary or desirable to process the instruction before passing the instruction to the next stage. For instance, one stage might fetch an instruction, the next stage might decode the instruction, and the next stage might execute the decoded instruction. Each stage typically moves the instruction closer to completion.
A pipeline may offer an advantage in that one part of the pipeline is working on a first instruction while a second part of the pipeline is working on a second instruction. Thus, more than one instruction can be processed at a time, potentially increasing the effective rate at which instructions are processed.
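The overlap described above can be sketched with a simple cycle count. The sketch assumes an idealized pipeline in which every stage takes exactly one cycle and nothing ever stalls; the function names and the three-stage example are illustrative assumptions, not part of the patent.

```python
def pipelined_cycles(n_instructions, n_stages):
    """Cycles to finish all instructions when stages overlap (ideal case)."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles; each later one adds a cycle.
    return n_stages + n_instructions - 1


def unpipelined_cycles(n_instructions, n_stages):
    """Cycles when each instruction passes through every stage alone."""
    return n_instructions * n_stages


def occupancy(n_instructions, stages):
    """For each cycle, map stage name -> index of the instruction it holds."""
    table = []
    for cycle in range(pipelined_cycles(n_instructions, len(stages))):
        row = {}
        for position, name in enumerate(stages):
            instr = cycle - position  # instruction i enters stage s at cycle i + s
            row[name] = instr if 0 <= instr < n_instructions else None
        table.append(row)
    return table


schedule = occupancy(3, ["fetch", "decode", "execute"])
```

Under these assumptions, three instructions in a three-stage pipeline finish in 5 cycles rather than 9, and in the third cycle all three stages are busy with different instructions.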
Some pipelines process instructions “speculatively.” Speculative execution means that instructions are fetched and executed before resolving pertinent control and/or data dependencies. Speculative execution predicts how data and/or control dependencies will be resolved, executes instructions based on the predictions, and then verifies that the predictions were correct before retiring the instruction and results therefrom.
The verification step can be a challenge to pipeline design. At the end of the pipeline, the results from executed instructions are temporarily stored in a register until all data and control dependencies have been actually resolved. The pipeline then checks whether any mispredictions or other problems occurred; both are generally referred to as exceptions. In the absence of execution problems, the executed instructions are “retired” and results are stored to architectural registers, an operation referred to as “commitment to an architectural state.” If execution problems occur, the processor performs a correction routine.
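The retire-or-correct decision described above might be modeled as follows. This is a minimal sketch under the assumption that speculative results are held as a mapping from register name to value; the function `retire` and its arguments are hypothetical names chosen for the example.

```python
def retire(arch_regs, speculative_results, exception_detected):
    """Commit speculative results only when no exception was detected.

    Returns (new architectural state, committed flag). The inputs are
    never mutated, so a failed attempt leaves the old state intact.
    """
    if exception_detected:
        # Speculative results are ignored; a correction routine would run here.
        return dict(arch_regs), False
    committed = dict(arch_regs)
    committed.update(speculative_results)  # commitment to architectural state
    return committed, True


state = {"r1": 10, "r2": 0}
ok_state, ok = retire(state, {"r2": 42}, exception_detected=False)
bad_state, bad = retire(state, {"r2": 99}, exception_detected=True)
```

In the first call the value 42 reaches the architectural register `r2`; in the second call the speculative value 99 is dropped and the state is unchanged.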
Execution problems are problems that can result in:
(1) executing an instruction that should not have been executed;
(2) not executing an instruction that should have been executed; or
(3) executing an instruction with incorrect data.
To process the instruction stream correctly, the effects of execution problems on subsequent execution of instructions must also be corrected.
Many prior art pipelined processors “stall” the pipeline upon detecting an exception. In stallable instruction pipelines, a number of latches or registers govern progress through the stages of the pipeline. A pipeline controller generates a signal to enable or disable the latches or registers. During a stall, the latches or registers are disabled so that the instructions are not transferred to the next stage. After the exception that caused the stall, and its effects, are repaired, the pipeline controller re-enables the latches or registers, and transfers between pipeline stages resume.
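The enable signal described above can be sketched as one clock step over a list of latches. The representation (a Python list with the oldest instruction last) and the function `step` are assumptions made for illustration only.

```python
def step(latches, incoming, enable):
    """Advance a stallable pipeline by one clock.

    latches  -- list of stage contents, index 0 is the first stage
    incoming -- instruction waiting to enter the first stage
    enable   -- controller signal; False models a stall

    Returns (new latch contents, instruction that left the last stage).
    """
    if not enable:
        # Stall: every latch holds its value and nothing enters or leaves.
        return list(latches), None
    completed = latches[-1]
    return [incoming] + latches[:-1], completed


running = ["i2", "i1", "i0"]
advanced, done = step(running, "i3", enable=True)
held, nothing = step(running, "i3", enable=False)
```

With the latches enabled, `i0` leaves the pipeline and `i3` enters; with the latches disabled, the contents are held and `i3` must wait.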
To operate a stallable pipeline, the pipeline controller needs to receive status signals from the stages of the pipeline, determine whether to stall from the received signals, and then broadcast a signal to stall or proceed. Since each of these steps takes time, implementing the ability to stall may limit the operating frequency of the pipeline.
Some processor pipelines “replay” in addition to stalling. Replay is the re-execution of instructions upon detecting an exception. If an exception is detected, speculative results are ignored, e.g., the architectural state is not updated and instructions are not retired. The processor corrects the problem and re-executes the instructions.
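A replay loop of the kind described above could be sketched as follows. The sketch assumes that a whole group of instructions is re-executed when any one of them reports a problem, and that the problem clears on a later attempt; `run_with_replay`, `example_execute`, and the "load misses once" behavior are hypothetical.

```python
def run_with_replay(instrs, execute, max_replays=3):
    """Execute a group of instructions, replaying the group on a problem.

    `execute(instr, attempt)` returns (value, ok). Results of an attempt
    that hit a problem are discarded rather than retired.
    Returns (results, number of replays needed).
    """
    for attempt in range(max_replays + 1):
        results = []
        problem = False
        for instr in instrs:
            value, ok = execute(instr, attempt)
            results.append(value)
            problem = problem or not ok
        if not problem:
            return results, attempt  # safe to retire these results
    raise RuntimeError("execution problem persisted after replays")


def example_execute(instr, attempt):
    """Hypothetical executor: a 'load' fails on the first attempt only."""
    if instr == "load" and attempt == 0:
        return None, False
    return instr.upper(), True


results, replays = run_with_replay(["add", "load", "sub"], example_execute)
```

In this example the first attempt is discarded because the load reports a problem, and the second attempt completes, so one replay is needed.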
One processor employing replay is the Alpha 21164 microprocessor, commercially available from Digital Equipment Corporation. The Alpha 21164 stalls only the first three stages of the pipeline. If a problem occurs after the third stage, the Alpha 21164 replays the entire pipeline after repairing the problem. The Alpha 21164 therefore combines expensive stalling with complex decision-making circuitry necessary to determine when to replay. The Alpha 21164 replays the entire pipeline even though the problem may be localized. Replaying the entire pipeline may be inefficient if there are several parallel execution units, e.g., a superscalar processor, and the problem was localized to one of the parallel execution units.
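The inefficiency noted above can be made concrete with a simple count of re-executed instructions. The unit names and in-flight counts below are invented for illustration and do not describe the Alpha 21164 or any real processor.

```python
def instructions_replayed(in_flight_per_unit, faulty_unit, localized):
    """Count instructions re-executed after a problem in one execution unit.

    in_flight_per_unit -- mapping of unit name -> instructions in flight
    localized          -- True to replay only the faulty unit's instructions
    """
    if localized:
        return in_flight_per_unit[faulty_unit]
    return sum(in_flight_per_unit.values())


# Hypothetical superscalar machine with four parallel execution units.
in_flight = {"alu0": 4, "alu1": 4, "mem": 4, "fpu": 4}
whole_pipeline = instructions_replayed(in_flight, "mem", localized=False)
only_faulty = instructions_replayed(in_flight, "mem", localized=True)
```

Under these assumed numbers, replaying the entire pipeline repeats 16 instructions, whereas a replay confined to the faulty unit repeats 4.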
The demand for faster processors continually outstrips present technology. The demand pressures all aspects of processor architecture to become faster in the sense of higher instruction throughput. Current techniques for handling exceptions in pipelined processing can substantially reduce instruction throughput.
The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.
SUMMARY OF THE INVENTION
The invention, in one embodiment, provides a method for executing instructions. The method includes dispatching and executing a first and second plurality of instructions in a portion of a pipeline without first determining whether stages of the portion of the pipeline are ready. The method further includes determining if an execution problem is encountered and replaying the first plurality of instructions in response to determining that the first plurality of instructions encountered an execution problem.
The invention, in another embodiment, provides a processor pipeline. The processor pipeline includes a front end to fetch a plurality of instructions for execution and a back end to execute the plurality of instructions fetched by the front end. The back end includes a retirement stage to determine if an instruction had an execution problem. The back end is non-stallable. The processor pipeline also includes a channel to send an indication that the instruction encountered an execution problem from the retirement stage to a replay point of the pipeline from which the instruction may be re-executed.
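The arrangement of front end, non-stallable back end, retirement stage, and replay channel can be sketched as a small cycle-by-cycle simulation. This is only a toy model under stated assumptions: the back end is a fixed-depth shift register that advances every cycle, the channel carries the index of the failing instruction, the replay point restarts fetching from that index, and younger instructions already in flight are discarded at retirement until the replayed instruction arrives. None of these details, nor the names `simulate` and `fails_once`, come from the summary itself.

```python
from collections import deque


def simulate(program, fails_once, depth=3, max_cycles=100):
    """Return the instructions in the order they retire."""
    fetch_index = 0              # front end: next instruction to dispatch
    replay_channel = deque()     # retirement stage -> replay point
    back_end = [None] * depth    # non-stallable: shifts on every cycle
    pending = set(fails_once)    # instructions that fail on first execution
    retired = []
    for _ in range(max_cycles):
        if len(retired) == len(program):
            break
        # Replay point: an indication on the channel redirects the front end.
        if replay_channel:
            fetch_index = replay_channel.popleft()
        entering = fetch_index if fetch_index < len(program) else None
        if entering is not None:
            fetch_index += 1
        leaving = back_end[-1]
        back_end = [entering] + back_end[:-1]  # unconditional advance, no stall
        # Retirement stage: skip bubbles and instructions awaiting replay.
        if leaving is None or leaving != len(retired):
            continue
        if program[leaving] in pending:
            pending.discard(program[leaving])
            replay_channel.append(leaving)     # report the execution problem
        else:
            retired.append(program[leaving])
    return retired


order = simulate(["a", "b", "c"], fails_once={"b"})
```

In this run, `b` reports a problem the first time it reaches retirement, is re-executed from the replay point, and the program still retires in its original order without the back end ever being held.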
Grochowski Edward T.
Lin Derrick C.
Blakely, Sokoloff, Taylor & Zafman LLP
Coleman Eric
Intel Corporation