Determining successful completion of an instruction by...

Electrical computers and digital processing systems: processing – Dynamic instruction dependency checking – monitoring or... – Reducing an impact of a stall or pipeline bubble

Reexamination Certificate


Details

C712S244000

active

06658555

ABSTRACT:

BACKGROUND
1. Field of the Present Invention
The present invention generally relates to the field of microprocessors and more particularly to a microprocessor utilizing a non-stalling execution pipeline for improved performance.
2. History of Related Art
The use of pipelined architectures in the design of microprocessor systems is well known. Pipelining improves performance by overlapping the execution of multiple instructions. In a pipelined microprocessor, the execution of each instruction occurs in stages, where each stage ideally completes in one clock cycle. Additional information concerning pipelining is available in Hennessy & Patterson, Computer Architecture: A Quantitative Approach, pp. 125-214 (Morgan Kaufmann 2d ed. 1996). Turning to FIG. 3, a simplified representation of an execution pipeline 300 in a conventional processor is presented. Pipeline 300 includes a set of latches or registers 302a, 302b, etc. (collectively or generically referred to herein as latches 302). Each latch 302 represents the termination of one pipeline stage and the beginning of another. In FIG. 3, pipeline 300 is full such that each latch 302 contains information corresponding to an instruction that is proceeding through the pipeline. Each stage of pipeline 300 includes a functional logic block, represented in FIG. 3 by reference numerals 304a, 304b, etc., that defines the operation of the corresponding pipeline stage.
If an instruction flowing through pipeline 300 generates an exception at any stage, the pipeline must be stalled so that instructions in the pipeline do not collide. FIG. 3 indicates a stall condition signal 306 generated by logic block 304a. Stall condition signal 306 indicates that logic block 304a is unable to successfully complete its assigned function with respect to the current instruction (Instruction A) within the single cycle timing constraint. Because Instruction A did not complete successfully, it is necessary to retain Instruction A in latch 302a for at least one more cycle. It is also necessary to route stall signal 306 to preceding pipeline stages so that instructions corresponding to each of the preceding stages are not advanced in pipeline 300.
In a conventionally designed pipeline such as pipeline 300, an instruction is stalled by feeding the output of each latch 302 back to the latch's input. These feedback loops are indicated in FIG. 3 by reference numerals 308a, 308b, etc. Accordingly, each latch 302 can receive its input from one of two sources, namely, the output of the preceding stage or the output of the latch itself. In a typical configuration, this dual input feature is accommodated using a multiplexer corresponding to each bit of a latch 302 as depicted in FIG. 4.
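The feedback arrangement described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit of FIG. 3 or FIG. 4: the function names `mux` and `step`, the list-of-latches representation, and the choice to insert a bubble (`None`) behind the instructions that keep advancing are all assumptions made for the example.

```python
from typing import List, Optional


def mux(stall: bool, held: Optional[str], incoming: Optional[str]) -> Optional[str]:
    """2-to-1 multiplexer: the stall signal selects the latch's own output."""
    return held if stall else incoming


def step(latches: List[Optional[str]],
         fetch: Optional[str],
         stall_at: Optional[int]) -> List[Optional[str]]:
    """Advance a conventional (stalling) pipeline by one clock cycle.

    latches[0] feeds the first stage. If stall_at is given, the latch at that
    index and every preceding latch reload their own output, while later
    latches advance normally. The latch just after the stall point receives
    a bubble (None), which is an assumption of this sketch.
    """
    nxt: List[Optional[str]] = []
    for i, held in enumerate(latches):
        stalled = stall_at is not None and i <= stall_at
        if i == 0:
            incoming = fetch
        elif stall_at is not None and i == stall_at + 1:
            incoming = None  # bubble: the stalled instruction did not move on
        else:
            incoming = latches[i - 1]
        nxt.append(mux(stalled, held, incoming))
    return nxt


if __name__ == "__main__":
    pipeline = ["D", "C", "B", "A"]               # "A" is furthest along
    print(step(pipeline, "E", stall_at=None))     # every instruction advances
    print(step(pipeline, "E", stall_at=1))        # "C" and "D" are held
```

In the stalled case the instruction "E" waiting to enter the pipeline is not accepted, which is the behavior the stall signal must enforce on all preceding stages.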
FIG. 4 illustrates the output of a bit 310 of a latch 302 being routed back to one of the inputs of a multiplexer 312k. The other input to multiplexer 312k is received from the output of a preceding stage in pipeline 300. The stall signal 306 serves as the select input to mux 312k. It will be appreciated that the structure of FIG. 4 is repeated for each bit position in latch 302 and that the number of multiplexers 312 that stall signal 306 is required to drive increases with the number of bits in latch 302. In addition, stall signal 306 must be routed to preceding stages to stall instructions in preceding latches. This routing may require signal 306 to travel a considerable distance over an interconnect with an associated capacitive loading. The combination of the number of multiplexers 312k being driven by signal 306 and the distance that signal 306 must travel places a lower bound on the time required for stall signal 306 to stall pipeline 300. For processors with wide pipelines (i.e., 64 bits or more) operating at high frequencies (i.e., frequencies in excess of 1 GHz), stall signal 306 may be unable to successfully halt the pipeline in a single cycle. Therefore, it would be desirable to implement a processor with a wide execution pipeline capable of high speed execution free from the constraints imposed by the need to accommodate pipeline stalls.
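To make the loading argument concrete, the following rough Python sketch counts the multiplexers a single stall signal would have to drive. It assumes, purely for illustration, that every latch has the same width and that the signal reaches the stalling stage's latch plus every preceding latch; the function names and the 0-indexed stage numbering are hypothetical and not taken from the patent.

```python
def stall_fanout(latch_width_bits: int, stalled_stage: int) -> int:
    """Count the per-bit multiplexers one stall signal must drive.

    stalled_stage is 0-indexed, with stage 0 the first pipeline stage.
    Assumes one multiplexer per latch bit and equal-width latches.
    """
    if latch_width_bits <= 0 or stalled_stage < 0:
        raise ValueError("width must be positive and stage non-negative")
    return latch_width_bits * (stalled_stage + 1)


def cycle_time_ns(frequency_ghz: float) -> float:
    """Clock period in nanoseconds for a frequency given in GHz."""
    return 1.0 / frequency_ghz


if __name__ == "__main__":
    muxes = stall_fanout(latch_width_bits=64, stalled_stage=4)
    print(f"{muxes} multiplexers must switch within {cycle_time_ns(1.0)} ns")
```

Under these assumptions, a stall raised in the fifth stage of a 64-bit pipeline must reach 320 multiplexers, and at 1 GHz it has at most one nanosecond to do so, before any wire delay is considered.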
SUMMARY OF THE INVENTION
The problem identified above is addressed by a microprocessor and a related method and data processing system. The microprocessor includes a dispatch unit suitable for issuing an instruction executable by the microprocessor, an execution pipeline configured to receive the issued instruction, and a pending instruction unit. The pending instruction unit includes a set of pending instruction entries. A copy of the issued instruction is maintained in one of the set of pending instruction entries. The execution pipeline is adapted to record, in response to detecting a condition preventing the instruction from successfully completing one of the stages in the pipeline during a current cycle, an exception status with the copy of the instruction in the pending instruction unit and to advance the instruction to a next stage in the pipeline in the next cycle, thereby preventing the condition from stalling the pipeline. Preferably, the dispatch unit, in response to the instruction finishing pipeline execution with an exception status, is adapted to use the copy of the instruction to re-issue the instruction to the execution pipeline in a subsequent cycle. In one embodiment, the dispatch unit is adapted to deallocate the copy of the instruction in the pending instruction unit in response to the instruction successfully completing pipeline execution. The pending instruction unit may detect successful completion of the instruction by detecting when the instruction has been pending for a predetermined number of cycles without recording an exception status. In this embodiment, each entry in the pending instruction unit may include a timer field comprising a set of bits wherein the number of bits in the timer field equals the predetermined number of cycles. The pending instruction unit may set, in successive cycles, successive bits in the timer field such that successful completion of an instruction is indicated when a last bit in the timer field is set.
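The timer-field mechanism summarized above might be modeled as follows. This is a minimal Python sketch under stated assumptions, not the disclosed hardware: the class names, the tag-based lookup, the constant `PIPELINE_DEPTH` standing in for the predetermined number of cycles, and the choice to reset an entry's timer when the instruction is re-issued are all hypothetical details introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

PIPELINE_DEPTH = 4  # assumed "predetermined number of cycles"


@dataclass
class PendingEntry:
    instruction: str
    # One timer bit per cycle; bits are set one after another.
    timer: List[bool] = field(default_factory=lambda: [False] * PIPELINE_DEPTH)
    exception: bool = False

    def tick(self) -> None:
        """Set the next unset bit in the timer field."""
        for i, bit in enumerate(self.timer):
            if not bit:
                self.timer[i] = True
                return

    @property
    def finished(self) -> bool:
        """The instruction has left the pipeline once the last bit is set."""
        return self.timer[-1]


class PendingInstructionUnit:
    def __init__(self) -> None:
        self.entries: Dict[int, PendingEntry] = {}
        self._next_tag = 0

    def allocate(self, instruction: str) -> int:
        """Keep a copy of an issued instruction; return its (assumed) tag."""
        tag = self._next_tag
        self._next_tag += 1
        self.entries[tag] = PendingEntry(instruction)
        return tag

    def record_exception(self, tag: int) -> None:
        """Called by a stage that could not complete; the pipeline keeps moving."""
        self.entries[tag].exception = True

    def cycle(self) -> Tuple[List[str], List[str]]:
        """Advance every timer; return (completed, to_reissue) instructions."""
        completed: List[str] = []
        reissue: List[str] = []
        for tag in list(self.entries):
            entry = self.entries[tag]
            entry.tick()
            if not entry.finished:
                continue
            if entry.exception:
                # Assumption: the same entry is reused for the re-issued copy.
                entry.timer = [False] * PIPELINE_DEPTH
                entry.exception = False
                reissue.append(entry.instruction)
            else:
                del self.entries[tag]  # deallocate on successful completion
                completed.append(entry.instruction)
        return completed, reissue
```

A short usage example: allocate two instructions, call `cycle()` once, record an exception against the first, and call `cycle()` three more times. The fourth call reports the second instruction as completed and the first as needing re-issue, and no stage ever had to hold its contents.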
In one embodiment, the pending instruction unit includes a set of copies of instructions corresponding to each of a set of instructions pending in the execution pipeline at any given time. In various embodiments, the execution pipeline may comprise a load/store pipeline, a floating point pipeline, or a fixed point pipeline.


REFERENCES:
patent: 5692169 (1997-11-01), Kathail et al.
patent: 5748936 (1998-05-01), Karp et al.
patent: 5799179 (1998-08-01), Ebcioglu et al.
patent: 5881280 (1999-03-01), Gupta et al.
patent: 6163839 (2000-12-01), Janik et al.
Chang, Andrew, et al., The Effects of Explicitly Parallel Mechanisms on the Multi-ALU Processor Cluster Pipeline, Feb. 1998, IEE Publication, pp. 474-481.*
Popescu et al., “The Metaflow Architecture,” 1991, IEEE Micro, pp. 10-13 and 63-73.*
August et al., “Integrated Predicated and Speculative Execution in the IMPACT EPIC Architecture,” 1998, The 25th Annual International Symposium on Computer Architecture.
