Pipelined instruction dispatch unit in a superscalar processor

Electrical computers and digital processing systems: processing – Instruction issuing – Simultaneous issuance of multiple instructions

Reissue Patent

Details

C712S209000, C712S217000, C712S213000, C714S039000

active

RE038599

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer architecture. In particular, this invention relates to the design of an instruction unit in a superscalar processor.
2. Discussion of the Related Art
Parallelism is extensively exploited in modern computer designs. Among these designs are two distinct architectures, known respectively as the very long instruction word (VLIW) architecture and the superscalar architecture. A superscalar processor is a computer which can dispatch one, two or more instructions simultaneously. Such a processor typically includes multiple functional units which can independently execute the dispatched instructions. In such a processor, a control logic circuit, which has come to be known as the “grouping logic” circuit, determines the instructions to dispatch (the “instruction group”) according to certain resource allocation and data dependency constraints. The task of the computer designer is to provide a grouping logic circuit which can dynamically evaluate such constraints to dispatch instruction groups which optimally use the available resources.
An example of a resource allocation constraint, in a computer with a single floating point multiplier unit, is the constraint that no more than one floating point multiply instruction is to be dispatched in any given processor cycle. A processor cycle is the basic timing unit for a pipelined unit of the processor, typically the clock period of the CPU clock.
An example of a data dependency constraint is the avoidance of a “read-after-write” hazard. This constraint prevents dispatching an instruction which requires an operand from a register which is the destination of a write instruction that was dispatched earlier but has not yet retired.
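The two constraints described above can be illustrated with a minimal sketch. It is not the patented circuit: the `Instr` record, the `form_group` function, the unit names, and the policy of stopping at the first instruction that fails a check are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record used only for this sketch."""
    name: str                    # label for the example
    unit: str                    # functional unit class needed, e.g. "fmul"
    dest: Optional[str] = None   # destination register, if any
    srcs: Tuple[str, ...] = ()   # source registers


def form_group(candidates: List[Instr],
               unit_counts: Dict[str, int],
               pending_dests: Iterable[str]) -> List[Instr]:
    """Return the leading instructions that may be dispatched together.

    Assumption: instructions are considered in program order and the
    group ends at the first instruction that violates a constraint.
    """
    free = dict(unit_counts)        # functional units still unclaimed this cycle
    unretired = set(pending_dests)  # registers with writes not yet retired
    group: List[Instr] = []
    for ins in candidates:
        # Resource allocation constraint: a free unit of the right kind is needed.
        if free.get(ins.unit, 0) == 0:
            break
        # Data dependency constraint: no source may await an unretired write.
        if any(src in unretired for src in ins.srcs):
            break
        free[ins.unit] -= 1
        if ins.dest is not None:
            unretired.add(ins.dest)
        group.append(ins)
    return group


program = [
    Instr("fmul_a", "fmul", "f2", ("f0", "f1")),
    Instr("add_a", "alu", "r3", ("r1", "r2")),
    Instr("fmul_b", "fmul", "f5", ("f3", "f4")),  # blocked: only one multiplier
]
print([i.name for i in form_group(program, {"fmul": 1, "alu": 2}, set())])
```

With a single floating point multiplier, the second multiply is held back for a later cycle, which mirrors the resource allocation example in the text.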
A VLIW processor, unlike a superscalar processor, does not dynamically allocate system resources at run time. Rather, resource allocation and data dependency analysis are performed during program compilation. A VLIW processor decodes the long instruction word to provide the control information for operating the various independent functional units. The task of the compiler is to optimize performance of a program by generating a sequence of such instructions which, when decoded, efficiently exploit the program's inherent parallelism in the computer's parallel hardware. The hardware is given little control of instruction sequencing and dispatch.
A VLIW computer, however, has a significant drawback in that its programs must be recompiled for each machine they run on. Such recompilation is required because the control information required by each machine is encoded in the instruction words. A superscalar computer, by contrast, is often designed to be able to run existing executable programs (i.e., “binaries”). In a superscalar computer, the instructions of an existing executable program are dispatched by the computer at run time according to the computer's particular resource availability and data integrity requirements. From a computer user's point of view, because existing binaries represent significant investments, the ability to acquire enhanced performance without the expense of purchasing new copies of binaries is a significant advantage.
In the prior art, to determine the instructions that go into the instruction group of a given processor cycle, a superscalar computer performs the resource allocation and data dependency checking tasks in the immediately preceding processor cycle. Under this scheme, the computer designer must ensure that these resource allocation and data dependency checking tasks complete within that single processor cycle. As the number of functional units that can run independently increases, the time required for performing such resource allocation and data dependency checking tasks grows more rapidly than linearly. Consequently, in a superscalar computer design, the ability to perform resource and data integrity analysis within a single processor cycle can become a factor that limits the performance gain of additional parallelism.
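One way to get a feel for this faster-than-linear growth is to count register comparisons under a simple model. The model is an assumption for illustration and does not come from the patent: every instruction has a fixed number of source registers, and each source is compared against the destination of every earlier instruction in the same candidate group. The function name `raw_comparisons` is hypothetical.

```python
def raw_comparisons(group_size: int, srcs_per_instr: int = 2) -> int:
    """Count source-versus-destination comparisons inside one candidate group.

    Assumed model: instruction k is checked against each of the k earlier
    instructions, once per source register.
    """
    earlier_pairs = group_size * (group_size - 1) // 2
    return earlier_pairs * srcs_per_instr


for width in (1, 2, 4, 8):
    print(width, raw_comparisons(width))
```

Under this model, doubling the group size from 4 to 8 raises the comparison count from 12 to 56, so the amount of checking grows much faster than the dispatch width. How that translates into circuit delay depends on the implementation and is not modeled here.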
SUMMARY OF THE INVENTION
The present invention provides a central processing unit which includes a grouping logic circuit for determining the instructions that can be dispatched simultaneously in a processor cycle. The central processing unit of the present invention includes such a grouping logic circuit and a number of functional units, each adapted to execute one or more specified instructions dispatched by the grouping logic circuit. The grouping logic circuit includes a number of pipeline stages, such that resource allocation and data dependency checks can be performed over a number of processor cycles. The present invention therefore allows a large number of instructions to be dispatched simultaneously, while preventing the complexity of the grouping logic circuit from limiting the duration of the central processing unit's processor cycle.
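The effect of spreading the checks over pipeline stages can be sketched with a small simulation. This is a simplified model under stated assumptions, not the circuit claimed in the patent: the split into a `resource_check` stage and a `dependency_check` stage, the dictionary used to represent a group, and the absence of stalls are all hypothetical.

```python
from collections import deque


def resource_check(group):
    # Assumed first stage: record that resource allocation was examined.
    return {**group, "checks": group["checks"] + ["resource"]}


def dependency_check(group):
    # Assumed second stage: record that data dependencies were examined.
    return {**group, "checks": group["checks"] + ["dependency"]}


def run_pipeline(groups, stages):
    """Move candidate groups through `stages`, one stage per processor cycle.

    Returns (cycle, group) pairs for groups leaving the last stage.
    No stalls are modeled; every stage always finishes in one cycle.
    """
    slots = [None] * len(stages)   # slots[i] holds the group now in stage i
    feed = deque(groups)
    ready = []
    cycle = 0
    while feed or any(s is not None for s in slots):
        cycle += 1
        # Advance every group by one stage, back to front.
        for i in range(len(stages) - 1, 0, -1):
            slots[i] = slots[i - 1]
        slots[0] = feed.popleft() if feed else None
        # Each occupied stage does its share of the checking this cycle.
        for i, stage in enumerate(stages):
            if slots[i] is not None:
                slots[i] = stage(slots[i])
        # A group that has cleared the last stage is ready for dispatch.
        if slots[-1] is not None:
            ready.append((cycle, slots[-1]))
            slots[-1] = None
    return ready


incoming = [{"id": name, "checks": []} for name in ("A", "B", "C")]
for cycle, group in run_pipeline(incoming, [resource_check, dependency_check]):
    print(cycle, group["id"], group["checks"])
```

In this model each group takes two cycles to clear the grouping logic, yet one group still becomes ready every cycle once the pipeline is full. Each stage only has to finish its own portion of the checks within a cycle, which is the property the summary relies on.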
In one embodiment, the grouping logic circuit checks intra-group data dependency immediately upon receiving the instruction group. In that embodiment, all instructions in a group of instructions received in a first processor cycle are dispatched prior to dispatching any instruction of a second group of instructions received in a processor cycle subsequent to the first processor cycle.
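The ordering rule in this embodiment can be shown with a short sketch. The function `dispatch_by_group` and its `width` parameter are hypothetical: `width` stands in for whatever resource and dependency limits cap the number of instructions dispatched in one cycle, which the real grouping logic would evaluate per instruction.

```python
from collections import deque


def dispatch_by_group(groups, width):
    """Dispatch instructions cycle by cycle without mixing received groups.

    `groups` lists the instruction groups in the order they were received.
    A later group is not touched until the earlier group is fully dispatched.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    queue = deque(deque(g) for g in groups)
    cycles = []
    while queue:
        head = queue[0]
        if not head:
            queue.popleft()   # earlier group finished; move to the next one
            continue
        # Dispatch from the oldest group only, up to the assumed per-cycle limit.
        issued = [head.popleft() for _ in range(min(width, len(head)))]
        cycles.append(issued)
    return cycles


print(dispatch_by_group([["i0", "i1", "i2"], ["i3", "i4"]], width=2))
# prints [['i0', 'i1'], ['i2'], ['i3', 'i4']]
```

Note that `i2` is dispatched alone even though the assumed limit would allow a second instruction: `i3` belongs to a group received in a later cycle and therefore waits.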
The present invention is better understood upon consideration of the detailed description below in conjunction with the accompanying drawings.


REFERENCES:
patent: 5127093 (1992-06-01), Moore, Jr.
patent: 5497499 (1996-03-01), Garg et al.
patent: 5560028 (1996-09-01), Sachs et al.
patent: 5594864 (1997-01-01), Trauben
patent: 5627984 (1997-05-01), Gupta et al.
patent: 5958042 (1999-09-01), Tremblay
patent: 0 651 323 (1995-05-01), None
D. Sweetman: “Superscalar or superpipelined or just hype?” New Electronics, vol. 24, No. 11, Dec. 1, 1991, pp. 16-18, XP00310175.*
V. Popescu et al.: “The Metaflow Architecture” IEEE Micro, vol. 11, No. 3, Jun. 1, 1991, pp. 10-12, 63-73, XP000237231.
