Electrical computers and digital processing systems: processing – Instruction issuing – Simultaneous issuance of multiple instructions
Reexamination Certificate
2000-07-07
2001-07-24
Coleman, Eric (Department: 2183)
C712S200000, C712S023000
active
06266765
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to digital processors and, more particularly, to the instruction issuing and execution units of a digital processor.
2. Description of the Relevant Art
A primary goal in the design of digital processors is to increase the throughput, i.e., the number of instructions processed per unit time, of the processor. One approach has been to improve the hardware design of the processor to reduce the machine cycle time. Another approach has been to develop architectures and instruction sets designed to process one instruction per machine cycle. Both of these approaches are limited to a theoretical maximum throughput of one instruction per machine cycle due to the basic policy of sequentially issuing at most one instruction per cycle.
Systems for issuing more than one instruction per cycle are described in a paper by Ditzel et al. entitled “The Hardware Architecture of the CRISP Microprocessor”, 1098 ACM 0084-7495 87, pp. 309-319, and in a paper by Acosta et al. entitled “An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors”, IEEE Transactions on Computers, Vol. C-35, No. 9, September 1986, pp. 815-828.
One limitation on concurrent issuing of instructions is that the instructions must not require the use of the same functional unit of the processor during the same machine cycle. This limitation is related to the resources included in the processor architecture and can be somewhat obviated by providing additional copies of heavily used functional units.
The paper by Acosta et al. presents an approach to concurrently issuing instructions to take advantage of the existence of multiple functional units. Further, the CRISP architecture, described in the above-referenced paper, allows the execution of a branch instruction concurrently with another instruction. Additionally, mainframes have allowed concurrent dispatching of integer and floating point instructions to different functional units.
However, all of these systems require that the instructions issued concurrently not be dependent on each other. Types of dependencies will be discussed fully below, but a fundamental dependency between a pair of instructions is that the second instruction in the pair processes data resulting from the execution of the first instruction in the pair. Accordingly, the first instruction must be processed prior to the second.
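The fundamental dependency described above can be illustrated with a minimal Python sketch. The `Instr` record, its field names, and the register names are assumptions made for illustration only; they do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record (not from the patent)."""
    name: str
    dest: Optional[str]      # register written by the instruction, if any
    srcs: Tuple[str, ...]    # registers read by the instruction


def depends_on(first: Instr, second: Instr) -> bool:
    """Return True if `second` reads the register that `first` writes."""
    return first.dest is not None and first.dest in second.srcs


# Illustrative pair: the second instruction consumes r1, produced by the first.
add = Instr("add", "r1", ("r2", "r3"))
sub = Instr("sub", "r4", ("r1", "r5"))
```

Here `depends_on(add, sub)` holds while `depends_on(sub, add)` does not, so a prior art issuer of the kind described would have to process `add` before `sub`.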
Thus, these existing processors may concurrently issue and execute very few combinations of instructions. A branch instruction is a special case that requires no memory reference, only the calculation of a new address. Similarly, floating point and integer instructions require only ALU resources and no memory reference. Thus, data dependencies between the instructions do not exist.
In view of the above limitations, the types of instructions that may be concurrently issued in these systems are extremely limited and, although in certain limited situations two instructions may be issued in one clock, the average throughput cannot significantly exceed one instruction per clock.
SUMMARY OF THE INVENTION
In the present invention, a family of instructions is a set of sequential instructions in a program that may be issued concurrently in one clock. The number of types of instructions that may be included in a family is greater than allowed in prior art processors.
In the present invention, a family of instructions that includes, for instance, instructions of the ALU and memory reference type may be issued during a single clock. A special pipeline includes resources that facilitate the acceptance and processing of the issued family. Thus, the invention provides for an instruction processing throughput of greater than one instruction per clock.
According to one aspect of the invention, a family of instructions is fetched and decoded. The decode result for each instruction includes status information indicating which resources are required to execute the instruction. The family of instructions is issued in one clock if the status information indicates that no resource conflicts will occur during execution.
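One way to picture the resource check is to treat each decode result's status information as a bit mask of required resources. The following Python sketch assumes that encoding; the resource names and bit assignments are hypothetical and are not taken from the patent.

```python
# Hypothetical resource bits; the patent does not specify an encoding.
ALU = 0b001
MEM_PORT = 0b010
BRANCH_UNIT = 0b100


def no_resource_conflict(status_masks):
    """Return True if no two instructions in the family need the same resource."""
    in_use = 0
    for mask in status_masks:
        if in_use & mask:      # some resource is already claimed
            return False
        in_use |= mask
    return True


def issue(status_masks):
    """Issue the whole family in one clock only when no conflict is found."""
    return "one clock" if no_resource_conflict(status_masks) else "sequential"
```

Under these assumptions, an ALU instruction paired with a memory reference would issue in one clock, while two memory references competing for a single memory port would not.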
According to a further aspect of the invention, an execution unit executes a family of instructions having data dependencies by providing resulting data of a first instruction required as an operand of a second instruction prior to writing the resulting data to a register.
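The idea of supplying a result to a dependent instruction before it is written to a register can be sketched in software as follows. This is a simplified model under stated assumptions: the tuple layout, the `bypass` dictionary, and the register names are illustrative inventions, not the patent's pipeline.

```python
import operator


def execute_family(regs, family):
    """Execute a family whose later instructions may use earlier results.

    `family` is a list of (dest, op, src_a, src_b) tuples. Results are held
    in `bypass` and handed to later instructions before any register write.
    """
    bypass = {}
    for dest, op, src_a, src_b in family:
        a = bypass[src_a] if src_a in bypass else regs[src_a]
        b = bypass[src_b] if src_b in bypass else regs[src_b]
        bypass[dest] = op(a, b)
    # Register writes happen only after the whole family has executed.
    return {**regs, **bypass}


regs = {"r1": 0, "r2": 2, "r3": 3, "r4": 0, "r5": 10}
family = [
    ("r1", operator.add, "r2", "r3"),   # r1 = r2 + r3
    ("r4", operator.sub, "r5", "r1"),   # r4 = r5 - r1, using the forwarded r1
]
result = execute_family(regs, family)
```

The second instruction reads the forwarded value of `r1` rather than the stale value still held in `regs`, which is left unmodified by the call.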
According to a still further aspect of the invention, a subset of the instructions of a selected instruction set is designated as candidates for concurrent execution. The status information in the decode results of each instruction in the family indicates whether the instruction is a candidate for concurrent execution. If the status information indicates that all the instructions in the family are candidates and that no resource conflicts will occur, then the family is executed concurrently.
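The two conditions for concurrent execution, candidacy and absence of resource conflicts, can be combined in a short Python sketch. The `DecodeResult` type and its fields are assumptions introduced for illustration and are not defined in the patent.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DecodeResult:
    """Hypothetical decode result carrying the status information."""
    candidate: bool              # designated for concurrent execution
    resources: FrozenSet[str]    # resources required during execution


def execute_concurrently(family):
    """Return True only if every instruction is a candidate and no resource is shared."""
    if not all(d.candidate for d in family):
        return False
    total = sum(len(d.resources) for d in family)
    combined = frozenset().union(*(d.resources for d in family))
    return len(combined) == total   # equal sizes mean no overlap


load = DecodeResult(True, frozenset({"mem_port"}))
add = DecodeResult(True, frozenset({"alu"}))
special = DecodeResult(False, frozenset({"alu"}))
```

With these example values, `[load, add]` qualifies, `[load, special]` fails the candidacy test, and `[add, add]` fails the resource test.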
According to a further aspect of the invention, a unique exception handling procedure allows exception procedures developed for single instructions to be utilized thus simplifying the system. The system tests for the presence of an exception during the execution of a family. If an exception is detected then the data write associated with the family is inhibited to preserve the macrostate of the system. The instructions in the family are then issued singly so that the existing exception handling procedure may be utilized.
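The exception procedure above can be approximated in a few lines of Python. This is a sketch under simplifying assumptions: instructions are modeled as callables returning a `(dest, value)` pair, `Trap` stands in for a hardware exception, and the model keeps executing after the handler returns. None of these names come from the patent.

```python
class Trap(Exception):
    """Stand-in for a hardware exception (hypothetical)."""


def run_single(regs, instr, handler):
    """Existing single-instruction path, including its exception handler."""
    try:
        dest, value = instr(regs)
    except Trap as exc:
        handler(exc)
        return
    regs[dest] = value


def run_family(regs, family, handler):
    """Run a family; on an exception, inhibit its writes and reissue singly."""
    pending = {}
    try:
        for instr in family:
            dest, value = instr({**regs, **pending})
            pending[dest] = value
    except Trap:
        # Writes are inhibited: `pending` is discarded and `regs` is untouched.
        for instr in family:
            run_single(regs, instr, handler)
        return "reissued singly"
    regs.update(pending)
    return "concurrent"


def inc(r):
    return ("r1", r["r2"] + 1)


def div(r):
    if r["r3"] == 0:
        raise Trap("divide by zero")
    return ("r4", r["r2"] // r["r3"])
```

Because the family's writes are buffered in `pending`, the register state seen by the single-instruction handler is the same as if the family had never been issued.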
According to another aspect of the invention, a branch recovery mechanism for recovering from a branch misprediction tests for a misprediction by comparing the branch prediction bit and the branch condition bit. In the event of a misprediction, the mechanism differs depending on the position of the branch instruction within the family. If the branch instruction is the last instruction in the family, then the pipeline is flushed and the correct next instruction is fetched into the pipeline. If the branch instruction is not the last instruction in the family, then the data writes associated with all instructions in the family following the branch must be inhibited, then the pipeline is flushed and the correct next instruction is fetched into the pipeline.
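The branch recovery decision can be summarized as a small Python function. The `Recovery` record and the zero-based position convention are assumptions made for this sketch; the patent describes the behavior, not this interface.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Recovery:
    """Hypothetical summary of the recovery actions to take."""
    inhibit_writes: Tuple[int, ...]   # family positions whose writes are inhibited
    flush: bool                       # flush pipeline and fetch the correct instruction


def branch_recovery(prediction_bit, condition_bit, branch_pos, family_size):
    """Compare prediction and condition bits, then choose recovery actions.

    `branch_pos` is the zero-based position of the branch within the family.
    """
    if prediction_bit == condition_bit:
        return Recovery((), False)    # prediction was correct, nothing to do
    # Instructions after the branch must not write; none exist if the branch is last.
    following = tuple(range(branch_pos + 1, family_size))
    return Recovery(following, True)
```

For a three-instruction family, a mispredicted branch in the last position yields only a flush, while one in the first position also inhibits the writes of positions 1 and 2.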
Coleman Eric
Oppenheimer Wolff & Donnelly LLP
Sherry Leah