Electrical computers and digital processing systems: processing – Dynamic instruction dependency checking – monitoring or... – Scoreboarding – reservation station – or aliasing
Reexamination Certificate
1998-06-16
2001-01-16
Donaghue, Larry D. (Department: 2783)
active
06175910
ABSTRACT:
TECHNICAL FIELD
The present invention generally relates to a microprocessor architecture for optimizing execution of instructions using speculative operations in Very Long Instruction Word (VLIW) processors having multiple Arithmetic Logic Units (ALUs) and more particularly to a system and method for using standard registers as shadow registers.
BACKGROUND ART
DEFINITIONS
The following definitions will be helpful to the reader for understanding the description:
ALU:
An ALU (Arithmetic and Logic Unit) is a logic mechanism capable of computing arithmetic (addition, subtraction, etc.) and logic operations (and, or, etc.).
Operation:
An operation is the smallest executable command a processor is capable of handling (for example an ALU operation, a memory operation, etc.).
Instruction:
An instruction is the content of one word of the Instruction Store. It may be composed of one or multiple operations. For example, a VLIW (Very Long Instruction Word) instruction specifies more than one concurrent operation.
ALU Operation:
An ALU operation is an operation involving an Arithmetic and Logic Unit. ALU operations are arithmetic (addition, subtraction, etc.) or logic (and, or, etc.) operations.
IO Operation:
An IO (Input/Output) operation is an operation capable of accessing an external device using read and write operations (example: load IO, store IO, etc.).
Memory Operation:
A memory operation is an operation capable of accessing an external memory using read and write operations (example: load local memory, store local memory, etc.).
Branch:
An instruction may point to the next instruction address (for example 103 to 104), or to any other instruction address (for example 103 to 221). Going to an instruction address different from the next one is called a branch.
Basic Block:
A basic block is a set of instructions placed between two consecutive branch operations.
Branch Operation:
A branch operation is an operation capable of routing the program to a new address distinct from the next address (present address plus one).
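The definitions of branch and basic block above can be illustrated with a minimal Python sketch that cuts a linear list of operations after each branch operation. The function name `split_basic_blocks`, the operation names, and the convention that a branch is any operation whose name starts with "br" are assumptions made for this example only.

```python
def split_basic_blocks(ops):
    """Split a linear list of operation names into basic blocks.

    Assumption of this sketch: an operation is a branch operation if its
    name starts with "br". Each block ends with the branch that closes it.
    """
    blocks, current = [], []
    for op in ops:
        current.append(op)
        if op.startswith("br"):
            blocks.append(current)
            current = []
    if current:  # trailing operations with no closing branch
        blocks.append(current)
    return blocks


# Hypothetical program used only for illustration.
program = ["add", "load", "br_if_zero", "sub", "store", "br", "and"]
blocks = split_basic_blocks(program)
```

With this input the sketch yields three blocks, the first two closed by a branch and the last one left open.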
Multiport Array:
A multiport array is a set of registers where multiple registers can be written or read at the same time.
Ports:
Each set of wires necessary to perform a read or write operation is called a port. For example, a multiport array can have two write ports, and four read ports. Two distinct data in two distinct addresses can be written and four distinct data from four distinct addresses can be read at the same time.
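The port example above (two write ports, four read ports) can be modelled with a short Python sketch. The class name `MultiportArray`, the `cycle` method, and the rule that reads return the values held before the writes of the same cycle are assumptions of this sketch, not details taken from the description.

```python
class MultiportArray:
    """Register array with a fixed number of read and write ports."""

    def __init__(self, size, read_ports=4, write_ports=2):
        self.regs = [0] * size
        self.read_ports = read_ports
        self.write_ports = write_ports

    def cycle(self, writes, reads):
        """Perform one cycle of accesses.

        `writes` maps register address -> data, `reads` is a list of
        addresses. Assumption: reads see the values stored before this
        cycle's writes are applied.
        """
        if len(writes) > self.write_ports:
            raise ValueError("more writes than write ports")
        if len(reads) > self.read_ports:
            raise ValueError("more reads than read ports")
        results = [self.regs[addr] for addr in reads]
        for addr, data in writes.items():
            self.regs[addr] = data
        return results


array = MultiportArray(size=8)
array.cycle({0: 11, 1: 22}, [])          # two writes in the same cycle
values = array.cycle({}, [0, 1, 2, 3])   # four reads in the same cycle
```

A request that exceeds the number of ports, such as three writes in one cycle, is rejected rather than serialized, which keeps the model close to the hardware constraint.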
Processor Cycle:
A processor is composed of a set of logic elements, all timed (allowed to change value) at discrete instants. These instants are periodic, and the period is called the processor cycle.
Source Registers:
The input data of an instruction are in registers called source registers.
Target Register:
The result of an instruction is assigned to a register called the target register.
Shadow register:
As it is not known whether speculative operations must take place or not, results of these operations are assigned to registers different from target registers. These registers are called shadow registers.
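The role of a shadow register can be sketched in Python as follows. The definition above only states that speculative results are kept apart from the target registers. The class `RegisterFile`, the `resolve` method, and the policy of copying shadow values into the target registers when the speculation turns out to be correct (and dropping them otherwise) are assumptions of this sketch, not the mechanism claimed by the invention.

```python
class RegisterFile:
    """Target registers plus a side store for speculative results."""

    def __init__(self, size):
        self.regs = [0] * size
        self.shadow = {}  # target register index -> speculative result

    def write(self, target, value, speculative=False):
        if speculative:
            # Held aside; the target register itself is left untouched.
            self.shadow[target] = value
        else:
            self.regs[target] = value

    def resolve(self, speculation_correct):
        """Assumed policy: commit shadow values if the speculation was
        correct, otherwise discard them."""
        if speculation_correct:
            for target, value in self.shadow.items():
                self.regs[target] = value
        self.shadow.clear()


rf = RegisterFile(4)
rf.write(2, 7, speculative=True)
before_commit = rf.regs[2]   # target register not yet modified
rf.resolve(speculation_correct=True)
after_commit = rf.regs[2]    # speculative result now visible

rf.write(3, 9, speculative=True)
rf.resolve(speculation_correct=False)
after_discard = rf.regs[3]   # speculative result dropped
```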
SUPERSCALAR PROCESSORS
Most up-to-date processors belong to the category of superscalar RISC/CISC (Reduced Instruction Set Computer/Complex Instruction Set Computer) processors. These processors comprise multiple internal processing units with a hardware mechanism for dispatching multiple instructions to said multiple processing units, each instruction comprising a single operation. The dispatching mechanism allows the execution of multiple instructions in a single processor cycle. It uses a queue of instructions and a mechanism that searches the queue for instructions capable of being executed at the same time, even though they were originally in a given sequence.
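The queue search described above can be sketched in Python. The sketch is deliberately conservative: an instruction is selected only if it has no register dependency of any kind on an earlier instruction still in the queue. The function name `select_issue_group`, the tuple layout, and the register names are assumptions of this example; actual superscalar hardware typically uses more elaborate techniques.

```python
def select_issue_group(queue, units):
    """Pick up to `units` instructions from `queue` that can start together.

    Each queue entry is a (name, sources, target) tuple. An entry is
    selected only if no earlier entry in the queue writes one of its
    sources, writes its target, or reads its target.
    """
    selected = []
    for i, (name, sources, target) in enumerate(queue):
        if len(selected) == units:
            break
        depends = any(
            t in sources or t == target or target in s
            for (_, s, t) in queue[:i]
        )
        if not depends:
            selected.append(name)
    return selected


# Hypothetical queue contents, in original program order.
queue = [
    ("load", ("r1",), "r2"),
    ("add", ("r2", "r3"), "r4"),   # needs r2 from "load"
    ("sub", ("r5", "r6"), "r7"),   # independent of everything earlier
    ("or", ("r7", "r1"), "r8"),    # needs r7 from "sub"
]
group = select_issue_group(queue, units=2)
```

Here `load` and `sub` are selected together although `add` sits between them in the original sequence.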
VLIW PROCESSORS
Very Long Instruction Word (VLIW) processors constitute another category of processors where each instruction allows the execution of multiple operations, each operation corresponding to an internal processing unit. VLIW processors are simpler than superscalar processors in that the feeding of the multiple execution units is already done at the instruction level.
The basic principles of VLIW processors are described in a publication entitled “Super Scalar Microprocessor Design” by Mike Johnson (Prentice Hall Series in Innovative Technology 1991 p.25). In VLIW processors, a single instruction specifies more than one concurrent operation. In comparison to scalar processors, because a single VLIW instruction can specify multiple operations (in lieu of multiple scalar instructions), VLIW processors are capable of reducing the number of instructions for a program. However, in order for the VLIW processor to sustain an average number of cycles per instruction comparable to the rate of a scalar processor, the operations specified by a VLIW instruction must be independent from one another. Otherwise, the VLIW instruction is similar to a sequential, multiple operation CISC (Complex Instruction Set Computer) instruction, and the number of cycles per instruction goes up accordingly.
As the name implies, the instruction of a VLIW processor is normally quite large, taking many bits to encode multiple operations. VLIW processors rely on software to pack the collection of operations representing a program into instructions. To accomplish this, software uses a technique called compaction. The more densely the operations can be compacted (that is, the fewer the number of instructions used for a given set of operations), the better the performance and the better the encoding efficiency. During compaction, null operation fields are used for instruction fields that cannot be used. In essence, compaction serves as a limited form of out of order issue, because operations can be placed into instructions in many different orders. To compact instructions, software must be able to detect independent operations, and this can restrict the processor architecture, the application, or both.
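To make the idea of compaction concrete, the following Python sketch packs operations into fixed-width instructions and fills unused fields with null operations. It is a simplified, in-order greedy packer and does not reorder operations as a real compactor may. The names `Op`, `NOP`, `independent` and `compact`, as well as the register names, are assumptions of this sketch.

```python
from collections import namedtuple

Op = namedtuple("Op", ["name", "sources", "target"])
NOP = Op("nop", (), None)  # null operation for unused fields


def independent(op, packed):
    """True if `op` has no register dependency on any operation in `packed`."""
    for other in packed:
        if other.target in op.sources:  # read after write
            return False
        if op.target is not None and op.target == other.target:  # write after write
            return False
        if op.target in other.sources:  # write after read (conservative)
            return False
    return True


def compact(ops, width):
    """Pack `ops` in program order into instructions of `width` fields."""
    instructions, current = [], []
    for op in ops:
        if len(current) == width or not independent(op, current):
            # Close the current instruction, padding with null operations.
            instructions.append(current + [NOP] * (width - len(current)))
            current = []
        current.append(op)
    if current:
        instructions.append(current + [NOP] * (width - len(current)))
    return instructions


# Hypothetical operations used only for illustration.
ops = [
    Op("add", ("r1", "r2"), "r3"),
    Op("sub", ("r4", "r5"), "r6"),
    Op("and", ("r3", "r6"), "r7"),  # depends on add and sub
    Op("or", ("r1", "r4"), "r8"),
]
packed = compact(ops, width=3)
names = [[op.name for op in instr] for instr in packed]
```

In this example four operations fit in two instructions of three fields each, with one null operation per instruction.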
SIMULTANEOUS EXECUTION OF INSTRUCTIONS
In the search for high performance, both superscalar and VLIW processors try to split the code into “basic blocks”. These basic blocks are sets of instructions placed between two consecutive branch operations and which present no data dependency or resource conflict. These basic blocks allow simultaneous execution of all instructions within the block, and can be packed in a smaller number of VLIW instructions. Present examples of code running in real-time VLIW processors show that the size of basic blocks may be small. This leads to unused operations within the instructions used to perform the basic blocks. However, these empty operations may be filled with operations coming from other basic blocks.
SPECULATIVE INSTRUCTIONS
In an ongoing search for performance, it has been made possible to use the empty fields left in VLIW instructions for performing operations belonging to other basic blocks. These operations, displaced from one block to another and executed without knowing whether they should be performed, are called speculative operations.
FIG. 1 describes an instruction in a VLIW processor capable of performing simultaneously three operations:
1. an ALU operation (115), with data in source registers R1 (102) and R2 (103), result written into target register R2, with a speculative flag S (111), and an identification of the source of the instruction B (112) (“Y” (YES) or “N” (NO) side of the next branch).
2. an ALU or memory operation (116), with data in source registers R3 (107) and R4 (108), result written in target register R4, with a speculative operation flag S (113), and an identification of the source of the instruction B (114) (“Y” (YES) or “N” (NO) side of the next branch).
3. a branch operation (100).
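The three-operation instruction listed above can be modelled as a small Python data structure. The class and attribute names (`SlotOperation`, `VliwInstruction`, `branch_side`, and so on) are assumptions of this sketch, and no bit widths or encodings are implied since the text gives none. The only behaviour taken from the description is that the second source register of each operation also receives the result.

```python
from dataclasses import dataclass


@dataclass
class SlotOperation:
    opcode: str        # operation to perform, e.g. an ALU operation
    source_a: str      # first source register (R1 or R3 in FIG. 1)
    source_b: str      # second source register, also the target (R2 or R4)
    speculative: bool  # S flag: True for a speculative operation
    branch_side: str   # B field: "Y" or "N" side of the next branch

    @property
    def target(self):
        # Per the description, the result is written to the second source.
        return self.source_b


@dataclass
class VliwInstruction:
    first: SlotOperation   # ALU operation
    second: SlotOperation  # ALU or memory operation
    branch: str            # branch operation, kept opaque in this sketch


# Hypothetical instruction used only for illustration.
instr = VliwInstruction(
    first=SlotOperation("add", "R1", "R2", speculative=True, branch_side="Y"),
    second=SlotOperation("load", "R3", "R4", speculative=False, branch_side="N"),
    branch="branch",
)
speculative_targets = [
    op.target for op in (instr.first, instr.second) if op.speculative
]
```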
The instruction described in FIG. 1 comprises the following fields:
A: an ALU operation field (101);
R1, R2, R3, R4: register fields (102, 103, 107, 108);
S: speculative operation flags (111 and 113);
S=1 speculative operation
S=0 non speculative (normal) o
Jacob Francois
Pauporte Andre
Donaghue Larry D.
International Business Machines Corporation
Scully Scott Murphy & Presser
Underweiser, Esq. Marian