Data processor for the parallel processing of a plurality of...

Electrical computers and digital processing systems: processing – Instruction decoding – Predecoding of instruction component

Reexamination Certificate

Details

C712S215000, C712S223000

active

06256726

ABSTRACT:

BACKGROUND OF THE INVENTION
This invention relates to CPUs, such as in minicomputers or microcomputers, and particularly to a data processor suitable for use in high speed operation.
Hitherto, various means have been devised for the high speed operation of computers. A typical one is the pipeline system. The pipeline system does not complete the processing of one instruction before starting execution of the next. Instead, each instruction is divided into a plurality of stages, and instructions are executed in a bucket-relay manner: when one instruction is about to enter its second stage, the first stage of the next instruction is started. This system is described in detail in the book “ON THE PARALLEL COMPUTER STRUCTURE”, written by Shingi Tomita, published by Shokodo, pages 25 to 68. With an n-stage pipeline system, n instructions can be in execution at the same time, one at each pipeline stage, and the processing of one instruction is completed at each pipeline pitch.
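As a rough illustration of the throughput claim above, the following Python sketch counts machine cycles for an ideal n-stage pipeline. It assumes that every stage takes exactly one cycle and that no stalls occur; the function names are illustrative and do not come from the patent.

```python
def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed to finish all instructions in an ideal pipeline."""
    if num_instructions == 0:
        return 0
    # The first instruction needs num_stages cycles to drain through the
    # pipeline; every later instruction completes one pipeline pitch
    # (one cycle) after its predecessor.
    return num_stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes before the next starts."""
    return num_instructions * num_stages
```

Under these assumptions, 100 instructions on a six-stage pipeline take 105 cycles instead of 600, so the cost per instruction approaches one cycle as the instruction count grows.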
It is well known that the instruction architecture of a computer has a large effect on its processing operation and performance. From the instruction architecture point of view, computers can be grouped into two categories: CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer). The CISC processes complicated instructions by use of microinstructions, while the RISC handles only simple instructions and instead performs high speed computation using hard-wired logic control without microinstructions. The hardware and the pipeline operation of the conventional CISC and RISC are summarized below.
FIG. 2 shows the general construction of the CISC-type computer. There are shown a memory interface 200, a program counter (PC) 201, an instruction cache 202, an instruction register 203, an instruction decoder 204, an address calculation control circuit 205, a control storage (CS) 206 in which microinstructions are stored, a microprogram counter (MPC) 207, a microinstruction register 208, a decoder 209, a register MDR (Memory Data Register) 210 which exchanges data with the memory, a register MAR (Memory Address Register) 211 which indicates the operand address in the memory, an address adder 212, a register file 213, and an ALU (Arithmetic Logical Unit) 214.
The operation of the computer is described briefly. The instruction indicated by the PC 201 is read from the instruction cache and supplied through a signal 217 to the instruction register 203, where it is set. The instruction decoder 204 receives the instruction through a signal 218 and sets the head address of the microinstruction through a signal 220 in the microprogram counter 207. The address calculation control circuit 205 is instructed through a signal 219 how to calculate the address. The address calculation control circuit 205 reads the register necessary for the address calculation, and controls the address adder 212. The contents of the register necessary for the address calculation are supplied from the register file 213 through buses 226, 227 to the address adder 212. On the other hand, a microinstruction is read from the CS 206 at every machine cycle, and is decoded by the decoder 209 and used to control the ALU 214 and the register file 213. In this case, a control signal 224 is supplied thereto. The ALU 214 operates on data fed from the registers through buses 228, 229, and stores the result back in the register file 213 through a bus 230. The memory interface 200 is the circuit used for exchanging data with the memory, such as the fetching of instructions and operands.
The pipeline operation of the computer shown in FIG. 2 will be described with reference to FIGS. 3, 4 and 5. The pipeline is formed of six stages. At the IF (Instruction Fetch) stage, an instruction is read from the instruction cache 202 and set in the instruction register 203. At the D (Decode) stage, the instruction decoder 204 decodes the instruction. At the A (Address) stage, the address adder 212 calculates the address of the operand. At the OF (Operand Fetch) stage, the operand at the address pointed to by the MAR 211 is fetched through the memory interface 200 and set in the MDR 210. At the EX (Execution) stage, data is read from the register file 213 and the MDR 210, and fed to the ALU 214, where the operation is performed. At the last W (Write) stage, the calculation result is stored through the bus 230 in one register of the register file 213.
FIG. 3 shows the continuous processing of ADD instructions, which are among the basic instructions. At each machine cycle, one instruction is processed, and the ALU 214 and address adder 212 operate in parallel.
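The overlapped flow described for FIG. 3 can be sketched as a stage-occupancy chart. The Python below is a minimal model, assuming every stage takes one cycle and a new instruction enters IF every cycle; the names `CISC_STAGES`, `pipeline_chart` and `busy_stages` are illustrative and are not taken from the patent.

```python
CISC_STAGES = ["IF", "D", "A", "OF", "EX", "W"]


def pipeline_chart(num_instructions, stages=CISC_STAGES):
    """Return one row per instruction; row[c] is the stage held in cycle c."""
    total_cycles = len(stages) + num_instructions - 1
    chart = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(stages):
            # Instruction i is fetched in cycle i and advances one stage
            # per cycle, so it occupies stage s in cycle i + s.
            row[i + s] = name
        chart.append(row)
    return chart


def busy_stages(chart, cycle):
    """Set of stages that hold some instruction in the given cycle."""
    return {row[cycle] for row in chart if row[cycle] != "--"}


if __name__ == "__main__":
    for row in pipeline_chart(4):
        print(" ".join(f"{cell:>2}" for cell in row))
```

In this model, once the pipeline has filled, the A stage (address adder) and the EX stage (ALU) each hold a different instruction in the same cycle, which corresponds to the parallel operation described above.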
FIG. 4 shows the processing of the conditional branch instruction BRAcc. A flag is produced by the TEST instruction. FIG. 4 shows the flow at the time when the condition is met. Since the flag is produced at the EX stage, three waiting cycles are necessary until the jumped-to instruction is fetched. The greater the number of stages, the greater the waiting cycle count, resulting in a bottleneck in performance enhancement.
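The waiting cycle count can be reproduced with a small timing model. The sketch below assumes that the flag-setting instruction is fetched one cycle before the branch, that every stage takes one cycle, and that the jumped-to instruction is fetched in the cycle after the flag is produced. These assumptions and the function name are illustrative; the patent text gives only the resulting cycle count.

```python
def branch_wait_cycles(stages, flag_stage):
    """Waiting cycles before the branch target can be fetched.

    Assumed timeline: the flag-setting instruction is fetched in cycle 0,
    the branch in cycle 1, and the next sequential instruction would have
    been fetched in cycle 2.
    """
    # The flag setter reaches stage k in cycle k, so the flag is produced
    # in the cycle equal to the index of the flag-producing stage.
    flag_cycle = stages.index(flag_stage)
    target_fetch_cycle = flag_cycle + 1
    sequential_fetch_cycle = 2
    return target_fetch_cycle - sequential_fetch_cycle


cisc_wait = branch_wait_cycles(["IF", "D", "A", "OF", "EX", "W"], "EX")
```

For the six-stage pipeline above, `cisc_wait` evaluates to 3, matching the three waiting cycles stated in the text. Inserting further stages ahead of the flag-producing stage raises the result by one cycle per stage, and a four-stage list such as IF, R, E, W with the flag produced at E gives 1.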
FIG. 5 shows the execution flow of a complicated instruction. The instruction 1 is the complicated instruction. A complicated instruction, such as a string copy, requires a great number of memory accesses and is normally processed by extending the EX stage over many cycles. The EX stage is controlled by the microprogram, which is accessed once per machine cycle. In other words, the complicated instruction is processed by reading the microprogram a plurality of times. During this time, since one instruction occupies the EX stage, the next instruction (the instruction 2 shown in FIG. 5) is required to wait. In such a case, the ALU 214 operates at all times, and the address adder 212 idles.
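The effect of an extended EX stage on the following instruction can be sketched with a simple in-order scheduling model. The Python below assumes that each stage holds at most one instruction, that only EX may take more than one cycle, and that an instruction stays in its current stage until the next stage is free. The microprogram itself is not modeled, and the name `schedule` is illustrative.

```python
STAGES = ["IF", "D", "A", "OF", "EX", "W"]
EX = STAGES.index("EX")


def schedule(ex_cycles):
    """Return enter[i][s]: the cycle in which instruction i enters stage s.

    ex_cycles[i] is the number of cycles instruction i spends in EX;
    every other stage is assumed to take one cycle.
    """
    last = len(STAGES) - 1
    enter = []
    for i, ex in enumerate(ex_cycles):
        row = []
        for s in range(len(STAGES)):
            if s == 0:
                ready = 0
            else:
                # Time at which this instruction finishes the previous stage.
                prev_duration = ex if s - 1 == EX else 1
                ready = row[s - 1] + prev_duration
            if i > 0:
                prev = enter[i - 1]
                # Stage s is freed when the previous instruction moves on
                # to stage s + 1 (or finishes W, which takes one cycle).
                freed = prev[s + 1] if s < last else prev[s] + 1
                ready = max(ready, freed)
            row.append(ready)
        enter.append(row)
    return enter
```

With `schedule([4, 1])`, the first instruction holds EX for four cycles and the second enters EX in cycle 8 rather than cycle 5. During that wait no new instruction reaches the A stage in this model, so the address adder has nothing to do.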
The RISC-type computer will hereinafter be described.
FIG. 6 shows the general construction of the RISC-type computer. There are shown a memory interface 601, a program counter 602, an instruction cache 603, a sequencer 604, an instruction register 605, a decoder 606, a register file 607, an ALU 608, an MDR 609, and an MAR 610.
FIG. 7 shows the process flow for the basic instructions. At the IF (Instruction Fetch) stage, the instruction pointed to by the program counter 602 is read from the instruction cache and set in the instruction register 605. The sequencer 604 controls the program counter 602 in response to an instruction signal 615 and a flag signal 616 from the ALU 608. At the R (Read) stage, the contents of the instruction pointer register are transferred through buses 618, 619 to the ALU 608. At the E (Execution) stage, the ALU 608 performs an arithmetic operation. Finally, at the W (Write) stage, the calculated result is stored in the register file 607 through a bus 620.
In the RISC-type computer, instructions are limited to basic instructions. Arithmetic operations are performed only between registers, and the instructions that include an operand fetch are limited to the load instruction and the store instruction. A complicated instruction can be realized by a combination of basic instructions. Without use of microinstructions, the contents of the instruction register 605 are decoded directly by the decoder 606 and used to control the ALU 608 and so on.
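The idea that a complicated instruction can be built from basic instructions can be sketched with a toy register machine. The instruction names (`load`, `store`, `add`), the tuple encoding and the helper `expand_add_mem` below are hypothetical and serve only as an illustration; the patent does not define an instruction set at this point.

```python
def run(program, regs, mem):
    """Execute a list of (op, *args) tuples on a toy register machine."""
    for op, *args in program:
        if op == "load":      # load rd, addr: the only op that reads memory
            rd, addr = args
            regs[rd] = mem[addr]
        elif op == "store":   # store rs, addr: the only op that writes memory
            rs, addr = args
            mem[addr] = regs[rs]
        elif op == "add":     # add rd, ra, rb: register-to-register only
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs, mem


def expand_add_mem(dst_addr, src_addr, tmp_a="r1", tmp_b="r2"):
    """Expand a hypothetical memory-to-memory add into basic instructions."""
    return [
        ("load", tmp_a, dst_addr),
        ("load", tmp_b, src_addr),
        ("add", tmp_a, tmp_a, tmp_b),
        ("store", tmp_a, dst_addr),
    ]
```

Running `run(expand_add_mem(0, 1), {}, {0: 5, 1: 7})` leaves 12 at memory address 0. The arithmetic itself touches only registers, and memory is accessed solely through the load and store steps.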
FIG. 7 shows the process flow for a register-to-register arithmetic operation. The pipeline is formed of four stages since the instruction is simple.
FIG. 8 shows the process flow at the time of a conditional branch. As compared with the CISC-type computer, the number of pipeline stages is small, and thus the waiting cycle time is only one cycle. In this case, in addition to the inter-register operation, it is
