Processor for making more efficient use of idling components...

Electrical computers and digital processing systems: processing – Instruction issuing – Simultaneous issuance of multiple instructions

Reexamination Certificate

Details

C712S210000, C712S212000

active

06360312

ABSTRACT:

This application is based on an application No. 10-083369 filed in Japan, the content of which is hereby incorporated by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a processor that executes a plurality of instructions in parallel and to a program conversion apparatus for the same.
2. Description of the Related Art
In recent years, VLIW (Very Long Instruction Word) processors have been developed with the aim of achieving high-speed processing. These processors use long-word instructions composed of a plurality of instructions to execute a number of instructions in parallel.
Japanese Laid-Open Patent No. 5-11979 discloses an example of this kind of technique. FIG. 1 is a block diagram of a processor disclosed in this document.
The processor of FIG. 1 includes a register file 1, an external memory 2, an instruction register 3 having four instruction slots, an input switching circuit 4, a transfer unit 5, an integer calculation unit 6, a transfer unit 7, an integer calculation unit 8, an integer calculation unit 9, a floating-point unit 10, a branch unit 11, an output switching circuit 12 and a register file or external memory 13.
The instruction register 3 stores four instructions, which make up one long-word instruction, in its four internal instruction slots (hereafter referred to as ‘slots’). Here, the instruction in each of the first and second slots is either an integer calculating instruction or a data transfer instruction (also referred to as a load/store instruction). The instruction in the third slot is a floating-point calculating instruction or an integer calculating instruction and that in the fourth slot is a branch instruction. The arrangement of instructions in one long-word instruction is performed in advance by a compiler.
The transfer unit 5 and the integer calculation unit 6 are aligned with the first slot, and execute the data transfer and integer calculating instructions respectively.
The transfer unit 7 and the integer calculation unit 8 are aligned with the second slot, and execute the data transfer and integer calculating instructions respectively.
The integer calculation unit 9 and the floating-point unit 10 are aligned with the third slot, and execute the integer calculation and floating-point instructions respectively.
The branch unit 11 is aligned with the fourth slot and executes branch instructions.
Here, the transfer units 5 and 7, the integer calculation units 6, 8 and 9, the floating-point unit 10 and the branch unit 11 are generally referred to as functional units.
The input switching circuit 4 inputs source data read from the register file 1 or the external memory 2 into the required functional units.
The output switching circuit 12 outputs the results of calculations by the utilized functional units to the register file or external memory 13.
A processor constructed as above decodes and executes instructions stored in the four slots in parallel. Assume, for example, that an ‘add’ instruction for adding register data is stored in the first slot. The processor inputs two pieces of register data from the register file 1 into the integer calculation unit 6 via the input switching circuit 4. The two pieces of register data are then added by the integer calculation unit 6 and the result stored in the register file 13 via the output switching circuit 12. Instructions in the second, third and fourth slots are also decoded and executed in parallel with this instruction.
However, in this kind of conventional processor, certain functional units are left idling when instructions are executed. When an integer calculating instruction in the third slot is executed, for example, the floating-point unit is left idling.
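The idling described above can be made concrete with a small sketch. The slot-to-unit table below follows the description of FIG. 1; the dictionary layout, the instruction-kind labels ("transfer", "integer", "float", "branch") and the function name `dispatch` are assumptions introduced only for illustration.

```python
# Functional units aligned with each slot of the conventional processor (FIG. 1).
# The instruction-kind keys and the table layout are illustrative assumptions.
SLOT_UNITS = {
    1: {"transfer": "transfer unit 5", "integer": "integer calculation unit 6"},
    2: {"transfer": "transfer unit 7", "integer": "integer calculation unit 8"},
    3: {"integer": "integer calculation unit 9", "float": "floating-point unit 10"},
    4: {"branch": "branch unit 11"},
}


def dispatch(long_word):
    """Return (busy, idle) unit lists for one long-word instruction.

    `long_word` maps a slot number to the kind of instruction held in that slot.
    """
    busy, idle = [], []
    for slot, units in SLOT_UNITS.items():
        kind = long_word[slot]
        for unit_kind, unit in units.items():
            # Only the unit matching the slot's instruction kind does work.
            (busy if unit_kind == kind else idle).append(unit)
    return busy, idle


busy, idle = dispatch({1: "integer", 2: "transfer", 3: "integer", 4: "branch"})
print("busy:", busy)
print("idle:", idle)
```

With an integer calculating instruction in the third slot, the floating-point unit 10 appears in the idle list, along with whichever unit in the first and second slots does not match the instruction held there.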
SUMMARY OF THE INVENTION
An object of the present invention is to provide a processor that utilizes idling functional units, thus improving processing performance.
A second object is to provide a processor that executes at a high speed the product-sum operations frequently used in current multimedia processing.
A processor that achieves the above objects includes first and second decoding units, first and second executing units corresponding to the first and second decoding units, and a selecting unit. The first and second decoding units decode instructions and generate results denoting their content. If the first decoding unit decodes a special instruction, it generates first-part and second-part decode results denoting a first-type calculation and a second-type calculation. The executing units execute instructions in parallel according to a decode result from the corresponding decoding unit. If the first decoding unit decodes the special instruction, the selecting unit selects the second-part decode result, and if the first decoding unit decodes an instruction other than the special instruction, the selecting unit selects the decode result from the second decoding unit.
The second executing unit includes a first functional unit, which executes instructions according to the decode result selected by the selecting unit, and a second functional unit, which executes instructions according to the decode result of the second decoding unit. If the special instruction is decoded, the first executing unit performs a first-type calculation, the first functional unit performs a second-type calculation and the second functional unit executes an instruction decoded by the second decoding unit.
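The routing of decode results described above can be sketched as follows. This is a minimal behavioral model, not the circuit itself: decode results are modeled as plain strings, the mnemonic `addsub` is a hypothetical name for the special instruction, and addition and subtraction are assumed as the first-type and second-type calculations. Each functional unit is also assumed to ignore operations it does not implement.

```python
SPECIAL = "addsub"  # hypothetical mnemonic for the special instruction


def decode_first(instr):
    """First decoding unit: a special instruction yields two decode results."""
    if instr == SPECIAL:
        # Assumed calculation types: addition (first) and subtraction (second).
        return {"first_part": "add", "second_part": "sub"}
    return {"first_part": instr, "second_part": None}


def select(first_decode, second_decode):
    """Selecting unit: pick the decode result that drives the first functional unit."""
    if first_decode["second_part"] is not None:
        return first_decode["second_part"]
    return second_decode


def issue(first_instr, second_instr):
    """Return the operation handed to each unit for one pair of instructions."""
    first_decode = decode_first(first_instr)
    second_decode = second_instr  # second decoding unit, a pass-through in this sketch
    return {
        "first_executing_unit": first_decode["first_part"],
        "first_functional_unit": select(first_decode, second_decode),
        "second_functional_unit": second_decode,
    }


print(issue("addsub", "mul"))
print(issue("add", "sub"))
```

In the first call, the first functional unit is driven by the second-part decode result while the second functional unit still executes the instruction from the second decoding unit. In the second call, no special instruction is present, so the selecting unit passes the second decoding unit's result through.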
Here, the special instruction may include an operation code denoting the first-type calculation and the second-type calculation, and first and second operands. The first executing unit performs the first-type calculation on the first and second operands, and stores a calculation result in the first operand. Meanwhile, the second executing unit performs the second-type calculation on the first and second operands, and stores a calculation result in the second operand.
This structure enables a first-type calculation and a second-type calculation to be executed by the first and second executing units according to a special instruction in one instruction slot. This allows idling functional units to be used, thus increasing processing performance.
Here, the first executing unit may include an adder/subtracter, the first functional unit be an adder/subtracter and the special instruction denote addition as the first-type calculation and subtraction as the second-type calculation.
This structure enables an instruction other than the special instruction to be executed in parallel with the addition and subtraction denoted by the special instruction, so that the processing performance of the processor can be further increased.
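The effect of such a special instruction on its two operands can be illustrated with a short sketch. The register names, the function name `execute_special` and the operand order of the subtraction (first operand minus second operand) are assumptions; the text states only that the sum is stored in the first operand and the difference in the second.

```python
def execute_special(regs, op1, op2):
    """Apply a hypothetical add/subtract special instruction to `regs` in place."""
    # Both units read the operand values before either result is written back.
    a, b = regs[op1], regs[op2]
    regs[op1] = a + b  # first executing unit: first-type calculation (addition)
    regs[op2] = a - b  # first functional unit: second-type calculation (subtraction, order assumed)
    return regs


regs = {"r0": 7, "r1": 3}
execute_special(regs, "r0", "r1")
print(regs)  # {'r0': 10, 'r1': 4}
```

Reading both operands before writing either result models the two calculations running in parallel on the same inputs rather than one after the other.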
Here, the second functional unit is a multiplier and the instruction executed in parallel with the special instruction is a multiply instruction.
This structure enables addition, subtraction and multiplication to be executed in parallel, so that product-sum calculations extensively used in modern multimedia processing can be executed efficiently.
Furthermore, a program conversion apparatus that achieves the above objects is one that changes a source program to an object program for a target processor executing long-word instructions. This program conversion apparatus includes a retrieving unit, a generating unit and an arranging unit. The retrieving unit retrieves a pair of instructions denoting a first-type calculation of two variables and a second-type calculation of the same two variables from a source program. The generating unit generates a special instruction corresponding to the retrieved pair. This special instruction includes an operation code denoting the first-type calculation and the second-type calculation, and two operands representing the two variables. The arranging unit arranges the generated special instruction into a long-word instruction.
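The three units of the program conversion apparatus can be sketched roughly as below. This is a simplified illustration under several assumptions: instructions are modeled as `(operation, variable, variable)` tuples, the mnemonics `add`, `sub`, `mul` and `addsub` are hypothetical, a long-word instruction is assumed to have two slots, and data-dependence analysis, destination handling and no-op padding are omitted entirely.

```python
def retrieve_pairs(program):
    """Retrieving unit: find (add, sub) index pairs that use the same two variables."""
    pairs, used = [], set()
    for i, (op_i, x, y) in enumerate(program):
        if op_i != "add" or i in used:
            continue
        for j, (op_j, u, v) in enumerate(program):
            if j not in used and op_j == "sub" and (u, v) == (x, y):
                pairs.append((i, j))
                used.update((i, j))
                break
    return pairs


def generate(program, pairs):
    """Generating unit: replace each retrieved pair with one special instruction."""
    first = {i for i, _ in pairs}
    drop = {j for _, j in pairs}
    out = []
    for k, (op, x, y) in enumerate(program):
        if k in drop:
            continue  # the subtraction is folded into the special instruction
        out.append(("addsub", x, y) if k in first else (op, x, y))
    return out


def arrange(instructions, slots=2):
    """Arranging unit: pack instructions into long-word instructions of `slots` slots."""
    return [tuple(instructions[k:k + slots]) for k in range(0, len(instructions), slots)]


source = [("add", "a", "b"), ("mul", "c", "d"), ("sub", "a", "b"), ("mul", "e", "f")]
pairs = retrieve_pairs(source)
long_words = arrange(generate(source, pairs))
print(long_words)
```

In this example the addition and subtraction of `a` and `b` collapse into one special instruction, which then shares a long-word instruction with a multiply. A real converter would also have to confirm that merging the pair preserves the program's data dependences.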
This structure generates an object program, composed of a plurality of long-word instructions. Special instructions supported by the target processor are embedded in certain of the plurality of long-word instruction
