Electrical computers and digital processing systems: processing – Processing control – Processing sequence control
Reexamination Certificate
2000-03-27
2003-12-02
Pan, Daniel H. (Department: 2183)
C712S229000, C712S203000, C712S215000, C712S219000
active
06658560
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention generally relates to a microprocessor with a very long instruction word (VLIW), superscalar or out-of-order completion architecture. More particularly, the present invention relates to a program translator and a processor that realize parallel processing down to the level of individual instructions by making efficient use of execution units.
In recent years, various microprocessors, such as VLIW, superscalar and out-of-order completion types, have been developed one after another to execute multiple instructions at a time more rapidly.
Some compilers that target a VLIW microprocessor define an instruction set and then parallelize the instructions included in the set so as to satisfy various constraints concerning the availability of execution units of the microprocessor or instruction slots of a long instruction word.
A program translator of this type is disclosed, for example, in Japanese Laid-Open Publication No. 5-265769.
If a source program shown at the top of FIG. 6 is compiled using a prior art program translator, an instruction set shown in the middle of FIG. 6 is generated from the source program. Next, the instructions included in this instruction set are parallelized to generate a set of long instruction words with a step number of 2 as shown at the bottom of FIG. 6. In the second instruction slot of each long instruction word, a no-operation instruction (NOP) is inserted.
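The NOP padding described above can be illustrated with a minimal sketch. It assumes a hypothetical two-slot long instruction word in which each slot is tied to one execution unit (units "A" and "B" are invented names), and it assumes the instructions are mutually independent. It does not reproduce the program of FIG. 6.

```python
from typing import List, Optional, Tuple

# An instruction is modeled as (text, execution unit it designates).
Instr = Tuple[str, str]

# Assumption: slot 0 feeds unit "A" and slot 1 feeds unit "B".
SLOT_UNITS = ("A", "B")
NOP: Instr = ("nop", "")


def _close(word: List[Optional[Instr]]) -> List[Instr]:
    """Fill every empty slot of a long instruction word with a NOP."""
    return [instr if instr is not None else NOP for instr in word]


def pack(instrs: List[Instr]) -> List[List[Instr]]:
    """Greedily pack independent instructions, in program order, into words."""
    words: List[List[Instr]] = []
    current: List[Optional[Instr]] = [None] * len(SLOT_UNITS)
    for instr in instrs:
        slot = SLOT_UNITS.index(instr[1])
        if current[slot] is not None:
            # The slot for this unit is taken, so start a new word.
            words.append(_close(current))
            current = [None] * len(SLOT_UNITS)
        current[slot] = instr
    if any(instr is not None for instr in current):
        words.append(_close(current))
    return words


if __name__ == "__main__":
    # Two instructions that both designate unit "A" cannot share a word.
    for word in pack([("add r1, r2, r3", "A"), ("add r0, r0, r0", "A")]):
        print(word)
```

With two instructions that both designate unit "A", the sketch produces two long instruction words, each carrying a NOP in its second slot, which is the situation the passage describes.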
Also, if a program shown in FIG. 25 is executed using a conventional superscalar processor, then the processor executes the instructions in 5 cycles by pipelining shown in FIG. 34.
Furthermore, if a program shown in FIG. 31 is executed using another conventional processor including a multiplier that can perform multiplication in 3 cycles, then the processor executes the instructions in 7 cycles by pipelining shown in FIG. 35.
The prior art program translators, however, have various shortcomings. For example, an instruction set generated from a source program is not always executable at a high level of parallelism, because a target processor with a limited number of execution units often imposes constraints. Accordingly, many NOPs must be inserted to parallelize the instructions, which is a serious obstacle to performance enhancement.
Also, in the prior art superscalar processor, even if multiple instructions are decoded at a time, only some of these instructions are executable because the available execution units are limited. Thus, the resultant performance is not fully satisfactory, either.
Furthermore, in still another prior art processor, if an execution unit has to perform a sequence of operations each taking several clock cycles to execute, then succeeding operations cannot be started until these operations are completed. As a result, the performance of such a processor suffers.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a program translator that can obtain a set of instructions that have been parallelized to a high level of parallelism.
Another object of the present invention is to provide a processor that can perform computational processing rapidly by making more efficient use of execution units.
To achieve these objects, according to the present invention, if there are two instructions that designate the same execution unit as their target, then one of the two instructions is replaced with another instruction that designates a different execution unit.
A program translator according to the present invention includes instruction exchanging means for exchanging one of instructions included in a program for another instruction.
The latter instruction specifies an operation equivalent to that specified by the former instruction and designates, as a target of the operation, an execution unit that is different from the execution unit designated as a target by the former instruction. The program translator further includes instruction parallelizing means for placing the instructions in the program, in which the former instruction has been exchanged for the latter instruction by the exchanging means, at locations where they are executable in parallel by a processor.
In one embodiment of the invention, the exchanging means may include equivalent instruction storage means for storing multiple instructions that specify equivalent operations but designate mutually different execution units as targets of the operations; instruction identifying means for identifying at least one of the instructions included in the program with one of the instructions stored on the storage means; and instruction replacing means for replacing the at least one instruction, which has been identified by the identifying means, with another one of the instructions that is also stored on the storage means but is different from the at least one instruction.
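The exchanging means of this embodiment can be sketched as a table lookup followed by a replacement. The sketch below is illustrative only: the unit names ("ALU", "SHIFTER"), the instruction texts and the rule of exchanging the later of two adjacent instructions that designate the same unit are assumptions, not details taken from the specification. Doubling a register by addition or by a one-bit left shift is used as a hypothetical pair of equivalent operations.

```python
from typing import List, Optional, Tuple

# An instruction is modeled as (text, execution unit it designates).
Instr = Tuple[str, str]

# Equivalent instruction storage (hypothetical contents): each group holds
# instructions that specify the same operation on different execution units.
EQUIVALENTS: List[List[Instr]] = [
    [("add r0, r0, r0", "ALU"), ("shl r0, 1", "SHIFTER")],
]


def identify(instr: Instr) -> Optional[List[Instr]]:
    """Identifying step: find the stored group that contains `instr`."""
    for group in EQUIVALENTS:
        if instr in group:
            return group
    return None


def replace(instr: Instr) -> Instr:
    """Replacing step: return a stored equivalent on a different unit."""
    group = identify(instr)
    if group is None:
        return instr  # no stored equivalent; keep the instruction as-is
    for candidate in group:
        if candidate[1] != instr[1]:
            return candidate
    return instr


def exchange(program: List[Instr]) -> List[Instr]:
    """Exchange the later of two adjacent instructions on the same unit."""
    out: List[Instr] = []
    for instr in program:
        if out and out[-1][1] == instr[1]:
            instr = replace(instr)
        out.append(instr)
    return out


if __name__ == "__main__":
    print(exchange([("add r1, r2, r3", "ALU"), ("add r0, r0, r0", "ALU")]))
```

After the exchange, the two instructions designate different execution units, so a parallelizing step can place them in the same long instruction word instead of padding two words with NOPs.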
In another embodiment of the present invention, the program translator may further include parallelism-level calculating means for calculating a parallelism level of the instructions that have been parallelized by the instruction parallelizing means.
In still another embodiment, the instruction exchanging means may include equivalent instruction set storage means for storing multiple instruction sets specifying mutually equivalent operations. If two of the instruction sets each designate the same set of execution units as targets of their operations in the same order, these two instruction sets belong to the same group of instructions. The instruction exchanging means may further include: instruction subset identifying means for identifying a subset of the program with one of the instruction sets stored on the storage means; instruction group selecting means for selecting an instruction group that is different from a group to which the instruction set, identified by the identifying means with the instruction subset, belongs; and instruction set replacing means for replacing the instruction subset, which has been identified by the identifying means, with an instruction set included in the instruction group, which has been selected by the selecting means.
Another program translator according to the present invention includes: instruction parallelizing means for generating a set of parallelized instructions by placing instructions at locations where they are executable in parallel by a processor; equivalent instruction storage means for storing multiple instructions that specify equivalent operations but designate mutually different execution units as targets of the operations; no-operation instruction finding means for finding a no-operation instruction among the parallelized instructions located in a predetermined range of the parallelized instruction set; substitute instruction selecting means for selecting, if one of the parallelized instructions including the no-operation instruction found is the same as one of the instructions stored on the storage means, a substitute one of the instructions, which is also stored on the storage means but is different from the instruction included in the parallelized instructions; and instruction replacing means for replacing the instruction included in the parallelized instructions with the substitute instruction selected by the selecting means.
In one embodiment of the present invention, the program translator may further include: effective range searching means for searching the parallelized instruction set for a subset of instructions, which does not cause register conflict with any of the parallelized instructions; and second no-operation instruction finding means for finding a no-operation instruction from parallelized instructions included in the instruction subset that has been found by the searching means. The replacing means replaces the no-operation instruction, which has been found by the second finding means, with the instruction that has been selected by the selecting means.
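One possible reading of this post-parallelization pass is sketched below. It is an interpretation under stated assumptions, not the claimed implementation: the two-slot word layout, the unit names, the `search_range` parameter, the equivalence table and the register-conflict rule are all hypothetical. For each NOP, the sketch searches the following words within the range for an instruction whose stored equivalent designates the unit of the NOP's slot, checks for register conflicts, and then replaces the NOP with the substitute.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    text: str
    unit: str
    reads: FrozenSet[str] = frozenset()
    writes: FrozenSet[str] = frozenset()


SLOT_UNITS = ("ALU", "SHIFTER")  # assumption: slot i feeds SLOT_UNITS[i]
NOP = Instr("nop", "")

# Hypothetical equivalent instruction storage: doubling r0 on either unit.
DOUBLE_ALU = Instr("add r0, r0, r0", "ALU", frozenset({"r0"}), frozenset({"r0"}))
DOUBLE_SHL = Instr("shl r0, 1", "SHIFTER", frozenset({"r0"}), frozenset({"r0"}))
EQUIVALENTS: Dict[Instr, List[Instr]] = {
    DOUBLE_ALU: [DOUBLE_SHL],
    DOUBLE_SHL: [DOUBLE_ALU],
}


def conflicts(a: Instr, b: Instr) -> bool:
    """True if a and b share a register in a way that forbids reordering."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def find_substitute(words: List[List[Instr]], i: int, unit: str,
                    search_range: int) -> Optional[Tuple[int, int, Instr]]:
    """Search words i+1 .. i+search_range for a movable equivalent."""
    for j in range(i + 1, min(i + 1 + search_range, len(words))):
        for src_slot, cand in enumerate(words[j]):
            for sub in EQUIVALENTS.get(cand, []):
                if sub.unit != unit:
                    continue
                # Every other instruction in words i..j must be conflict-free.
                others = [x for k in range(i, j + 1)
                          for s, x in enumerate(words[k])
                          if (k, s) != (j, src_slot) and x != NOP]
                if not any(conflicts(sub, x) for x in others):
                    return j, src_slot, sub
    return None


def fill_nops(words: List[List[Instr]], search_range: int = 1) -> List[List[Instr]]:
    """Replace NOPs with substitutes and drop words left holding only NOPs."""
    words = [list(w) for w in words]
    for i, word in enumerate(words):
        for slot, instr in enumerate(word):
            if instr != NOP:
                continue
            found = find_substitute(words, i, SLOT_UNITS[slot], search_range)
            if found is not None:
                j, src_slot, sub = found
                word[slot] = sub
                words[j][src_slot] = NOP
    return [w for w in words if any(x != NOP for x in w)]
```

The conflict check is deliberately conservative: an instruction is moved only if it shares no written register with any instruction it would pass or join, which stands in for the effective range search described above.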
A processor according to the present invention includes: a first execution unit; a second execution unit; and instruction para
Matsushita Electric - Industrial Co., Ltd.
McDermott & Will & Emery
Pan Daniel H.