METHOD AND SYSTEM FOR PROCESSING PROGRAM FOR PARALLEL...

Data processing: software development – installation – and managem – Software program development tool – Translation of code

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C717S146000, C717S152000, C717S159000

Reexamination Certificate

active

06760906

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention generally relates to program processing technology applicable to a compiler for translating all the source code of a program written in a high-level language into object code, and more particularly relates to code optimization technology specially designed for a parallel processor.
As the performance and functions of various microprocessor application components have been increasingly enhanced these days, microprocessors with even higher performance are in higher and higher demand. To meet such a demand, parallel processors like very-long-instruction-word (VLIW) processors have been developed to realize still higher performance through parallel processing.
For example, a VLIW processor is supposed to process in parallel a combination of operations that are packed in each and every word of a program. The combination is made by a compiler, which extracts a number of parallelly executable instructions from a source program and then combines them into a single long instruction word, thereby shortening the time taken for the VLIW processor to execute the program.
According to the compilation technology, instructions are rearranged on a “basic block” basis, which means a set of instructions to be executed consecutively without any branch or halt. That is to say, a basic block is a collection of instructions controllable continuously from the first through last ones.
A VLIW processor, on the other hand, generally executes instructions included in a long, fixed-length word with a fixed parallelism index. Thus, the code conversion efficiency attainable by the VLIW processor is not always good. To eliminate such a problem, a VLIW processor for executing a variable number of instructions included in a variable-length word was developed recently. In the VLIW processor of this type, a set of instructions to be executed in parallel is divided into a plurality of groups on respective parallel execution boundaries, thereby making the number of instructions issued per cycle (i.e., index of parallelism) variable. In addition, the VLIW processor executes an instruction word of a variable length so as to improve the code conversion efficiency. In this specification, a group of instructions included between an adjacent pair of execution boundaries will be called an “execution unit (of instructions)”. Furthermore, the VLIW processor can also execute a plurality of units of instructions concurrently while branching and recombining the processing flow.
As described above, the processor of this type rearranges the instructions on the basic block basis. Thus, if the compilation technology is applied to the processor of this type, even a set of instructions, which could be included in a single execution unit otherwise, might be unintentionally divided into several execution units on the basic block boundaries. As a result, the program execution rate attainable by such a processor cannot be regarded sufficiently high considering the potential performance of the VLIW processor.
SUMMARY OF THE INVENTION
An object of the present invention is increasing the program execution rate of a target machine in performing program processing for parallel processing purposes.
Specifically, an inventive program processing method for parallel processing includes the step of a) subdividing each of a plurality of basic blocks, into which program code has been divided, into a multiplicity of execution units. Each of the execution units is made up of parallelly-executable instructions. The method further includes the step of b) combining two of the execution units, which are located just before and after a basic block boundary, into a single execution unit if these execution units are executable within a single cycle.
According to the present invention, if a pair of execution units, which are located just before and after a basic block boundary, are found executable within a single cycle, then these execution units are combined into a single execution unit. Even a group of instructions ranging across a basic block boundary, or covering multiple basic blocks, are executable by a target machine within a single cycle if these instructions are classified as single execution unit. Accordingly, it is more probable that a group of instructions ranging across a basic block boundary are executable in parallel, thus cutting down the number of cycles to be done by the target machine to execute a program. As a result, the program execution rate increases.
In one embodiment of the present invention, the step b) preferably includes analyzing dependence between instructions belonging to the two execution units located just before and after the basic block boundary. And it is determined based on the analyzed dependence between the instructions whether or not these execution units located just before and after the basic block boundary are combinable into the single execution unit.
In another embodiment of the present invention, an execution boundary code indicating the identity as boundary between an associated pair of the execution units is preferably added to each said basic block boundary in the step a). In the step b), if the execution units located just before and after the basic block boundary are executable within the single cycle, then the execution boundary code that has been added to the basic block boundary is preferably removed.
In still another embodiment, instructions, which impose less strict constraints on resources of a target machine for executing the program, are selected preferentially in the step a) as respective instructions belonging to first and last execution units of each said basic block.
In this particular embodiment, when the instructions belonging to the first and last execution units of the basic block are selected in the step a), an instruction that is executable only by itself a cycle by the target machine is preferably given a lower priority.
In still another embodiment, when instructions belonging to first and last execution units of each said basic block are selected in the step a), an instruction with a short word length is preferably given a higher priority.
Another inventive program processing method for parallel processing includes the step of a) subdividing each of a plurality of basic blocks, into which program code has been divided, into a multiplicity of execution units. Each of the execution units is made up of parallelly-executable instructions. In the step a), a particular one of the basic blocks is preferably subdivided into a set of execution units along with an instruction belonging to one of the other execution units that is closest to the particular basic block. The closest execution unit belongs to another set of execution units, into which another one of the basic blocks that is adjacent to, and combinable with, the particular basic block has already been subdivided.
According to the present invention, a particular basic block is subdivided into a set of execution units along with an instruction belonging to one of the other execution units that is closest to the particular basic block. The closest execution unit belongs to another set of execution units, into which another basic block that is adjacent to, and combinable with, the particular basic block has already been subdivided. Thus, a number of instructions covering multiple basic blocks across a basic block boundary are more likely to be combined into a single execution unit. Even a group of instructions ranging across a basic block boundary are executable by a target machine within a single cycle if these instructions are classified as single execution unit. Accordingly, it is more probable that a group of instructions covering several basic blocks across a basic block boundary are executable in parallel, thus cutting down the number of cycles to be done by the target machine to execute a program. As a result, the program execution rate increases.
In one embodiment of the present invention, one of the basic blocks that is located just before the particular basic block may be used a

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

METHOD AND SYSTEM FOR PROCESSING PROGRAM FOR PARALLEL... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with METHOD AND SYSTEM FOR PROCESSING PROGRAM FOR PARALLEL..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and METHOD AND SYSTEM FOR PROCESSING PROGRAM FOR PARALLEL... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3203431

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.