Methods and apparatus to dynamically reconfigure the...

Electrical computers and digital processing systems: processing – Processing control – Processing sequence control

Reexamination Certificate


Details

C712S020000, C712S024000


active

06775766

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to improvements in array processing, and more particularly to methods and apparatus for dynamically expanding and compressing the instruction pipeline of a very long instruction word (VLIW) processor.
BACKGROUND OF THE INVENTION
In an architecture, such as the manifold array (ManArray) processor architecture, very long instruction words (VLIWs) are created from multiple short instruction words (SIWs) and are stored in VLIW memory (VIM). A mechanism suitable for accessing these VLIWs, formed from SIWs 1-n, is depicted in FIG. 1A. First, a special kind of SIW, called an “execute-VLIW” (XV) instruction, is fetched from the SIW memory (SIM 10) on an SIW bus 23 and stored in instruction register (IR1) 12. When an XV instruction is encountered in the program, the VLIW indirectly addressed by the XV instruction is fetched from VIM 14 on a VLIW bus 29 and stored in VLIW instruction register (VIR) 16 to be executed in place of the XV instruction by sending the VLIW from VIR 31 to the instruction decode-and-execute units.
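The two-step access described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the `SIW` field layout, the function name `fetch_and_dispatch`, the `vim_base` parameter, and the slot names are all hypothetical, chosen only to show that the VIM address depends on the result of the SIM access.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SIW:
    """A short instruction word; this field layout is illustrative only."""
    opcode: str
    vim_offset: int = 0  # only meaningful when opcode == "XV"

def fetch_and_dispatch(sim, vim, pc, vim_base):
    """Return the operations sent to the decode-and-execute units."""
    ir1 = sim[pc]                             # first access: SIM -> IR1
    if ir1.opcode != "XV":
        return [ir1.opcode]                   # ordinary SIW executes directly
    vim_address = vim_base + ir1.vim_offset   # stored base + XV offset
    vir = vim[vim_address]                    # second access: VIM -> VIR
    return list(vir)                          # VLIW executes in place of the XV

# Hypothetical memories: a two-entry SIM and a VIM with one loaded VLIW.
sim = [SIW("ADD"), SIW("XV", vim_offset=2)]
vim = {6: ("slot0_op", "slot1_op", "slot2_op")}

print(fetch_and_dispatch(sim, vim, pc=0, vim_base=4))
print(fetch_and_dispatch(sim, vim, pc=1, vim_base=4))
```

The second call cannot compute `vim_address` until `sim[pc]` has been read, which is the serial dependency between the two memory accesses.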
Although this mechanism appears simple in concept, implementing it in a pipelined processor with a short clock period is not a trivial matter. In a pipelined processor, the execution of an instruction is broken up into a sequence of cycles, also called phases or stages, each of which can be overlapped with the cycles of another instruction's execution sequence in order to improve performance. For example, consider a reduced instruction set computer (RISC) type of processor that uses three basic pipeline cycles, namely, an instruction fetch cycle, a decode cycle, and an execute cycle which includes a write back to the register file. In this 3-stage pipelined processor, the execute cycle of one instruction may be overlapped with the decode cycle of the next instruction and the fetch cycle of the instruction after that. To maintain short cycle times, i.e. high clock rates, the logic operations done in each cycle must be minimized and any required memory accesses kept as short as possible. In addition, pipelined operation requires the same timing for each cycle, with the longest timing path among the pipeline cycles setting the cycle time for the processor. The implication of the two serial memory accesses required for the aforementioned indirect VLIW operation in FIG. 1A is that a single cycle including both memory accesses would require a lengthy cycle time not conducive to a high clock rate machine. As suggested by analysis of FIG. 1A, wherein the VIM address Offset 25 is contained within the XV instruction, the VIM access cannot begin until the SIM access has been completed, at which point the VIM address generation unit 18 can create the VIM address 27 to select the desired VLIW from VIM 14 by adding a stored base address to the XV VIM Offset value. This constraint means that if the number of stages in a typical three-stage (fetch, decode, execute) instruction pipeline is to be maintained, both accesses would have to be completed within a single clock cycle (i.e. the fetch cycle). However, due to the inherent delay associated with random memory accesses, even if the fastest semiconductor technologies available today are used, carrying this requirement into an actual implementation would restrict the maximum speed, and hence the maximum performance, that could be attained by the architecture.
On the other hand, if an additional pipeline stage were to be permanently added such that the memory accesses are divided across two pipeline fetch stages (F1 and F2), an even more undesirable effect of increasing the number of cycles it takes to execute a branch would result.
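The trade-off between the two fixed designs can be made concrete with a small worked example in Python. Every number here is an invented delay in nanoseconds, not a figure from the patent, and the branch model is an assumption: a taken branch is taken to be resolved in decode with no prediction, so each fetch stage holds one instruction that must be discarded. Placing the address addition in F2 is likewise an assumption.

```python
# All delays are made-up nanosecond figures for illustration only.
T_SIM, T_ADDR, T_VIM = 2.0, 0.5, 2.0   # SIM access, VIM address add, VIM access
T_DECODE, T_EXECUTE = 1.5, 2.0

def cycle_time(stage_delays):
    """The slowest stage sets the clock period for the whole pipeline."""
    return max(stage_delays.values())

def branch_penalty(stage_delays):
    """Assumed model: one discarded instruction per fetch stage."""
    return sum(1 for name in stage_delays if name.startswith("F"))

# Both memory accesses squeezed into one fetch cycle.
single_fetch = {"F": T_SIM + T_ADDR + T_VIM, "D": T_DECODE, "E": T_EXECUTE}
# Accesses permanently divided across two fetch stages.
split_fetch = {"F1": T_SIM, "F2": T_ADDR + T_VIM, "D": T_DECODE, "E": T_EXECUTE}

for label, stages in (("single fetch", single_fetch), ("fixed F1/F2", split_fetch)):
    print(label, cycle_time(stages), "ns cycle,",
          branch_penalty(stages), "branch bubble(s)")
```

Under these assumed figures the single fetch stage dominates the clock period, while the fixed F1/F2 split shortens the period at the cost of one more lost cycle on every taken branch.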
SUMMARY OF THE INVENTION
The present invention addresses a dynamic reconfigurable pipeline and methods of its use which avoid both of the above-described types of “delayed” and multi-cycle branch problems. Thus, this dynamic reconfigurable pipeline, as discussed further below, is highly advantageous.
A unique ManArray processor pipeline design in accordance with the present invention advantageously solves the indirect VLIW memory access problem without increasing branch latency by providing a dynamically reconfigurable instruction pipeline for SIWs requiring a VLIW to be fetched. By introducing an additional cycle in the pipeline only when a VLIW fetch is required, the present invention solves the VLIW memory access problem. The pipeline stays in an expanded state, in general, until a branch type or non-XV-VLIW type operation is detected, which returns the pipe to compressed pipeline operation. By compressing the pipeline when a branch type operation is detected, the present invention avoids the need for an additional cycle for the branch operation. Consequently, the shorter compressed pipeline provides more efficient performance for branch-intensive control code as compared to a fixed pipeline with an expanded number of stages.
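The expand and compress behavior described above can be sketched as a two-state machine. This is one reading of the paragraph, not the patented control logic: the instruction kinds `"XV"`, `"BRANCH"`, `"NON_XV_VLIW"`, and `"OTHER"` are hypothetical labels, and the sketch assumes that an ordinary SIW (`"OTHER"`) leaves the current state unchanged.

```python
COMPRESSED, EXPANDED = "compressed", "expanded"

def pipeline_states(kinds):
    """Return the pipeline state in effect for each instruction kind."""
    state = COMPRESSED
    trace = []
    for kind in kinds:
        if kind == "XV":
            state = EXPANDED            # a VLIW fetch is required
        elif kind in ("BRANCH", "NON_XV_VLIW"):
            state = COMPRESSED          # return to the shorter pipeline
        # Assumption: any other kind keeps the current state.
        trace.append(state)
    return trace

def count_expansions(trace):
    """Count compressed-to-expanded transitions, each adding one cycle."""
    previous = COMPRESSED
    total = 0
    for state in trace:
        if state == EXPANDED and previous == COMPRESSED:
            total += 1
        previous = state
    return total

program = ["OTHER", "XV", "XV", "OTHER", "BRANCH", "OTHER", "XV"]
trace = pipeline_states(program)
print(trace)
print(count_expansions(trace))
```

In this sketch the added cycle is counted only at each transition into the expanded state, so back-to-back XV instructions share a single expansion, and the branch executes in the compressed pipeline.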
In addition, the dynamic reconfigurable pipeline is scalable, allowing each processing element (PE) in an array of PEs to expand and compress the pipeline in synchronism while allowing independent indirect VLIW (iVLIW) operations in each PE. This is accomplished by having distributed pipelines operating in parallel, one in each PE and one in the controller Sequence Processor (SP).
The present invention also allows the SIW memory and VLIW memory to each have a full cycle for memory access time. This approach enables an indirect VLIW processor to achieve a higher frequency of operation because it minimizes the logic operations and the number of memory accesses required per cycle. By using this approach, a more balanced pipeline design is obtained, resulting in a micro-architecture that is more suitable for manufacturing across a wide range of process technologies.
These and other advantages of the present invention will be apparent from the drawings and Detailed Description which follow.


