Pipeline processor capable of reducing branch hazards with...

Electrical computers and digital processing systems: processing – Processing control – Branching

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C712S234000

Reexamination Certificate

active

06189092

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a processor for executing machine-language instruction sequences using pipeline processing, and especially relates to a processor for executing branch processing at high speed.
2. Description of the Prior Art
Pipeline processing is known as one of the fundamental techniques for achieving high-speed processing by a Central Processing Unit (CPU: hereinafter processor). In pipeline processing, a process dealing with one instruction is divided into smaller stages (pipeline stages), each pipeline stage being processed in parallel to speed up the processing. However, this technique is not effective in executing branch instructions which are used in loops because a stall can occur. This phenomenon is called a branch hazard. Due to branch hazards, the operational performance of pipeline processing does not reach an optimal level.
A specific example of a program where a branch hazard will occur is given below. The appended comments written after the semicolons show the contents of the separate instructions.
(Instruction 1) mov 0,i ; Transfer 0 into i.
L: ; Label showing a branch target.
(Instruction 2) add a,b,c ; Transfer a+b into c.
(Instruction 3) mul a,b,d ; Transfer a×b into d.
(Instruction 4) add i,l,i ; Add 1 to i.
(Instruction 5) cmp i,3 ; Compare i with 3.
(Instruction 6) bcc L ; Branch to L if i<3.
When executing the above program, the procedure in Instructions 2-5 is looped three times. In the program, the execution of Instruction 6 is followed by three stages of fetching, decoding, and executing Instruction 2 in the next three cycles. This results in a branch hazard over two cycles between the execution of Instruction 6 and the execution of Instruction 2.
As a technique for avoiding branch hazards, a processor is disclosed in Japanese Laid-Open Patent Application No. 8-314719.
In this technique, code that includes the first instruction of a loop is stored into a buffer just before the loop is started. When the program branches from the last instruction of the loop to the first instruction, the code is retrieved from the buffer and the first instruction is decoded and executed. With such an arrangement, the first instruction does not need to be fetched from an external memory each time the loop is executed, so that the branch hazards can be avoided.
However, the conventional processor described above has a drawback that its circuit is of large scale, since the processor needs a specific circuit for avoiding the branch hazards.
First, the processor has to be equipped with an adder that is specifically used for calculating a fetch address of code that follows the code including the first instruction of the loop while the code including the first instruction is stored into the buffer just before the loop is started. The calculated fetch address is then stored into an address buffer.
The processor also has to be equipped with a subtractor that is specifically used for calculating an address of the first instruction to be decoded using the fetch address that is retrieved from the address buffer when the processing branches from the last instruction to the first instruction.
This inclusion of the adder and the subtractor results in an increase in the hardware scale of the conventional processor described above.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a pipeline processor that can reduce branch hazards with a small-scale circuit.
The above object can be fulfilled by a processor for executing a program loop at a high speed using a register instruction which is set immediately before the program loop and a loop instruction which is set at an end of the program loop, the processor including: a fetch unit for fetching code from a memory; a decoding unit for decoding an instruction included in the fetched code; and an execution unit for executing the decoded instruction, the decoding unit including a decoded instruction counter for storing and updating a pointer specifying an instruction that is being decoded, the pointer being a sum of a fixed shift value and an address of the instruction that is being decoded, and the execution unit including: a storage unit for storing, when the decoding unit decodes the register instruction, code at a start of the program loop that has been fetched by the fetch unit into a first buffer and storing the pointer stored in the decoded instruction counter into a second buffer; and a high-speed branch unit for having the fetch unit fetch code, when the loop instruction has been decoded by the decoding unit and a branch condition is satisfied, starting from an address that corresponds to the pointer stored in the second buffer and for having the decoding unit decode the code stored in the first buffer, wherein the fixed shift value is determined so that the pointer in the second buffer corresponds to an address of code that follows the code stored in the first buffer.
Here, the fixed shift value may be equal to a storage size of the first buffer.
Here, the decoded instruction counter, when initialized, may store a sum of a start address and the fixed shift value.
With the stated construction, it is no longer necessary to perform the add operation when executing a register instruction nor the subtract operation when executing a loop instruction, since the pointer that is the sum of the currently-decoded instruction address and the fixed value is stored in the decoded instruction counter. With no need to include the specific adder nor the specific subtractor, the hardware scale of the processor can be reduced. Also, the processing can be speeded up when executing the register instruction and the loop instruction, since the above address calculations are no longer necessary.
Here, the execution unit may further include a branch unit for sending, when a branch instruction with an absolute address is decoded by the decoding unit, the absolute address to a fetched instruction counter in the fetch unit and sending a value, obtained by adding the fixed shift value to the absolute address, to the decoded instruction counter.
Here, when a branch instruction with a relative address is decoded by the decoding unit, the branch unit may send a value, obtained by adding the relative address to the pointer stored in the decoded instruction counter, to the decoded instruction counter and send a value, obtained by subtracting the fixed shift value from the value which is sent to the decoded instruction counter, to the fetched instruction counter.
With the stated construction, branch instructions with an absolute address and a relative address are executed using the pointer stored in the decoded instruction counter.
Here, the fetched instruction counter may include a register for storing a fetch address and an adder for incrementing the fetch address stored in the register, wherein, when the branch instruction with the absolute address is decoded by the decoding unit, the branch unit has the adder in the fetched instruction counter add the fixed shift value to the absolute address and sends an adding result to the decoded instruction counter.
With the stated construction, when executing a branch instruction with an absolute address, the calculation of the pointer to be stored in the decoded instruction counter can be performed not by the execution unit but by the adder in the fetched instruction counter.


REFERENCES:
patent: 4463422 (1984-07-01), Storer et al.
patent: 4714994 (1987-12-01), Oklobdzija et al.
patent: 4725947 (1988-02-01), Shonai et al.
patent: 4807126 (1989-02-01), Goutu et al.
patent: 5450585 (1995-09-01), Johnson
patent: 5506976 (1996-04-01), Jaggar
patent: 5522053 (1996-05-01), Yoshida et al.
patent: 5617550 (1997-04-01), Matsuo et al.
patent: 5848268 (1998-12-01), Matsuo
patent: 5850551 (1998-12-01), Takayama et al.
patent: 270310 (1988-06-01), None
patent: 487082 (1992-05-01), None
patent: 0742518 (1996-11-01), None
“Compiling for the Horizon Instruction Set,” by J.M. Draper, Conpar 88, Manchester, UK Sep. 12-16, 1988

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Pipeline processor capable of reducing branch hazards with... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Pipeline processor capable of reducing branch hazards with..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Pipeline processor capable of reducing branch hazards with... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2592915

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.