Pipelined two-cycle branch target address cache

Electrical computers and digital processing systems: processing – Processing control – Branching

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C712S239000, C712S240000

Reexamination Certificate

active

06279105

ABSTRACT:

TECHNICAL FIELD
The present invention relates in general to data processing systems, and in particular, to the processing of branch instructions within a microprocessor.
BACKGROUND INFORMATION
In superscalar microprocessor architectures, branch instructions are executed by a separate branch processing unit in order to predict whether a particular branch instruction will be taken or not. Typically, such predictions are based on past performance pertaining to that particular branch instruction. This is implemented using a branch history table, which records such past performances.
Branch prediction improves the efficiency and speed by which instructions are executed within a microprocessor by fetching ahead of time the instructions predicted to be executed. Without branch prediction, the microprocessor would have to wait until the branch instruction was actually executed and the resultant target address calculated within one of the execution units before being able to fetch the instructions lying along the taken branch. The processor would then stall waiting for those instructions to be fetched from memory.
One type of branch prediction circuitry utilized within the branch processing unit is a branch target address cache (“BTAC”), which is a history table that associates a branch's target address with the branch instruction's address. Without a BTAC, each taken branch must be processed after it is fetched before the subsequent instructions are fetched. This normally causes at least a one cycle delay in the instruction fetch pipeline. In a standard BTAC, the BTAC is read in parallel with the instruction cache, and if there is data in the BTAC, then the address in the BTAC is used to feed the instruction cache and BTAC in the next cycle. If there is no data found in the BTAC (BTAC miss), then the sequential address is selected. This standard BTAC approach causes a limit to cycle time because it must be determined if there is a hit in the BTAC, and that hit indication must then be used to select the instruction cache address for the next cycle.
If the BTAC is made to be direct mapped (or set associative with a way prediction) and the address from the BTAC is always used, then much of the cycle time problems of a BTAC are solved. Such a solution requires that “not taken” branches be added to the BTAC, and has the following problems:
(1) “Not taken” branches must be included in the BTAC, which reduces the effective size of the BTAC;
(2) A group of instructions that do not have any “taken” branches will incur a fetch penalty the first time it is fetched; and
(3) The fact that the data out of the BTAC must feed back to the BTAC and the instruction cache causes some cycle time problems in very high frequency designs.
As a result of the foregoing, there is a need in the art for an improved BTAC design.
SUMMARY OF THE INVENTION
The present invention addresses the foregoing needs by implementing a two-cycle BTAC where the BTAC function is pipelined over two cycles. For a block of instructions, the two-cycle BTAC stores the target address of branches that are in the next block, instead of the address of the next block. The result is that the BTAC has the address of the block that follows the next block instead of the address of the next block. By having the BTAC be a two-cycle BTAC, there is enough time to do a directory compare to determine if a hit is occurring in the BTAC. If a hit does not occur, then the sequential address is fed into the instruction cache and the BTAC. Using a two-cycle BTAC solves the three problems noted above with one-cycle high frequency BTAC's.
The two-cycle BTAC will read two entries each time it is accessed, a sequential and a non-sequential entry. The sequential entry is used when the instruction that is being fetched with the BTAC entry has no “taken” branch predictions. The non-sequential entry is used when the instructions that are being fetched with the BTAC entry expects to have a “taken” branch. Because of the pipelined nature of the BTAC, it is always known when a branch is being fetched whether the branch will be predicted “taken” or “not taken”. When a branch is re-directed, the re-direction address is known from the normal branch processing function: the information that was not selected during fetch would be used the cycle after the re-direction (sequential if non-sequential is initially predicted, non-sequential if sequential is initially predicted).
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention.


REFERENCES:
patent: 5515518 (1996-05-01), Stiles et al.
patent: 6067616 (2000-05-01), Stiles et al.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Pipelined two-cycle branch target address cache does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Pipelined two-cycle branch target address cache, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Pipelined two-cycle branch target address cache will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2512395

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.