Reexamination Certificate
1998-01-09
2001-09-11
Treat, William M. (Department: 2783)
Electrical computers and digital processing systems: processing
Processing control
Branching
C712S240000
active
06289441
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to the field of microprocessor architecture. Specifically, the invention relates to a method and apparatus for performing multiple branch predictions per cycle.
Reduced instruction set computers, commonly referred to as RISC processors, are among the more common computer architectures in use today. In brief, RISC processors rely on simple, low-level instructions of uniform size. Instruction execution is divided into segments that are processed in a multistage pipeline, which is structured so that multiple instructions may be in process at any given instant. For example, a five-stage pipeline may include separate stages for fetching an instruction from memory (instruction fetch stage), decoding the instruction (decode stage), fetching the operands the instruction needs (operand fetch stage), executing the instruction (execution stage) and writing the results back to the appropriate register or memory location (write-back stage). Since each stage can hold one instruction and there are five stages, up to five instructions can be in process at once in such a pipeline.
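The overlap described above can be illustrated with a minimal sketch. It assumes an idealized pipeline with no stalls, in which one new instruction enters the fetch stage every cycle; the function name `pipeline_occupancy` and the stage labels are illustrative and do not come from the patent.

```python
# Stage names follow the five-stage example in the text (labels are illustrative).
STAGES = ["fetch", "decode", "operand fetch", "execute", "write back"]


def pipeline_occupancy(num_instructions, num_stages=len(STAGES)):
    """Return the number of instructions in flight during each cycle.

    Assumes an ideal pipeline with no stalls: instruction i (0-based)
    occupies stage s during cycle i + s.
    """
    total_cycles = num_instructions + num_stages - 1
    occupancy = []
    for cycle in range(total_cycles):
        # Instruction i is in the pipeline while 0 <= cycle - i < num_stages.
        in_flight = sum(
            1 for i in range(num_instructions) if 0 <= cycle - i < num_stages
        )
        occupancy.append(in_flight)
    return occupancy


if __name__ == "__main__":
    # Eight instructions through five stages: the pipeline fills, stays
    # full with five instructions in flight, then drains.
    print(pipeline_occupancy(8))
```

Once the pipeline has filled, one instruction completes per cycle, which is the sense in which the ideal throughput is one instruction per clock cycle.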
Thus, such a RISC computer can theoretically achieve performance equivalent to executing one instruction per clock cycle. To achieve higher performance, however, more than one instruction needs to be processed in each stage. This higher level of performance can be achieved by superscalar processors. Superscalar processors are generally based on a RISC architecture and incorporate multiple instruction pipelines. For example, one superscalar processor, the UltraSPARC manufactured by Sun Microsystems, includes six separate instruction pipelines: two for floating point calculations/graphics operations, two for integer calculations, one for branch operations and one for memory operations. Theoretically, a superscalar processor having six separate pipelines can process up to six instructions per clock cycle.
Branch instructions are one factor that limits how many instructions can be processed per clock cycle in RISC, superscalar and other processors that employ instruction pipelines. When a processor executes code containing a branch instruction, the earliest the processor could possibly recognize that the branch is to be taken is at the instruction decode stage. At this point, however, the next sequential instruction has already been fetched and possibly other actions have been taken. Thus, the fetched instruction and other actions must be discarded and a new instruction (the branch target) must be fetched. This problem is compounded because branches are common. Studies have shown that branch instructions generally occur about once every five to ten instructions.
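A back-of-the-envelope estimate gives a rough sense of this cost. The sketch below assumes that every taken branch wastes a fixed number of fetch cycles. Only the branch interval of five to ten instructions comes from the text; the parameter names, the taken fraction and the penalty of one cycle are hypothetical values chosen for illustration.

```python
def effective_cpi(branch_interval, taken_fraction, penalty_cycles, base_cpi=1.0):
    """Estimate average cycles per instruction with a fixed taken-branch penalty.

    branch_interval: average number of instructions per branch (5 to 10 in the text).
    taken_fraction:  fraction of branches that are taken (hypothetical value).
    penalty_cycles:  cycles discarded per taken branch (hypothetical value).
    base_cpi:        cycles per instruction of the ideal, stall-free pipeline.
    """
    branch_frequency = 1.0 / branch_interval
    return base_cpi + branch_frequency * taken_fraction * penalty_cycles


if __name__ == "__main__":
    # Compare the two ends of the "one branch every five to ten
    # instructions" range, under the assumed 60% taken rate and 1-cycle penalty.
    for interval in (5, 10):
        print(interval, effective_cpi(interval, taken_fraction=0.6, penalty_cycles=1))
```

Under this simple model the penalty term scales with the number of cycles lost per taken branch, so deeper or wider designs that discard more work per redirected fetch lose proportionally more.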
One way designers have addressed the branch problem is to implement elaborate schemes that predict whether a branch is likely to be taken and, when it is, fetch the instruction at the branch target address rather than the next sequential instruction. One such method is described in Tse-Yu Yeh's Ph.D. dissertation, “Two-Level Adaptive Branch Prediction and Instruction Fetch Mechanisms for High Performance Superscalar Processors.” A drawback to this method, however, is that only one branch instruction is predicted per fetch cycle. While this may be acceptable for a microprocessor with a limited number of pipelines, as the number of pipelines increases, there is a greater chance of multiple branch instructions being processed in one fetch cycle.
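For readers unfamiliar with this style of prediction, the following is a simplified textbook sketch of a two-level predictor. It is not the dissertation's exact design. It assumes a single global history register (first level) that indexes a table of 2-bit saturating counters (second level); the class name, the 4-bit history length and the initial counter value are illustrative choices. Like the method discussed above, it produces only one prediction per call.

```python
class TwoLevelPredictor:
    """Simplified two-level predictor: global history + 2-bit counters."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        # First level: outcomes of the most recent branches, one bit each.
        self.history = 0
        # Second level: one 2-bit saturating counter per history pattern,
        # initialised to 1 ("weakly not taken"). Values 2 and 3 predict taken.
        self.counters = [1] * (1 << history_bits)

    def predict(self):
        """Predict the next branch from the counter selected by the history."""
        return self.counters[self.history] >= 2

    def update(self, taken):
        """Train the selected counter, then shift the outcome into the history."""
        value = self.counters[self.history]
        self.counters[self.history] = min(3, value + 1) if taken else max(0, value - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def run(pattern, repeats):
    """Feed a repeating outcome pattern; return per-branch prediction correctness."""
    predictor = TwoLevelPredictor()
    results = []
    for taken in pattern * repeats:
        results.append(predictor.predict() == taken)
        predictor.update(taken)
    return results


if __name__ == "__main__":
    # A loop-like pattern: taken, taken, not taken, repeated.
    results = run([True, True, False], repeats=20)
    print("correct in last 30 branches:", sum(results[30:]))
```

Because each distinct history pattern selects its own counter, a short repeating pattern such as the one above becomes fully predictable after a brief warm-up.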
SUMMARY OF THE INVENTION
The present invention offers a method and apparatus for performing multiple branch predictions per fetch cycle. This allows a superscalar design with a large number of pipelines to avoid stalls when there are multiple branch instructions in a fetch bundle. A branch prediction table is configured to provide multiple predictions in parallel, and a branch handling unit is provided, which can process several branch instructions in parallel.
In a preferred embodiment according to the present invention, a microprocessor for handling multiple branch predictions per cycle includes an instruction cache for providing a current fetch bundle having a plurality of fetch instructions, a predecode array for determining which ones, if any, of the fetch instructions are branches, and a branch prediction table for predicting which ones, if any, of the branch instructions are taken. The microprocessor also includes branch logic (the term used in the figures to denote the branch handling unit) that combines the information from the predecode array with the predictions from the branch prediction table to identify the oldest taken branch, and a next fetch address table that provides the target address corresponding to the oldest taken branch.
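The selection step performed by the branch logic can be sketched in software as follows. This is only an illustration under stated assumptions, not the patented circuit: slot 0 is taken to be the oldest instruction in the bundle, instructions are assumed to be 4 bytes wide, and the next fetch address table is modeled as a simple per-slot list of targets. The function names and the table layout are hypothetical.

```python
from typing import List, Optional


def oldest_taken_branch(is_branch: List[bool],
                        predicted_taken: List[bool]) -> Optional[int]:
    """Return the slot of the oldest predicted-taken branch, or None.

    is_branch models the predecode array output and predicted_taken the
    branch prediction table output, one entry per slot in the fetch bundle.
    Slot 0 is assumed to be the oldest instruction in program order.
    """
    for slot, (branch, taken) in enumerate(zip(is_branch, predicted_taken)):
        if branch and taken:
            return slot
    return None


def next_fetch_address(bundle_address: int,
                       is_branch: List[bool],
                       predicted_taken: List[bool],
                       target_table: List[Optional[int]],
                       instr_bytes: int = 4) -> int:
    """Choose the address of the next fetch bundle.

    target_table is a stand-in for the next fetch address table, indexed
    here by slot (an assumption made for this sketch).
    """
    slot = oldest_taken_branch(is_branch, predicted_taken)
    if slot is None:
        # No taken branch: continue with the next sequential bundle.
        return bundle_address + len(is_branch) * instr_bytes
    target = target_table[slot]
    assert target is not None, "taken branch must have a target entry"
    return target


if __name__ == "__main__":
    is_branch = [False, True, False, True]      # slots 1 and 3 hold branches
    targets = [None, 0x2000, None, 0x3000]      # hypothetical branch targets
    # Both branches predicted taken: the older one (slot 1) wins.
    print(hex(next_fetch_address(0x1000, is_branch, [False, True, False, True], targets)))
    # Only the younger branch predicted taken: slot 3 supplies the target.
    print(hex(next_fetch_address(0x1000, is_branch, [False, False, False, True], targets)))
    # Nothing predicted taken: fall through to the next sequential bundle.
    print(hex(next_fetch_address(0x1000, is_branch, [False] * 4, targets)))
```

Only the oldest taken branch matters because every instruction after it in the bundle lies on a path that is predicted not to execute, so any younger branches in the same bundle are ignored for that fetch cycle.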
A better understanding of the nature and advantages of the present invention may be achieved by a perusal of the detailed description with reference to the drawings.
REFERENCES:
patent: 4679141 (1987-07-01), Pomerene et al.
patent: 5276882 (1994-01-01), Emma et al.
patent: 5394529 (1995-02-01), Brown, III et al.
patent: 5504870 (1996-04-01), Mori et al.
patent: 5560032 (1996-09-01), Nguyen et al.
patent: 5574871 (1996-11-01), Hoyt et al.
patent: 5604877 (1997-02-01), Hoyt et al.
patent: 5649178 (1997-07-01), Blaner et al.
patent: 5687338 (1997-11-01), Boggs et al.
patent: 5758112 (1998-05-01), Yeager et al.
patent: 5796998 (1998-08-01), Levitan et al.
patent: 5854761 (1998-12-01), Patel et al.
patent: 5857098 (1999-01-01), Talcott et al.
patent: 5870599 (1999-02-01), Hinton et al.
patent: 5875325 (1999-02-01), Talcott
Yeh et al., “A Comprehensive Instruction Fetch Mechanism for a Processor Supporting Speculative Execution,” Proceedings of the 25th Annual International Symposium on Microarchitecture, Micro 25, IEEE, pp. 129-139, Dec. 1-4, 1992.
Cherabuddi Rajasekhar
Panwar Ramesh K.
Patel Sanjay
Talcott Adam R.
Sun Microsystems Inc.
Townsend and Townsend and Crew LLP
Treat William M.