Processor with N adders for parallel target addresses...

Electrical computers and digital processing systems: processing – Processing control – Branching

Reexamination Certificate


Details

C712S235000, C712S239000

active

06219784

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the art of microprocessors and, more particularly, to circuits and methods within the microprocessor for generating target addresses.
2. Description of the Related Art
Superscalar processors attempt to achieve high performance by dispatching and executing multiple instructions per clock cycle, and by operating at the shortest possible clock cycle time. To the extent a given processor is successful at dispatching and/or executing multiple instructions per clock cycle, high performance may be realized. In order to increase the average number of instructions dispatched per clock cycle, processor designers have been designing superscalar processors that employ wider issue rates. A “wide issue” superscalar processor is capable of dispatching a larger number of instructions per clock cycle when compared to a “narrow issue” superscalar processor. During clock cycles in which the number of dispatchable instructions is greater than the narrow issue processor can handle, the wide issue processor may dispatch more instructions, thereby achieving a greater average number of instructions dispatched per clock cycle.
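The comparison between narrow issue and wide issue processors can be illustrated with a small calculation. The sketch below is only an illustration: the function name, the issue widths, and the per-cycle counts are hypothetical, and it assumes that instructions not dispatched in one cycle do not carry over to the next.

```python
def average_dispatch(dispatchable_per_cycle, issue_width):
    """Average instructions dispatched per cycle for a given issue width.

    Simplifying assumption (not from the source): instructions that cannot
    be dispatched in a cycle are ignored rather than carried forward.
    """
    dispatched = [min(count, issue_width) for count in dispatchable_per_cycle]
    return sum(dispatched) / len(dispatched)


# Hypothetical number of dispatchable instructions in six consecutive cycles.
ready = [1, 2, 6, 3, 5, 2]

narrow = average_dispatch(ready, issue_width=2)  # narrow issue processor
wide = average_dispatch(ready, issue_width=4)    # wide issue processor
print(narrow, wide)
```

The two averages differ only because of the cycles in which more than two instructions are dispatchable, which is the case the paragraph above describes.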
To support wide issue rates, the superscalar processor should be capable of fetching a large number of instructions per clock cycle (on the average). A processor capable of fetching a large number of instructions per clock cycle will be referred to herein as having a “high fetch bandwidth.” If the superscalar processor is unable to achieve a high fetch bandwidth, then the processor may be unable to utilize wide issue hardware if contained therein.
Several factors may impact the ability of a particular processor to achieve a high fetch bandwidth. For example, many code sequences have a high frequency of branch instructions, which may redirect the fetching of subsequent instructions within that code sequence to a target address specified by the branch instruction. Accordingly, the processor may identify the target address after fetching the branch instruction. The next instructions within the code sequence may be fetched using the branch target address. Processors attempt to minimize the impact of conditional branch instructions on the fetch bandwidth by employing highly accurate branch prediction mechanisms and by generating the subsequent fetch address (either target or sequential) as rapidly as possible.
As used herein, a branch instruction is an instruction that specifies, either directly or indirectly, the address of the next instruction to be fetched. The address may be the sequential address identifying the instruction immediately subsequent to the branch instruction within memory, or a target address identifying a different instruction stored elsewhere in memory. Unconditional branch instructions always select the target address, while conditional branch instructions select either the sequential address or the target address based upon a condition specified by the branch instruction. For example, the processor may include a set of condition codes which indicate the results of executing previous instructions, and the branch instruction may test one or more of the condition codes to determine if the branch selects the sequential address or the target address. A branch instruction is referred to as taken if the target address is selected via execution of the branch instruction, and not taken if the sequential address is selected. Similarly, if a conditional branch instruction is predicted via a branch prediction mechanism, the branch instruction is referred to as predicted taken if the target address is predicted to be selected upon execution of the branch instruction, and is referred to as predicted not taken if the sequential address is predicted to be selected upon execution of the branch instruction.
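The taken and not taken terminology defined above can be restated as a short sketch. The function and parameter names are hypothetical, and the condition is reduced to a single boolean standing in for whatever condition codes the branch tests.

```python
def next_fetch_address(sequential_address, target_address, conditional, condition_met):
    """Return the address selected by a branch instruction.

    An unconditional branch always selects the target address. A conditional
    branch selects the target address only when its condition holds (taken)
    and the sequential address otherwise (not taken).
    """
    taken = (not conditional) or condition_met
    return target_address if taken else sequential_address


# Hypothetical addresses, chosen only for illustration.
not_taken = next_fetch_address(0x104, 0x200, conditional=True, condition_met=False)
taken = next_fetch_address(0x104, 0x200, conditional=True, condition_met=True)
unconditional = next_fetch_address(0x104, 0x200, conditional=False, condition_met=False)
```

The same selection applies to a predicted branch if `condition_met` is read as the prediction rather than the executed outcome.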
Unfortunately, even if highly accurate branch prediction mechanisms are used to predict branch instructions, fetch bandwidth may still suffer. Typically, a run of instructions is fetched by the processor, and a first branch instruction within the run of instructions is detected. Fetched instructions subsequent to the first branch instruction are discarded if the branch instruction is predicted taken, and the target address is fetched. Accordingly, the number of instructions fetched during a clock cycle in which a branch instruction is fetched and predicted taken is limited to the number of instructions prior to and including the first branch instruction within the run of instructions being fetched. Since branch instructions are frequent in many code sequences, this limitation may be significant. Performance of the processor may be decreased if the limitation to the fetch bandwidth leads to a lack of instructions being available for dispatch.
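The fetch limitation described above can be made concrete with a minimal sketch. The representation of a run as a list of (is_branch, predicted_taken) pairs is hypothetical, and the sketch assumes that the entire run remains usable when the first branch is predicted not taken, which the text above does not state.

```python
def usable_instructions(run):
    """Count the instructions kept from a fetched run.

    `run` is a list of (is_branch, predicted_taken) pairs in fetch order.
    Only the first branch in the run is considered. If it is predicted
    taken, everything after it is discarded.
    """
    for index, (is_branch, predicted_taken) in enumerate(run):
        if is_branch:
            if predicted_taken:
                return index + 1  # instructions up to and including the branch
            break  # assumption: a not-taken first branch leaves the run intact
    return len(run)


# Hypothetical run of eight instructions whose third instruction is a branch.
taken_run = [(False, False), (False, False), (True, True)] + [(False, False)] * 5
not_taken_run = [(False, False), (False, False), (True, False)] + [(False, False)] * 5

kept_when_taken = usable_instructions(taken_run)
kept_when_not_taken = usable_instructions(not_taken_run)
```

In this example only three of the eight fetched instructions survive when the branch is predicted taken.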
SUMMARY OF THE INVENTION
The present invention provides a circuit and method for generating a pair of target addresses in parallel in response to detecting a pair of conditional branch instructions within a run of instructions. In one embodiment, the circuit of the present invention is embodied within a microprocessor having an instruction run storage device, a branch scanner circuit, and a target address generation circuit. The instruction run storage device stores a fetched run of instructions which includes a plurality of instruction bytes. The branch scanner circuit is coupled to the instruction run storage. The branch scanner circuit operates to identify a pair of conditional branch instructions (i.e., first and second conditional branch instructions) within the run of instructions. The target address generation circuit is coupled to the branch scanner circuit. The target address generation circuit generates a first target address and a second target address in parallel and in response to the branch scanner circuit identifying the first and second conditional branch instructions. The first and second target addresses correspond to first and second target instructions, respectively, which can be executed in parallel if the first and second conditional branch instructions are predicted as taken.
The target address generation circuit further includes a multi-bit signal generator, a first target address generator, and a second target address generator. The multi-bit signal generator operates to generate multi-bit signals. The first target address generator generates the first target address, and the second target address generator generates the second target address. The multi-bit signal generator is coupled to receive N bytes of the run of instructions stored within the instruction run storage device. The multi-bit signal generator generates each multi-bit signal as a function of one of the N instruction bytes.
The target address generation circuit also includes an instruction byte partial address generator. This instruction byte partial address generator is coupled to receive at least a portion of a fetch address corresponding to the run of instructions in the instruction run storage device. The instruction byte partial address generator operates to generate instruction byte partial addresses corresponding to the N bytes stored within the instruction run storage device. Each of the N instruction byte partial addresses is generated as a function of the fetch address portion.
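One way to picture the instruction byte partial address generator is sketched below. The text does not specify the function applied to the fetch address portion, so the sketch assumes that the portion is the low-order bits of the fetch address and that each byte's partial address is that portion plus the byte's offset within the run. The 8-bit width and all names are hypothetical.

```python
def byte_partial_addresses(fetch_address, n_bytes, partial_bits=8):
    """Generate one partial address per instruction byte in the run.

    Assumptions (not from the source): the fetch address portion is the
    low-order `partial_bits` bits, and byte i sits at that portion plus i,
    wrapping within `partial_bits` bits.
    """
    mask = (1 << partial_bits) - 1
    base = fetch_address & mask
    return [(base + offset) & mask for offset in range(n_bytes)]


# Hypothetical fetch address and a run of four bytes.
partials = byte_partial_addresses(0x12F8, 4)
wrapped = byte_partial_addresses(0xFE, 4)
```

Because each partial address depends only on the fetch address portion and a fixed offset, all N values can be produced without waiting for any branch to be identified.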
In one embodiment, the multi-bit signal generator further includes N adders, a first selection device, and a second selection device. The N adders generate N multi-bit signals by adding corresponding bytes of the run of instructions stored in the instruction run storage device and the instruction byte partial addresses. The first selection device has N inputs, an output, and a selection input. Each of the N inputs receives an output of one of the N adders. The selection input receives a first selection signal from the branch scanner circuit. The first selection device operates to select for output a first multi-bit signal provided by one of the N adders in response to receiving the first selection signal. The second selection device has N inputs, an output, and a selection input. Each of the N inputs receives
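The arrangement of N adders feeding two selection devices can be sketched in software as follows. This is an illustrative model, not the circuit itself: the byte values, partial addresses, and selection indices are hypothetical, bytes are treated as unsigned, and the sketch assumes the second selection device is driven by a second selection signal in the same way as the first.

```python
def adder_outputs(run_bytes, partial_addresses):
    """Model N adders: each adds one instruction byte to its partial address."""
    return [byte + partial for byte, partial in zip(run_bytes, partial_addresses)]


def select(inputs, selection_signal):
    """Model an N-input selection device: pass through one adder output."""
    return inputs[selection_signal]


# Hypothetical run of eight instruction bytes and their partial addresses.
run_bytes = [0x11, 0x05, 0x22, 0x33, 0x44, 0x10, 0x55, 0x66]
partials = [0x20 + offset for offset in range(8)]

# Every sum is computed regardless of where the branches turn out to be.
sums = adder_outputs(run_bytes, partials)

# Assumed selection signals: byte positions 1 and 5 hold the branch bytes.
first_signal = select(sums, 1)
second_signal = select(sums, 5)
```

The design choice worth noting is that all N additions proceed without waiting for the branch scanner, so the scanner's selection signals only have to steer two already-computed results to the outputs.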
