Electrical computers and digital processing systems: processing – Processing control – Branching
Reexamination Certificate
2000-02-22
2004-04-13
Chan, Eddie (Department: 2183)
Electrical computers and digital processing systems: processing
Processing control
Branching
C712S024000, C712S210000
Reexamination Certificate
active
06721875
ABSTRACT:
FIELD OF THE INVENTION
The invention pertains to branch instructions which are executed by a processor, and more specifically, to increasing the reach of IP-relative branch instructions which are executed by a processor (IP=instruction pointer, also known as a program counter, or PC).
BACKGROUND OF THE INVENTION
Insufficient branch reach (also known as “span” or “range”) is a common weakness of RISC (reduced instruction set computer) architectures. For example, see
FIG. 7
, which provides a list of the branch instructions used in various RISC architectures, and the range of each of these branch instructions.
As server/workstation applications move to 64-bit architectures, it is believed that problems associated with insufficient branch reach will need to be addressed more and more frequently.
One 64-bit architecture is the Intel®/Hewlett-Packard® IA-64 architecture. A discussion of this architecture can be found in Intel's “IA-64 Application Developer's Architecture Guide” (Rev 1.0, Order Number 245188-001, May 1999), which is hereby incorporated by reference for all that it discloses.
IA-64 architecture provides two general types of branch instructions: the IP-relative branch and the indirect branch. An IP-relative branch instruction carries with it a signed, 20-bit offset. The target of an IP-relative branch instruction is determined by adding the instruction's offset to the value of an instruction pointer. Since IA-64's instruction pointer is 64-bits long, an IP-relative branch instruction cannot redirect the instruction pointer to any address within a 64-bit address space. Rather, an IP-relative branch instruction can only redirect the instruction pointer to addresses within ±16 MB. While from a static code generation point of view, a ±16 MB reach seems sufficient to reach anywhere in a compiled and linked module, in the context of dynamic code generation, dynamic linking, or dynamic optimization, a branch instruction with greater reach is called for in order to reach all the different pieces of code.
An indirect branch instruction is advantageous over an IP-relative branch instruction in that an indirect branch instruction can redirect an instruction pointer to any address within a 64-bit address space. However, there are drawbacks to frequent and/or dynamic use of indirect branching.
One drawback is that the execution of an indirect branch instruction must be preceded by a somewhat lengthy setup routine in which a 64-bit target is fetched into a general register and then moved into one of eight branch registers (or calculated, looked up, etc., and then moved to a branch register). Only then may the indirect branch instruction be executed. The execution of an indirect branch therefore requires the execution of an instruction sequence such as: MOVL, MOV_to_br, and BR.
Due to the speculative nature in which high performance processors fetch instructions, it is desirable to move an indirect branch target into one of the eight branch registers early, so as to insure the availability of the target when branch prediction hardware needs to rely on the accuracy of the target for branch prediction. By insuring the availability of the target for branch prediction purposes, the fetch of instructions from an incorrect code section can be avoided, and as a result, the flush of incorrectly fetched instructions from an instruction pipeline, as well as the injection of bubbles (or gaps) into the pipeline, can be avoided. Unfortunately, it is sometimes difficult to move an indirect branch target into a branch register at an early date. This is because the IA-64 architecture provides only eight branch registers. If a target is moved into one of these branch registers too early, it is possible that it will be overwritten prior to when it is needed. At the same time, there is a risk that any move of a target into a branch register will result in the overwrite of some other needed target.
An indirect branch's lengthy setup routine also creates problems for tools such as dynamic instrumentation and optimization tools. As will be explained below, each of these tools needs to “patch” compiled and linked program code with branches to new and/or optimized “patch code”. Patch code is simply an address space which is used for the storage of, for example, dynamic instrumentation and/or optimization routines. Patching involves either the static or dynamic insertion of branches to patch code in already compiled and linked program code. Each time a patch is made, an instruction which is replaced by a branch to patch code needs to be written into the patch code so that it eventually gets executed.
The patching of IA-64 program code with indirect branch code sequences is especially difficult since 1) IA-64 program code comprises instructions which are encoded in bundles of three instructions each, and 2) an indirect branch code sequence does not fit within a single instruction bundle. Since the instructions of an indirect branch sequence do not fit within a single instruction bundle, the patch of an indirect branch sequence (e.g., MOVL, MOV-to-br, and BR) into already compiled and linked binary program code requires two or more instruction bundles to be overwritten. However, once program code has been compiled and linked, information on which instruction bundles in the program code are the possible targets of branch instructions is not always known. If a branch instruction branches to an instruction bundle which has been replaced with part of a patch sequence, or if a branch instruction branches to an instruction bundle which falls between the bundles of a patch sequence, an exception could result for two reasons. First, it is likely that the patch sequence would not execute correctly, and second, if the patch sequence does not execute correctly, instructions which were copied out of the original program code to make way for the patch sequence bundles will never be executed. Patching with indirect branch sequences is also problematic in that such a sequence requires the use of available general and branch registers. A patching tool is unlikely to know the locations of such registers (if any even exist). Patching with indirect branch sequences is therefore an unsafe practice.
Caveats relating to the patching of multiple instruction bundles are further discussed in the article “Fine-Grained Dynamic Instrumentation of Commodity Operating System Kernels” by A. Tamches and B. Miller (CS Dept. of the University of Wisconsin-Madison, July 1998).
Two alternatives to patching with indirect branches exist. Each alternative involves the patching of only a single instruction bundle with a replacement instruction bundle incorporating an IP-relative branch to a “hole” in binary program code. A hole in binary program code is nothing more than a number of consecutively addressed memory locations which reside within the bounds of memory locations which store the binary program code.
The difference between the two alternatives is that the first alternative uses larger holes to store “patch code” (or portions thereof, whereas the second alternative uses smaller holes (known as springboards) for the insertion of “trampoline code”. Trampoline code is merely code that helps enable one to branch outside of the confines of more or less contiguous binary program code when the range of an IP-relative branch is not great enough to perform such a branch.
Since compiled and linked program code typically comprises few and randomly placed holes, single instruction bundle patching typically requires that holes be inserted into an executable at predetermined locations by a compiler/linker. Since the compiler/linker will not know which portions of an executable will be patched by dynamic instrumentation and/or optimization tools, the compiler/linker needs to insure that inserted holes can be reached from any instruction bundle in program code. Since an IA-64 IP-relative branch has a ±16 MB reach, holes must appear after approximately every 32 MB of IA-64 program code. Thus, a branch r
McCormick, Jr. James E
Soltis Jr. Donald Charles
Undy Stephen R.
Chan Eddie
Hewlett--Packard Development Company, L.P.
Huisman David J.
LandOfFree
Method and apparatus for implementing a single-syllable... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and apparatus for implementing a single-syllable..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for implementing a single-syllable... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3244090