Reexamination Certificate
2000-03-14
2004-05-18
Zhen, Wei (Department: 2122)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
active
06738967
ABSTRACT:
TECHNICAL FIELD
The present invention relates to electronic data processing, and more particularly concerns the compilation of source programs into object programs for execution on processors having multiple different architectures.
BACKGROUND
Fully optimizing a source program during traditional compilation for a single targeted computer architecture is difficult. Optimizing a program to run on many processors having vastly differing architectures is, at best, only approximated in the current art. Additionally, optimization techniques that obtain peak performance on one specific architecture can frequently degrade performance on other processors, potentially to such a degree that no optimization at all would be better. Today, a single source-code program is compiled to multiple versions of machine-specific executable code. These “executables” contain machine-specific code that has been uniquely optimized for each different processor the compiler can target. Each version of these “executables” requires separate handling, packaging, tracking, distribution, and upgrading. Alternatively, conventional interpreted programs such as Java run a single version of a universally executable code on different host processors using virtual machines. The virtual machines translate the universal executable into target-specific machine code. These interpretive virtual machines see only one “byte-code” instruction at a time, whereas most optimization techniques are global, taking into consideration a context including many or all of the instructions before and/or after the current instruction. The traditional alternative to these interpretive engines requires distributing a program directly in source-code format for compilation and optimization by users. This has serious disadvantages and costs.
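The contrast between per-instruction interpretation and global optimization can be illustrated with a minimal sketch. The three-opcode stack machine below is a hypothetical instruction set invented for illustration; it is not Java byte code and is not taken from the patent. Because the loop handles one instruction at a time, it has no way to notice that the product x * y is computed twice.

```python
# Toy stack-machine interpreter. The opcodes LOAD, MUL, and ADD are
# hypothetical and exist only for this illustration.
def run(program, env):
    """Execute instructions strictly one at a time.

    Returns (result, number_of_multiplies_performed).
    """
    stack = []
    multiplies = 0
    for op, arg in program:
        if op == "LOAD":
            stack.append(env[arg])
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            multiplies += 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop(), multiplies


# Encodes (x * y) + (x * y). The repeated product is visible only to a
# pass that examines the whole instruction sequence, not to the loop above.
PROGRAM = [
    ("LOAD", "x"), ("LOAD", "y"), ("MUL", None),
    ("LOAD", "x"), ("LOAD", "y"), ("MUL", None),
    ("ADD", None),
]
```

Running `run(PROGRAM, {"x": 3, "y": 4})` performs both multiplications, whereas a global optimizer could compute the product once and reuse it.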
Almost all computer programs begin life as source code written in a high-level language such as C++ or BASIC. A compiler then converts the source code into object code comprising low-level operations that can be directly executed in a specific processor engine such as the Intel Corp. Pentium® or the Digital Equipment Corp. Alpha microprocessors.
High-level languages allow programmers to write code without bothering with the details of how the high-level operations will be executed in a specific processor or platform. This feature also makes them largely independent of the architectural details of any particular processor. Such independence is advantageous in that a programmer can write one program for many different processors, without having to learn how any of them actually work. The single source program is merely run through a separate compiler for each processor. Because the compiler knows the architecture of the processor it was designed for, it can optimize the object code of the program for that specific architecture. Strategically placing intermediate calculations in registers (if they are available in the target processor) is much faster than storing and reloading them from memory. Different processors can vary greatly in the number of registers available. Some processors permit out-of-order instruction execution and provide different address modes or none at all. Register renaming is permitted in some processors, but not in others. Parallel instruction execution can also differ greatly between different processors. Unrolling loops and pulling loop-invariant loads and calculations out of loops are also known optimization techniques that cannot be employed profitably without knowledge of the available resources (i.e., registers) on each specific target processor. Also, some processors include facilities for instruction-level parallelism (ILP) that must be explicitly scheduled by the compiler, whereas other processors exploit ILP via greedy out-of-order hardware techniques. The most common approach is to compile the same source program separately for many different processors, optimizing a different version for each specific target processor in order to best exploit the available registers and available ILP.
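Pulling a loop-invariant calculation out of a loop, as mentioned above, can be sketched at the source level as follows. The function names are hypothetical, and Python is used only to show the shape of the transformation; a compiler would apply it to lower-level code, where the hoisted value must occupy a register (or a memory slot) for the entire loop.

```python
def scale_naive(values, a, b):
    """Recomputes the invariant expression a * b + 1 on every iteration."""
    out = []
    for v in values:
        factor = a * b + 1  # does not depend on v, yet is recomputed each time
        out.append(v * factor)
    return out


def scale_hoisted(values, a, b):
    """Same result, with the invariant expression computed once before the loop."""
    factor = a * b + 1  # hoisted: stays live across the whole loop
    out = []
    for v in values:
        out.append(v * factor)
    return out
```

Both functions return the same list. The hoisted form trades repeated arithmetic for one value that stays live throughout the loop, which is why its profitability depends on the registers available on the target.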
This allows highly machine-specific optimization, but it incurs an implicit overhead at great cost to the developer and software manufacturer. A program developer must produce, distribute, maintain, and upgrade multiple versions of the program. New or improved processors require the development and distribution of additional versions. This versioning problem is a significant cost in terms of bug fixes, upgrades, and documentation. The versioning “tax” continues for the life of the product and can actually cause a product to fail in the marketplace due to a lack of profitability as a result of these hidden costs.
The versioning problem could be avoided altogether by distributing the original processor-independent source code itself. This presents many problems. Different users might customize the source code in unusual and unanticipated ways. Distributing source code requires users to purchase a compiler and learn how to use it. Compilers of the type employed for major programs are large, expensive, and hard to use. Direct interpretation of high-level source code (e.g., BASIC and APL) is also possible, but is slow and wasteful of processor resources. Attempts in the 1970s to design processor architectures for direct hardware execution of high-level languages met with little success. Moreover, any public distribution of source code reveals information that enables unauthorized modification and other undesired uses of the code. For a number of reasons, the great majority of personal-computer application programs are fully compiled and shipped only in the form of machine-specific object code.
Some languages are designed to be parsed to an intermediate language (IL) which contains machine instructions for an abstract virtual machine that is distributed to all users regardless of the architecture of the target processor they own. Programs in the Java language, for example, are distributed in a tokenized intermediate language called “byte codes.” Each user's computer has an instantiation of the virtual machine that translates each byte code of an IL program individually to a sequence of target-specific hardware instructions in real time as the developer's program is executed. Because different virtual machines are available for different specific processors, a single intermediate-level program can be distributed for many different processor architectures. Current implementations of the idea of one universal instruction set that can run on any target processor or microprocessor have major performance problems. Since the IL is the same for every processor that it runs on, it is too generic to allow the target-specific translators to fully optimize the IL, and thus they fail to create machine instructions that are even close to the quality of code produced by a conventional compiler that generates machine-specific code for each specific target processor. Additionally, optimizations attempted for one target processor often exact a penalty on a different processor so severe that it may nullify the benefit of performing any optimization at all. For example, consider a calculation whose result is used several times in a program. Doing the calculation once and saving the result to a register greatly speeds the performance of a processor that has enough physical registers to hold that result until it is needed again. A processor having fewer registers may accept the register assignment, but store (i.e., spill) the register contents to memory when the translated machine instructions of the program try to use more registers than are physically implemented.
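To make the register example concrete, the trade-off can be expressed as a toy cost model. Everything here is an assumption for illustration: the function names are hypothetical, and the cycle counts passed in are invented inputs rather than measurements of any real processor.

```python
def strategy_costs(compute, store, load, reuses):
    """Total cost, in hypothetical cycles, of the two options left when
    no register is free to hold a value that is reused `reuses` times."""
    return {
        # compute once, store to memory, reload at every later use
        "spill_and_reload": compute + store + reuses * load,
        # discard the result and redo the calculation at every later use
        "recompute": compute * (1 + reuses),
    }


def choose(compute, store, load, reuses, has_free_register):
    """Pick the cheapest way to make a reused value available."""
    if has_free_register:
        # Compute once and reuse directly from the register.
        return "keep_in_register"
    costs = strategy_costs(compute, store, load, reuses)
    return min(costs, key=costs.get)
```

With a cheap calculation and slow memory, for example `choose(1, 3, 3, 2, False)`, the model prefers recomputing the value; with an expensive calculation, for example `choose(20, 3, 3, 2, False)`, spilling and reloading wins. The same optimization decision therefore comes out differently depending on the target's resources.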
Spilling the contents of registers to memory and then reloading them where the values are required for reuse can be slower than actually recalculating the value. On some architectures, no optimization at all (merely throwing away the result and redoing the calculation later) results in code that runs significantly faster than an inappropriate attempt at optimization. This same strategy would
Microsoft Corporation
Woodcock & Washburn LLP
Zhen Wei