Data processing: software development – installation – and managem – Software program development tool – Translation of code
Reexamination Certificate
1998-03-10
2001-09-04
Powell, Mark R. (Department: 2122)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
C717S152000
Reexamination Certificate
active
06286133
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to a method and apparatus for compiling from a source language to two or more target languages, where the capabilities of each target language are significantly different. Target languages might be SQL (Structured Query Language) for data accesses, and C, for general computation.
BACKGROUND OF THE INVENTION
The normal processing steps within a compiler are: use parsing and lexical analysis to create a ‘parse tree’ from the ‘source language’; (optionally) re-organise that parse tree to reduce unnecessary computation (‘optimisation’); and a final, single pass of the parse tree to generate code in the ‘target’ language(s).
A problem with this approach is that, once code generation is started, each parse tree node is most conveniently translated into a single standard program structure in the target language(s). If only a single program structure or pattern is available, then that pattern must be very general to function correctly for every possible invocation. Such a general form typically performs slowly. This problem is exacerbated if each node can only be translated into one of several target languages.
If this approach is inadequate, then non-local information (that is, information from nodes elsewhere in the parse tree) must be available to make generally complex decisions as to which is the best pattern to use. This may involve information from parse tree nodes remote from the parse tree currently being processed, and complex, time-consuming and error-prone decision-making processing on a node-by-node basis.
This decision-making process becomes particularly difficult when code is emitted in two or more target languages, where code corresponding to some nodes can be emitted only in one language (for example, a data access and query language, such as SQL), for some other nodes, code can only emitted in another language (for example, a general purpose programming language, such as C), and for yet other nodes, either language can be used.
Further difficulties arise when one or more of the target languages have several different ways of expressing very similar processing, where one approach is fast, but can only be used in limited circumstances, while other approaches are more general, but typically execute more slowly. For example, in SQL, the choice of whether to use a cursor or inline data access code is particularly difficult to make.
What follows is a definition of terms that will be used in the following description of the present invention and specific examples thereof.
Code generation (‘emitting code’): the process of traversing a parse tree (q.v.) and generating a program in one or more lower-level target languages (q.v.).
Implementation Language: the programming language(s) in which a compiler itself is written. The implementation language need not be the same as either the source language (q.v.) or target language(s) (q.v.).
Parse tree: a graph, usually a ‘tree’, or a ‘directed acyclic graph’, of parse tree nodes (q.v.) which represent the semantics and behaviour of a computer program in the source language (q.v.), or a portion thereof.
Parse tree node (‘node’): part of a parse tree (q.v.), representing a single operation or function in the source language (q.v.).
Parsing: the process of converting a textual representation of a programming language (source language) into a parse tree (q.v.).
Source Language (‘source code’): The input language to a compiler. Usually, the source language for a compiler is a high-level language, such as a business modelling language.
Target Language (‘target code’): The output language from a compiler. Usually, the target language(s) of a compiler are a low-level language, such as assembly code, C or SQL. Often, the target language(s) may be further processed; for example, by another compiler or assembler.
SUMMARY OF THE INVENTION
In accordance with the present invention, there is now provided a method for compiling a source program written in a source language into a compiled program comprising a plurality of different target languages, the method comprising: generating a parse tree from the source program, the parse tree comprising a plurality nodes each representative of an operation in the source program; determining, for each successive node of the parse tree, the target language into which the node can be compiled based on the operation represented by the node; generating, based on the target language determination, a list of start points each corresponding to a different node of the parse tree and each indicating the target language into which the node is to be compiled; and, generating successive sections of the compiled program each based on each node of the parse tree between successive start points in the list and each being generated in the target language specified by the first of the successive start points.
Viewing the present invention from another aspect, there is now provided apparatus for compiling a source program written in a source language into a compiled program comprising a plurality of different target languages, the compiler comprising: means for generating a parse tree from the source program, the parse tree comprising a plurality nodes each representative of an operation in the source program; means for determining, for each successive node of the parse tree, the target language into which the node can be compiled based on the operation represented by the node; means for generating, based on the target language determination, a list of start points each corresponding to a different node of the parse tree and each indicating the target language into which the node is to be compiled; and, means for generating successive sections of the compiled program each based on each node of the parse tree between successive start points in the list and each being generated in the target language specified by the first of the successive start points. It will be appreciated that the present invention extends to a computer system including such compiling apparatus.
In preferred embodiments of the present invention, code generation is divided into two parts:
(1) Placing Start Points
Before code generation starts, an initial pass of the entire parse tree is made. During this pass, strategic decisions about the correct and most efficient form of the target language(s) are made. These strategic decisions are recorded in an ordered list of start points, each of which reference a particular parse tree node.
The initial strategic decisions are based on the kind of the current parse tree node, as well as previous decisions made, and tests made by look-ahead down the parse tree. Note, however, that this method reduces the need to perform decision-making look-ahead during code generation.
Different start points are each of a particular kind. Each kind of start point indicates the general form (pattern) of the target code to be emitted, and which target language is to be used. Start points also refer to their previous (outer) start point, so that previous decisions can be changed or reversed. For example, the kind of a start point can be changed subsequently, if it is later discovered that a code pattern represented by a particular start point would be incorrect or inefficient. Further start points can be added, or start points can be removed or re-ordered.
(2) Processing Start Points
The list of start points is used in order, each representing the ‘starting point’ for code generation. Code generation therefore starts from the parse tree node referenced by each start point, and progresses down the parse tree until another start point is encountered. Start points therefore also function as ‘stop points’.
The kind of start point indicates the pattern of the emitted code, and which of several target programming languages should be used. For example, the kind of a start point might indicate that a function definition (in C) should be emitted, or that a database access is needed (using inline SQL), or that an SQL cursor should be processed (using both C and SQL).
Some code gener
August Casey P.
International Business Machines - Corporation
Nguyen-Ba Hoang-Vu Antony
Powell Mark R.
Scully Scott Murphy & Presser
LandOfFree
Method and apparatus for strategic compilation of source... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and apparatus for strategic compilation of source..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for strategic compilation of source... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2549125