Method and apparatus for data flow analysis

Data processing: software development, installation, and management – Software program development tool – Translation of code

Reexamination Certificate

Details

C717S152000

active

06226789

ABSTRACT:

BACKGROUND OF THE INVENTION
This invention relates generally to binary translation and compilation of source code for use in computer systems, and more particularly to data flow analysis in translation and compilation systems.
As it is known in the art, computer systems, which generally include a central processing unit (CPU), a main memory, and an input-output device interconnected by a system bus, are used to execute computer programs that perform useful tasks. One type of computer program is an operating system, which is used to interface the CPU to an application program. The application program is used by a user of the computer system to perform the useful task.
The operating system includes the software resources needed by the computer system to interface hardware elements to the computer system as well as to interface the application program and other programs to the computer system.
Application programs typically include programs, such as word processors, which execute on the computer system under the control of the operating system. One technique commonly used to produce a binary image includes compiling source code written in a programming language to produce object code. The object code is typically linked with a linker to produce the binary image. Another technique is to convert an executable image of an application program built for a computer system of a first computer architecture into an executable image for a different computer architecture. Generally, without some conversion, binary images made for one particular architecture and operating system cannot execute on a different architecture and/or operating system.
A compiler generally includes a front end, which converts the source code into an intermediate representation, and a back end, which, for example, optimizes the intermediate representation to permit the production of efficient object code. Data flow analysis is typically performed as part of these optimizations. In a compiler, a routine is generally used as the basic unit upon which data flow analysis is performed.
Optimizers included in existing binary translators that perform data flow analysis typically perform such analysis using a basic block, rather than a routine, as the unit of analysis. A basic block is generally defined as one or more intermediate language statements with no embedded control flow, e.g., no intervening entry or exit point. A routine generally comprises many basic blocks. An optimizer in a binary translator generally uses the basic block as the unit of optimization processing due to the lack of program structure in the binary image.
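The partitioning of an instruction stream into basic blocks can be illustrated with a short sketch. This is not the patent's method; the instruction representation and names below are hypothetical, and the standard "leader" rule is used: a leader is the first instruction, any branch target, and any instruction following a branch.

```python
# Illustrative sketch (not the patent's method): partition a linear list
# of instructions into basic blocks. Each instruction is a tuple whose
# first element is its opcode; "jmp" and "br" stand in for branches.
def basic_blocks(instrs, branch_targets):
    # Leaders: first instruction, branch targets, and the instruction
    # immediately following any branch.
    leaders = {0} | set(branch_targets)
    for i, ins in enumerate(instrs):
        if ins[0] in ("jmp", "br") and i + 1 < len(instrs):
            leaders.add(i + 1)
    # Each basic block runs from one leader up to (not including) the next.
    starts = sorted(leaders)
    blocks = []
    for j, s in enumerate(starts):
        end = starts[j + 1] if j + 1 < len(starts) else len(instrs)
        blocks.append(instrs[s:end])
    return blocks
```

With this rule, no block contains an intervening entry or exit point, matching the definition above.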
Generally, data flow analysis examines a flow graph representation of a computer program, determining the state of the program, for example, the value of a program variable, at different points in an optimizable unit of the program. The possible execution control flow paths that comprise the unit are examined, and information is typically gathered about what can be true at various points along paths of the flow graph. For example, the values of a program variable are determined for each possible execution path in the computer program. If data flow analysis indicates that a program variable is never used or read in a computer program, that variable and its associated storage can typically be eliminated.
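The dead-variable case mentioned above can be sketched in a few lines. This is a deliberately simplified illustration, not the patent's analysis: it ignores control flow entirely and simply reports variables that are defined somewhere but read nowhere; the instruction format is assumed.

```python
# Illustrative sketch: find variables that are defined but never read,
# so their storage could be eliminated. Each instruction is assumed to
# be (opcode, destination_var, [source_vars]).
def dead_variables(blocks):
    defined, used = set(), set()
    for block in blocks:
        for op, dst, srcs in block:
            used.update(srcs)   # every source operand is a read
            defined.add(dst)    # every destination is a definition
    return defined - used       # defined but never read anywhere
```

A full analysis would additionally consider paths through the flow graph, since a variable may be dead on one path and live on another.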
One problem encountered in performing data flow analysis in compilers is how to represent the results of the analysis in a manner which affords efficient use of computer system resources, such as an organization of the analysis information that minimizes storage requirements while the organization also affords efficient retrieval of the analysis information.
A problem also encountered in performing data flow analysis in binary translation, as with compilers as previously discussed, is how to represent the results of the analysis in a manner which affords efficient use of computer system resources.
Techniques currently used by compilers to store and retrieve data flow analysis information are generally inefficient and do not scale readily to larger programs without introducing undesirable restrictions and drawbacks.
One common mechanism used to represent data flow analysis information in an optimizing compiler includes using a first technique for representing global data flow information and a second, different technique for representing local data flow information.
One way in which compilers typically represent global data flow analysis information is a “dense” representation, such as bit vectors that summarize the results of data flow analysis. Generally, multiple bit vectors are associated with each basic block to represent global information about the basic block with respect to other basic blocks. A position in each bit vector is reserved for each data item. For example, one such representation associates two (2) bit vectors of global data flow analysis information with each basic block. A first bit vector (GEN) has a bit corresponding to each data item that is known to at least one other basic block. A bit is set within GEN if the corresponding data item is defined within the basic block. A second bit vector (USE) also contains a bit corresponding to each data item known to at least one other basic block. A bit is set within USE if there is an upwardly exposed use of the corresponding data item within the basic block; that is, the bit is set if a use of the data item within the basic block can be reached by a definition outside of the basic block.
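The GEN/USE computation for a single basic block can be sketched as follows. This is an illustration of the conventional scheme described above, not the patent's mechanism; the instruction format and `var_index` mapping are assumptions.

```python
# Sketch of the dense GEN/USE representation: each instruction is assumed
# to be (defined_var, [used_vars]), and var_index maps each tracked data
# item to its reserved bit position.
def gen_use(block, var_index):
    gen = [0] * len(var_index)
    use = [0] * len(var_index)
    defined_here = set()
    for dst, srcs in block:
        for s in srcs:
            # An upwardly exposed use: a read not preceded by a
            # definition earlier in the same block.
            if s not in defined_here:
                use[var_index[s]] = 1
        defined_here.add(dst)
        gen[var_index[dst]] = 1  # the block defines this data item
    return gen, use
```

Note that a read of a variable defined earlier in the same block does not set its USE bit, since that use is satisfied locally.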
This dense representation of data flow analysis information tends to be computationally inefficient when gathering information needed to perform optimizations, and expensive in terms of execution time and storage. A large fixed amount of storage is required. In reference to the previous example of representing global information using two (2) bit vectors per basic block, if there are “N” data items and “M” basic blocks, the fixed storage cost for only the global information is the product (“N”*“M”*2). In computational complexity terms, the storage cost is of order O(n²). This technique is generally not scalable since, as routines and programs become larger, the fixed amount of storage tends to increase as the square of the program size.
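The quadratic growth claimed above is easy to check arithmetically; the function below is purely illustrative of the N*M*2 cost formula in the text.

```python
# Fixed bit storage for the two-vector (GEN/USE) dense scheme:
# N data items x M basic blocks x 2 vectors, one bit per position.
def fixed_storage_bits(n_items, m_blocks):
    return n_items * m_blocks * 2

# If program size doubles, both N and M roughly double, so the fixed
# storage roughly quadruples -- the O(n^2) behavior described above.
```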
As an improvement to reduce the amount of fixed storage needed, some implementations limit the data definitions that are tracked locally and/or globally. For example, an implementation may perform global optimizations upon only certain types of variables. This decreases the amount of storage required but does not provide direct or unified connectivity with one information data structure. Additionally, this improvement adds the restriction of limiting the number and types of data definitions upon which optimizations can be performed.
Another drawback of the bit vector representation is that a bit vector is not an information “rich” data structure. Bit vectors typically represent derived or summary information about a basic block. Not much information can be recorded and retrieved from one bit of storage per data item. For example, in performing an optimization using the global data flow information, it is necessary to locate the precise instruction or statement within the basic block that is defining the data item. Bit vectors do not capture this detailed information. Therefore, another mechanism is needed to provide the more specific, detailed information usually required about global data flow.
Typically, one of two techniques is used to obtain the additional detailed information associated with global data flow analysis. One technique uses a subordinating data structure to represent the detailed information, for example, such as a pointer to the actual instruction that performs a data item definition and a data item reference. A second technique used is to locate the actual instruction(s) needed at a particular instance during optimization by performing a sequential search, for example, of instructions within a basic block.
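The second technique, a sequential search within a basic block, can be sketched briefly. This is illustrative only; the instruction format is assumed, and the point is the linear cost per query, paid each time a defining instruction must be located.

```python
# Illustrative sketch of the sequential-search technique: scan a basic
# block's instructions to find the one defining a given data item.
# Each instruction is assumed to be (defined_var, [used_vars]).
def find_definition(block, item):
    for instr in block:
        dst, srcs = instr
        if dst == item:
            return instr  # first instruction defining the item
    return None           # item not defined in this block
```

Because the scan is repeated for each query, this approach trades the storage cost of a subordinate data structure for extra execution time during optimization.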
In addition to the large amount
