Data processing: software development – installation – and managem – Software program development tool – Translation of code
Reexamination Certificate
1997-09-16
2001-07-31
Powell, Mark (Department: 2122)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
C717S152000, C717S152000
Reexamination Certificate
active
06269477
ABSTRACT:
TECHNICAL FIELD
This invention relates to a method and system for optimizing a computer program image and, more particularly, to a method and system for rearranging code portions of the program image to reduce the working set.
BACKGROUND OF THE INVENTION
Many conventional computer systems utilize virtual memory. Virtual memory provides a logical address space that is typically larger than the corresponding physical address space of the computer system. One of the primary benefits of using virtual memory is that it facilitates the execution of a program without the need for all of the program to be resident in main memory during execution. Rather, certain portions of the program may reside in secondary memory for part of the execution of the program. A common technique for implementing virtual memory is paging; a less popular technique is segmentation. Because most conventional computer systems utilize paging instead of segmentation, the following discussion refers to a paging system, but these techniques can be applied to segmentation systems or systems employing paging and segmentation as well.
When paging is used, the logical address space is divided into a number of fixed-size blocks, known as pages. The physical address space is divided into like-sized blocks, known as page frames. A paging mechanism maps the pages from the logical address space, for example, secondary memory, into the page frames of the physical address space, for example, main memory. When the computer system attempts to reference an address on a page that is not present in main memory, a page fault occurs. After a page fault occurs, the operating system copies the page into main memory from secondary memory and then restarts the instruction that caused the fault.
One paging model that is commonly used to evaluate the performance of paging is the working set model. At any instance in time, t, there exists a working set, w(k, t), consisting of all the pages used by the k most recent memory references. The operating system monitors the working set of each process and allocates each process enough page frames to contain the process' working set. If the working set is larger than the number of allocated page frames, the system will be prone to thrashing. Thrashing refers to very high paging activity in which pages are regularly being swapped from secondary memory into the pages frames allocated to a process. This behavior has a very high time and computational overhead. It is therefore desirable to reduce the size of (i.e., the number of pages in) a program's working set to lessen the likelihood of thrashing and significantly improve system performance.
A programmer typically writes source code without any concern for how the code will be divided into pages when it is executed. Similarly, a compiler program translates the source code into relocatable machine instructions and stores the instructions as object code in the order in which the compiler encounters the instructions in the source code. The object code therefore reflects the lack of concern for the placement order by the programmer. A linker program then merges related object code together to produce executable code. Again, the linker program has no knowledge or concern for the working set of the resultant executable code. The linker program merely orders the instructions within the executable code in the order in which the instructions are encountered in the object code. The computer program and linker program do not have the information required to make a placement of code within an executable module to reduce the working set. The information required can in general only be obtained by actually executing the executable module and observing its usage. Clearly this cannot be done before the executable module has been created. The executable module initially created by the compiler and linker thus is laid out without regard to any usage pattern.
As each portion of code is executed, the page in which it resides must be in physical memory. Other code portions residing on the same page will also be in memory, even if they may not be executed in temporal proximity. The result is a collection of pages in memory with some required code portions and some unrequired code portions. To the extent that unrequired code portions are loaded into memory, valuable memory space may be wasted, and the total number of pages loaded into memory may be much larger than necessary.
To make a determination as to which code portions are “required” and which code portions are “unrequired,” a developer needs execution information for each code portion, such as when the code portion is accessed during execution of the computer program. A common method for gathering such execution information includes adding instrumentation code to every basic block of a program image. A basic block is a portion of code such that if one instruction of the basic block is executed then every instruction is also executed. The execution of the computer program is divided into a series of time intervals (e.g., 500 milliseconds). Each time a basic block is executed during execution of the computer program, the instrumentation code causes a flag to be set for that basic block for the current time interval. Thus, after execution of the computer program, each basic block will have a temporal usage vector (“usage vector”) associated with it. The usage vector for a basic block has, for each time interval, a bit that indicates whether that basic block was executed during that time interval. The usage vectors therefore reflect the temporal usage pattern of the basic blocks.
After the temporal usage patterns have been measured, a paging optimizer can rearrange the basic blocks to minimize the working set. In particular, basic blocks with similar temporal usage patterns can be stored on the same page. Thus, when a page is loaded into main memory, it contains basic blocks that are likely to be required.
The minimization of the working set is an NP-complete problem, that is, no polynomial-time algorithm is known for solving the problem. Thus, the time needed to minimize the working set of a program image generally increases exponentially as the number of code portions increase (i.e., O(e
n
), where n is the number of code portions). Because complex program images can have thousands, and even hundreds of thousands, of code portions, such an algorithm cannot generate a minimum working set in a timely manner even when the most powerful computers are employed. Because the use of such algorithms are impractical for all but the smallest program images, various algorithms are needed to generate a layout that results in an improved working set (albeit not necessarily the minimal working set) in a timely manner.
SUMMARY OF THE INVENTION
The present invention provides a computer-based layout system for establishing a program image layout of a computer program. The computer program comprises code portions that each have a temporal usage pattern indicating time intervals during which the code portion is required to be loaded into memory for access. The layout system repeatedly establishes the order of pairs of code portions until the order of all the code portions is established.
The layout system first selects an ordered pair of code portions which is estimated to have the smallest effect on the performance of the computer program taking into considerations all other code portions whose ordering has already been established relative to the selected ordered pair of code portions when the code portions of the selected ordered pair are positioned both in that order and contiguously in the program image. The layout system then establishes that the selected ordered pair of code portions are to be positioned both in the order and contiguously in the program image. The layout system repeats this selecting and establishing until the order of all the code portions is established.
In another aspect, the present invention provides techniques for estimating the effect that the ordering of two code portions would have on the p
Douceur John R.
Fitzgerald Robert P.
Miller John W.
Das Chameli Chaudhuri
Merchant & Gould P.C.
Microsoft Corporation
Powell Mark
LandOfFree
Method and system for improving the layout of a program... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and system for improving the layout of a program..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and system for improving the layout of a program... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2556886