Method for reducing the frequency of cache misses in a computer

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C711S170000, C711S165000, C711S133000, C711S125000, C711S134000, C707S793000

Reexamination Certificate

active

06301641

ABSTRACT:

BACKGROUND OF THE INVENTION
The invention relates to a method of selecting memory locations for placing instructions as described in the preamble of claim
1
.
SUMMARY OF THE INVENTION
A machine like a computer typically contains a main memory, a cache memory and a processor. During execution of a program the computer loads instructions from main memory for execution by the processor and copies these instructions into the cache memory. The cache memory contains cache lines, each of which may hold several instructions at a time. The memory contains locations, and the location at which an instruction is stored determines into which cache line that instruction will be copied. When the instruction is copied into the cache line a previous content of that cache line is no longer available from the cache memory. When the processor needs to execute an instruction, the computer will attempt to load that instruction from the cache memory. If the instruction is not available in the cache memory, a “cache miss” is said to occur and execution is delayed because the computer has to load the instruction from the main memory before the instruction can be executed.
A cache miss may occur when, after the time the instruction was last copied into a particular cache line, another instruction has been copied into that particular cache line. Whether this occurs depends on the main memory locations of the other instructions that have been executed since the time that the instruction was last copied into the cache line. If these locations are such that these instructions have to be copied into the particular cache line, a cache miss may occur. The number of cache misses can be minimized by a proper selection of these main memory locations so that copying into the particular cache line is not needed too often.
An article by Hiroyuki Tomiyama et al. titled “Optimal Code Placement of Embedded Software for Instruction Caches”, and published in the Proceedings of the European Design and Test Conference in Paris, Mar. 11 to 14, 1996 pages 96 to 101 describes a method of selecting locations for placing instructions in the main memory so as to minimize the number of cache misses.
According to this method a sample of execution of the program is obtained. The sample indicates which instructions the processor successively executes when the machine receives a given typical data input. Using the sample a linear function is derived which calculates the number of cache misses for the sample of execution as a function of the locations where the instructions of the program are placed in main memory. By means of an undescribed local search method a minimum of this linear function is found, which corresponds to optimal locations for placing the instructions in main memory.
This known method has the disadvantage that it is very time consuming. Even for relatively small programs a computer needs an excessive amount of execution time to find an optimum. The known method reduces this amount of execution time by grouping the instructions into blocks of instructions that are always executed as a whole, and by calculating the number of cache misses at the level of blocks instead of individual instructions. But even with this improvement a computer still needs hours of execution time to find an optimum for relatively small programs.
Additionally, the known method makes very inefficient use of the locations in main memory, because it divides the program into blocks of instructions and introduces unused locations between the blocks to enforce that each cache line will contain instructions from only one block.
Amongst others, it is an object of the invention to provide for an improved method of selecting locations for placing instructions in the main memory of a machine containing a cache memory.
The method according to the invention is characterized by the characterizing part of claim
1
. Potential selections are screened to select a potential selection which generally reduces the number of cache misses for the sample of execution of the program when the instructions are placed according to the potential selection instead of according to the original selection. This process is repeated, each time with the successor selection replacing the original selection.
In this method a computation of the number of cache misses is performed for potential successor selections. When a computer executes the method the computation of the number of cache misses is time-consuming, especially when it is performed for many potential selections. A score is used as a heuristic for deciding whether to compute the number of cache misses of potential selections from selections that differ from the original selection only by the movement of selected blocks. Once such a heuristically selected potential selection passes the screening, the computation of the number of cache misses for the further potential selections is omitted. This keeps the required time within practical bounds so that the method can be performed for large programs.
The program will be loaded into main memory according to the optimal selection. This main memory may be for example a conventional DRAM or a ROM or the like. Such a ROM can be used in a machine with a fixed, efficient program. The program can also be stored on a machine readable medium like a magnetic disk, in combination with information about the optimal selection for use in loading the program into the main memory according to the optimal selection. The machine containing the main memory will execute the program by fetching the instructions from main memory.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The method according to the invention has an embodiment wherein the respective score for each block is a count of executions of that block during the sample of execution, only executions being counted which both cause a cache miss and are separated from a directly preceding execution of that block by execution of other blocks of which an aggregate size is less than a size of the cache memory. Under this condition it is certain that the cache misses that are counted are all conflict misses, i.e. cache misses that can be avoided by assigning different locations to instructions. It is likely that the number of cache misses occurring during the sample of execution can be reduced more by movement of a block for which this count is higher. Therefore use of this count makes the search for an optimal selection efficient.
In another embodiment of the method according to the invention the set of at least one selected block is selected so that the respective scores indicate that each of the at least one selected blocks has at least as high a number of cache misses as any other block, the further blocks comprising all blocks for which the respective scores indicate that they have a lower number of cache misses than the at least one selected block or blocks. Movement of a block which causes the most cache misses is most likely to be a success for reducing the number of cache misses occurring during the sample of execution.
The method according to the invention has an embodiment wherein, for said testing whether any one of the set of potential selection produces a smaller number of cache misses, self-conflicting potential selections, in which any location selected for placement of an instruction of any one of the set of at least one selected block may cause a cache conflict with any location selected for placement of instructions of the same one of the set of at least one selected blocks in the original selection, are excluded from the set of potential selections. Thus, only those potential selections are considered in which the instructions in the selected block are placed at locations the contents of which will be loaded into different cache lines than in the original selection. This ensures that those cache misses will be eliminated that caused the score to indicate the highest number of cache misses. Thus, computation of the number of cache misses for potential selections that do not eliminate all the original cache misses caused by the

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method for reducing the frequency of cache misses in a computer does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method for reducing the frequency of cache misses in a computer, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for reducing the frequency of cache misses in a computer will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2582146

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.