Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2002-06-18
2004-10-26
Nguyen, T (Department: 2187)
C711S141000, C711S144000, C711S213000, C712S233000, C712S237000, C712S239000
active
06810466
ABSTRACT:
FIELD OF THE INVENTION
This invention relates in general to the field of prefetch instructions in microprocessors, and more particularly to a microprocessor that selectively performs prefetch instructions depending upon the current level of processor bus activity.
BACKGROUND OF THE INVENTION
Most modern computer systems include a microprocessor that performs the computation necessary to execute software programs. The computer system also includes other devices connected to the microprocessor such as memory. The memory stores the software program instructions to be executed by the microprocessor. The memory also stores data that the program instructions manipulate to achieve the desired function of the program.
The devices in the computer system that are external to the microprocessor, such as the memory, are directly or indirectly connected to the microprocessor by a processor bus. The processor bus is a collection of signals that enable the microprocessor to transfer data in relatively large chunks, such as 64 or 128 bits, at a time. When the microprocessor executes program instructions that perform computations on the data stored in the memory, the microprocessor must fetch the data from memory into the microprocessor using the processor bus. Similarly, the microprocessor writes results of the computations back to the memory using the processor bus.
The time required to fetch data from memory or to write data to memory is typically between ten and one hundred times greater than the time required by the microprocessor to perform the computation on the data. Consequently, the microprocessor sits idle while it waits for the data to be fetched from memory.
To minimize this problem, modern microprocessors include a cache memory. The cache memory, or cache, is a memory internal to the microprocessor—typically much smaller than the system memory—that stores a subset of the data in the system memory. When the microprocessor executes an instruction that references data, the microprocessor first checks to see if the data is present in the cache and is valid. If so, the instruction can be executed immediately since the data is already present in the cache. That is, the microprocessor does not have to wait while the data is fetched from the memory into the cache using the processor bus. The condition where the microprocessor detects that the data is present in the cache and valid is commonly referred to as a cache hit.
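The hit check described above can be sketched in C as follows. This is only an illustrative model, not a description of any particular microprocessor: the direct-mapped organization, the 32-byte line size, the 256-line capacity, and the names cache_hit and cache_fill are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 32u   /* assumed cache line size in bytes */
#define NUM_LINES  256u  /* assumed number of lines (8 KB cache) */

/* One cache entry: a valid bit plus the tag identifying which line it holds. */
struct cache_entry {
    bool     valid;
    uint32_t tag;
};

/* Hypothetical direct-mapped cache: each line maps to exactly one entry. */
struct cache {
    struct cache_entry entries[NUM_LINES];
};

/* Returns true on a cache hit: the entry is valid and its tag matches. */
bool cache_hit(const struct cache *c, uint32_t addr)
{
    assert(c != NULL);
    uint32_t line  = addr / LINE_BYTES;   /* which cache line the address is in */
    uint32_t index = line % NUM_LINES;    /* which entry that line maps to */
    uint32_t tag   = line / NUM_LINES;    /* distinguishes lines sharing an entry */
    return c->entries[index].valid && c->entries[index].tag == tag;
}

/* Records that the line containing addr has been brought into the cache. */
void cache_fill(struct cache *c, uint32_t addr)
{
    assert(c != NULL);
    uint32_t line  = addr / LINE_BYTES;
    uint32_t index = line % NUM_LINES;
    c->entries[index].valid = true;
    c->entries[index].tag   = line / NUM_LINES;
}
```

In this model, an address misses until cache_fill has been called for its line. After that, every address within the same 32-byte line hits, while an address in a neighboring line still misses.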
Many cache hits occur because software programs commonly operate on a relatively small set of data for a period of time, then operate on another relatively small data set for another period of time, and so forth. This phenomenon is commonly referred to as the locality of reference principle. If the program exhibits behavior that substantially conforms to the principle of locality of reference and the cache size is larger than the data set size during a given period of time, the likelihood of cache hits is high during that period.
However, some software programs do not exhibit behavior that substantially conforms to the principle of locality of reference and/or the data sets they operate upon are larger than the cache size. These programs may require manipulation of a large, linear data set present in a memory external to the microprocessor, such as a video frame buffer or system memory. Examples of such programs are multimedia-related audio or video programs that process video data or audio wave file data. Typically, the cache hit rate is low for such programs.
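The effect of data set size on hit rate can be illustrated with a small simulation. The sketch below assumes a direct-mapped cache of 256 lines of 32 bytes each and a program that sweeps a linear data set several times, touching each cache line once per pass. The function name sweep_hit_rate and all sizes are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 32u   /* assumed cache line size in bytes */
#define NUM_LINES  256u  /* assumed number of lines (8 KB cache) */

/* Sweeps a linear data set `passes` times, touching each cache line once
 * per pass, and returns hits / accesses for a direct-mapped cache model. */
double sweep_hit_rate(size_t data_bytes, unsigned passes)
{
    assert(passes > 0);
    bool     valid[NUM_LINES] = { false };
    uint32_t tags[NUM_LINES]  = { 0 };
    unsigned long hits = 0, accesses = 0;

    for (unsigned p = 0; p < passes; p++) {
        for (size_t addr = 0; addr < data_bytes; addr += LINE_BYTES) {
            uint32_t line  = (uint32_t)(addr / LINE_BYTES);
            uint32_t index = line % NUM_LINES;
            uint32_t tag   = line / NUM_LINES;
            if (valid[index] && tags[index] == tag) {
                hits++;                  /* line is still resident */
            } else {
                valid[index] = true;     /* miss: the line replaces the old one */
                tags[index]  = tag;
            }
            accesses++;
        }
    }
    return accesses ? (double)hits / (double)accesses : 0.0;
}
```

Under these assumptions, sweep_hit_rate(4096, 4) returns 0.75, because a 4 KB data set fits in the cache and only the first pass misses. In contrast, sweep_hit_rate(16384, 4) returns 0.0, because a 16 KB data set evicts every line before it is reused.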
To address this problem, some modern microprocessors include a prefetch instruction in their instruction sets. The prefetch instruction instructs the microprocessor to fetch a cache line specified by the prefetch instruction into the cache. A cache line is the smallest unit of data that can be transferred between the cache and other memories in the system, and a common cache line size is 32 or 64 bytes. The software programmer places prefetch instructions at strategic locations in the program to prefetch the needed data into the cache. Consequently, the probability is increased that the data is already in the cache when the microprocessor is ready to execute the instructions that perform computations with the data.
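As an illustration of how a programmer might place prefetches in a loop over a large linear data set, the sketch below uses the __builtin_prefetch builtin provided by the GCC and Clang compilers, which asks the compiler to emit a prefetch instruction where the target supports one. The function name sum_with_prefetch and the prefetch distance of 16 elements are assumptions for this example. A suitable distance depends on the machine and should be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_AHEAD 16  /* assumed distance in elements; tune per machine */

/* Sums an array while hinting that data a few iterations ahead will be read. */
int64_t sum_with_prefetch(const int32_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        /* Arguments: address, 0 = prefetch for read, 3 = high temporal locality.
         * The bounds check keeps the address expression inside the array. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
#endif
        sum += data[i];
    }
    return sum;
}
```

The prefetch does not change the result of the computation. It only affects when the data arrives in the cache, so the function returns the same sum whether or not the hint is honored.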
In some microprocessors, the cache is actually made up of multiple caches. The multiple caches are arranged in a hierarchy of multiple levels. For example, a microprocessor may have two caches, referred to as a first-level (L1) cache and a second-level (L2) cache. The L1 cache is closer to the computation elements of the microprocessor than the L2 cache. That is, the L1 cache is capable of providing data to the computation elements faster than the L2 cache. The L2 cache is commonly larger than the L1 cache, although not necessarily.
One effect of a multi-level cache arrangement upon a prefetch instruction is that the cache line specified by the prefetch instruction may hit in the L2 cache but not in the L1 cache. In this situation, the microprocessor can transfer the cache line from the L2 cache to the L1 cache instead of fetching the line from memory using the processor bus since the transfer from the L2 to the L1 is much faster than fetching the cache line over the processor bus. That is, the L1 cache allocates a cache line, i.e., a storage location for a cache line, and the L2 cache provides the cache line to the L1 cache for storage therein. The pseudo-code below illustrates a conventional method for executing a prefetch instruction in a microprocessor with a two-level internal cache hierarchy. In the code, a no-op denotes “no operation” and means that the microprocessor takes no action on the prefetch instruction and simply retires the instruction without fetching the specified cache line.
if (line hits in L1)
    no-op; /* do nothing */
else if (line hits in L2)
    supply requested line from L2 to L1;
else
    fetch line from processor bus to L1;
Microprocessors include a bus interface unit (BIU) that interfaces the processor bus with the rest of the microprocessor. When functional blocks within the microprocessor want to perform a transaction on the processor bus, they issue a request to the BIU to perform the bus transaction. For example, a functional block within the microprocessor may issue a request to the BIU to perform a transaction on the processor bus to fetch a cache line from memory. It is common for multiple bus transaction requests to be pending, or queued up, in the BIU. This is particularly true in modern microprocessors because they execute multiple instructions in parallel through different stages of a pipeline, in a manner similar to an automobile assembly line.
A consequence of the fact that multiple requests may be queued up in the BIU is that a request in the queue must wait for all the other requests in front of it to complete before the BIU can perform that request. Consequently, if a bus transaction request is submitted to the BIU for a prefetch of a cache line, the possibility exists that the prefetch request may cause a subsequent request associated with a more important non-prefetch instruction to wait longer to be performed on the bus than it would otherwise have had to, thereby possibly degrading overall performance.
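The queuing delay described above can be made concrete with a simple model. The sketch below assumes a strict first-in, first-out queue in which each transaction occupies the bus for a fixed number of cycles. The type and function names, and the cycle counts used in the note that follows, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum req_kind { REQ_DEMAND, REQ_PREFETCH };

struct bus_request {
    enum req_kind kind;    /* demand (non-prefetch) or prefetch request */
    unsigned      cycles;  /* assumed bus cycles the transaction occupies */
};

/* Cycles the request at position `pos` waits before it starts, given that
 * every request ahead of it in the FIFO queue must complete first. */
unsigned fifo_wait_cycles(const struct bus_request *queue, size_t pos)
{
    assert(queue != NULL || pos == 0);
    unsigned wait = 0;
    for (size_t i = 0; i < pos; i++)
        wait += queue[i].cycles;
    return wait;
}
```

For example, with a queue holding a demand request, a prefetch request, and a second demand request, each of 8 cycles, the second demand request waits 16 cycles. Had the prefetch not been queued, it would have waited 8 cycles.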
Commonly, a prefetch instruction is by definition a hint to prefetch the cache line rather than an absolute command to do so. That is, the microprocessor may choose to no-op the prefetch instruction in certain circumstances. However, conventional microprocessors do not consider the likelihood that performing a prefetch that generates additional processor bus activity will degrade performance. Therefore, what is needed is a microprocessor that selectively performs prefetch instructions based on this consideration.
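Because a prefetch is only a hint, one conceivable policy is to drop it when the bus interface is already busy. The sketch below is a hypothetical illustration of that idea and is not presented as the method of the invention. The use of the number of pending bus requests as the measure of bus activity, the threshold parameter, and all names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

enum prefetch_action {
    PREFETCH_NOOP,           /* retire the instruction without fetching */
    PREFETCH_FILL_FROM_L2,   /* L2 supplies the line to L1; no bus traffic */
    PREFETCH_FETCH_FROM_BUS  /* fetch the line over the processor bus into L1 */
};

/* Hypothetical policy: a prefetch that would need the processor bus is
 * dropped when the number of pending bus requests reaches the threshold. */
enum prefetch_action selective_prefetch(bool hits_l1, bool hits_l2,
                                        unsigned pending_bus_requests,
                                        unsigned busy_threshold)
{
    assert(busy_threshold > 0);  /* a zero threshold would drop every bus prefetch */
    if (hits_l1)
        return PREFETCH_NOOP;
    if (hits_l2)
        return PREFETCH_FILL_FROM_L2;
    if (pending_bus_requests >= busy_threshold)
        return PREFETCH_NOOP;    /* bus judged busy: treat the hint as droppable */
    return PREFETCH_FETCH_FROM_BUS;
}
```

With a threshold of 4, a prefetch that misses both caches is fetched over the bus when 2 requests are pending and is treated as a no-op when 4 are pending. Hits in the L1 or L2 cache are handled as in the conventional method regardless of bus activity.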
SUMMARY
The present invention provides a microprocessor and method that compares a current level of bus activity with a predetermined threshold value as a prediction of future bus activity and selectively performs prefetch instructions based on the prediction. Accordingly, in attainment of the aforementioned object, it is a feature of the present
Henry G. Glenn
Hooker Rodney E.
Davis E. Alan
Huffman James W.
IP-First LLC