Electrical computers and digital processing systems: memory – Address formation – Generating prefetch, look-ahead, jump, or predictive address
Reexamination Certificate
2000-08-01
2004-08-31
Anderson, Matthew D. (Department: 2186)
Electrical computers and digital processing systems: memory
Address formation
Generating prefetch, look-ahead, jump, or predictive address
C711S137000, C712S207000
active
06785796
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to software control of memory access by a processing unit and, more particularly, to software controlled prefetching that effectively hides inherent memory access latency.
2. Background Art
Computer systems typically access data and program information from memory by exploiting the principles of temporal and spatial locality. Spatial locality, or locality in space, is the likelihood that, once a given entry is referenced, nearby entries will be referenced in the near future. Temporal locality, or locality in time, is the likelihood that, once an entry is referenced, it will be referenced again in the near future. To take advantage of these principles of locality, computer systems typically employ a hierarchical memory structure. This structure includes cache memory that is relatively small, fast, and local to the processor, in addition to the larger, but slower, main memory. Some systems include two or more levels of cache memory. The L1 cache, or first level of cache memory, is usually integrated within the central processing unit (CPU) chip itself. The L2 cache, or second level of cache memory, may be located on the CPU chip or on a separate integrated circuit chip, for example. Thus, to benefit from locality, it is desirable to have the sought data in the cache, preferably the L1 on-chip cache, by the time the CPU requests the entry.
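The two kinds of locality can be illustrated with a minimal C sketch. The matrix size and the function names below are hypothetical and chosen only for illustration; they do not come from the patent. Both functions compute the same sum, but the row-major walk touches adjacent addresses while the column-major walk strides across rows.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 256, COLS = 256 };  /* hypothetical matrix size */

/* Row-major walk: consecutive accesses touch adjacent addresses, so every
   element of a fetched cache line is used (spatial locality). The running
   total is reused on every iteration (temporal locality). */
long sum_row_major(int m[ROWS][COLS])
{
    assert(m != NULL);
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += m[r][c];
    return total;
}

/* Column-major walk: consecutive accesses are COLS * sizeof(int) bytes
   apart, so neighbouring elements of a cache line are not used right away. */
long sum_col_major(int m[ROWS][COLS])
{
    assert(m != NULL);
    long total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += m[r][c];
    return total;
}
```

On typical cached hardware the row-major version tends to run faster for large matrices, although the actual difference depends on cache size, line size, and any hardware prefetching.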
When a memory access is requested, the system first checks the L1 on-chip cache, then the L2 cache (if present), and then the main memory. The cache levels are typically implemented in static random access memory (SRAM), while the main memory is typically implemented in dynamic random access memory (DRAM). The DRAM cost per byte is substantially lower than the SRAM cost per byte and, as such, DRAM is the preferred choice for larger main memory systems. However, the DRAM access time is much longer than the cache memory access time. This results from the physical nature of the basic storage element, which is a capacitor, as well as from the memory chip density and the overall main memory density. Given these constraints, a system that arranges for the sought data to be in the local cache memory by the time the CPU requires it is capable of higher performance than a system that makes no such explicit arrangement.
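The cost of this lookup order can be estimated with the standard average-access-time model, sketched below in C. The struct, the function name, and any latencies or miss rates passed in are assumptions for illustration; the patent text gives no figures.

```c
#include <assert.h>

/* One cache level: cycles for a hit, and the fraction of lookups that miss.
   Any values supplied by the caller are hypothetical. */
struct mem_level {
    double hit_cycles;
    double miss_rate;
};

/* Expected cycles per access when L1 is checked first, then L2, then DRAM.
   A miss at one level adds the cost of consulting the next level. */
double average_access_cycles(struct mem_level l1, struct mem_level l2,
                             double dram_cycles)
{
    assert(l1.miss_rate >= 0.0 && l1.miss_rate <= 1.0);
    assert(l2.miss_rate >= 0.0 && l2.miss_rate <= 1.0);
    double l2_cost = l2.hit_cycles + l2.miss_rate * dram_cycles;
    return l1.hit_cycles + l1.miss_rate * l2_cost;
}
```

With assumed values of a 1-cycle L1 hit, a 25% L1 miss rate, an 8-cycle L2 hit, a 50% L2 miss rate, and a 200-cycle DRAM access, the function returns 28 cycles. If every access hits in L1, it returns 1 cycle, which is the effect that moving data into the cache ahead of time aims for.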
SUMMARY OF THE INVENTION
A method and apparatus alter code to effectively hide main memory latency using software prefetching with non-faulting loads. Data is prefetched from main memory into local cache memory at some point before the CPU requests it during code execution. The CPU then retrieves the requested data from local cache instead of incurring the main memory latency directly. The non-faulting loads make it safer and more flexible to issue the prefetch earlier because they remove the risk of a segmentation fault, particularly when dealing with linked data structures. Accordingly, the memory access latency that the CPU sees is essentially the cache memory access latency. Since this latency is much less than the memory latency resulting from a cache miss, the overall system performance is improved.
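The general idea can be sketched for a linked list in C. This is not the patented method: the sketch uses `__builtin_prefetch`, a GCC and Clang builtin that does not fault on an invalid address, as a stand-in for the non-faulting loads described above. The `node` type and the `sum_list` function are hypothetical.

```c
#include <stddef.h>

/* Hypothetical singly linked list node. */
struct node {
    long value;
    struct node *next;
};

/* Sum a list while requesting the next node before it is needed. */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *n = head; n != NULL; n = n->next) {
        /* On the last node n->next is NULL. The prefetch hint does not
           fault on that address, so no NULL check is needed before it.
           Arguments: address, 0 = read access, 3 = keep in all caches. */
        __builtin_prefetch(n->next, 0, 3);
        total += n->value;
    }
    return total;
}
```

Because the hint cannot fault, it can be issued unconditionally and as early as the address is known, which is the flexibility the summary attributes to non-faulting loads. Prefetching only one node ahead may not hide much latency when the work per node is small; how far ahead to prefetch is a tuning choice this sketch does not address.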
REFERENCES:
patent: 5778233 (1998-07-01), Besaw et al.
patent: 5822788 (1998-10-01), Kahn et al.
patent: 5948095 (1999-09-01), Arora et al.
patent: 6119218 (2000-09-01), Arora et al.
patent: 6253306 (2001-06-01), Ben-Meir et al.
patent: 0 729 103 (1996-08-01), None
International Preliminary Examination Report, PCT/US01/41511, filed Aug. 1, 2001.
“Compiler-Based Prefetching for Recursive Data Structures”, Chi-Keung Luk and Todd C. Mowry, Dept. of Computer Science 1996 ACM pp. 222-233.
“An Effective Programmable Prefetch Engine for On-Chip Caches”, Tien-Fu Chen, Dept. of Computer Science Proceedings of Micro-28 1995 IEEE pp. 237-242.
“Effective Jump-Pointer Prefetching for Linked Data Structures”, Amir Roth and Gurindar S. Sohi Computer Sciences Dept. 1999 IEEE pp. 111-121.
“Dependence Based Prefetching for Linked Data Structures”, Amir Roth, Andreas Moshovos and Gurindar S. Sohi Computer Sciences Dept. 1998 ACM pp. 115-126.
Damron Peter
Kosche Nicolai
Anderson Matthew D.
Ritchie David B.
Sun Microsystems Inc.
Thelen Reid & Priest LLP
Yeung Adrienne