Method and apparatus for identifying candidate virtual...

Patent number: 6675280
Type: Reexamination Certificate (active)
Filed: 2001-11-30
Issued: 2004-01-06
Examiner: Donald Sparks (Department: 2187)
Classification: Electrical computers and digital processing systems: memory; Address formation; Address mapping
U.S. Classes: 711/213; 711/217
FIELD OF THE INVENTION
The invention relates generally to processors and, more particularly, to a method and apparatus for content-aware prefetching.
BACKGROUND OF THE INVENTION
A conventional processor typically operates at a much faster speed than the main memory to which the processor is coupled. To overcome the inherent latency of main memory, which usually comprises dynamic random access memory (DRAM), a memory hierarchy is employed. The memory hierarchy includes one or more levels of cache, each cache comprising a relatively fast memory device or circuitry configured to hold data recently accessed—or expected to be accessed—by the processor. The purpose of the cache is to ensure that most data needed by the processor is readily available without accessing the main memory, as accessing main memory is very slow in comparison to the speed of the processor or the speed at which the processor can access a cache.
Typically, a memory hierarchy comprises multiple levels of cache, wherein each level is faster than the next lower level and the level closest to the processor exhibits the highest speed and performance. A cache may be located on the processor itself—i.e., an “on-chip” cache—or a cache may comprise an external memory device—i.e., an “off-chip” cache. For example, a processor may include a high-level on-chip cache—often referred to as an “L1” cache—and be coupled with a lower-level off-chip cache—often referred to as an “L2” cache. Alternatively, a processor may include an on-chip L1 cache as well as an on-chip L2 cache. Of course, a memory hierarchy may include any suitable number of caches, each located on-chip or off-chip.
As noted above, each level of cache may hold data recently accessed by the processor, such recently accessed data being highly likely—due to the principles of temporal and spatial locality—to be needed by the processor again in the near future. However, system performance may be further enhanced—and memory latency reduced—by anticipating the needs of a processor. If data needed by a processor in the near future can be predicted with some degree of accuracy, this data can be fetched in advance—or “prefetched”—such that the data is cached and readily available to the processor. Generally, some type of algorithm is utilized to anticipate the needs of a processor, and the value of any prefetching scheme is dependent upon the degree to which these needs can be accurately predicted.
One conventional type of prefetcher is commonly known as a “stride” prefetcher. A stride prefetcher anticipates the needs of a processor by examining the addresses of data requested by the processor—i.e., a “demand load”—to determine if the requested addresses exhibit a regular pattern. If the processor (or an application executing thereon) is stepping through memory using a constant offset from address to address—i.e., a constant stride—the stride prefetcher attempts to recognize this constant stride and prefetch data according to this recognizable pattern. Stride prefetchers do, however, exhibit a significant drawback. A stride prefetcher does not function well when the address pattern of a series of demand loads is irregular—i.e., there is not a constant stride—such as may occur during dynamic memory allocation.
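For illustration, the following minimal sketch in C models one way a stride detector of this general kind might operate. The names (stride_state, observe_demand_load, issue_prefetch), the two-confirmation training threshold, and the prefetch distance of a single stride are assumptions made for the example, not details of any particular prefetcher:

/* Minimal stride-detection sketch. Assumptions: a single load stream,
 * 64-bit addresses, and a prefetch issued one stride ahead once the
 * same stride has been observed twice in a row. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t last_addr;  /* address of the previous demand load      */
    int64_t  stride;     /* most recently observed address delta     */
    int      confidence; /* consecutive confirmations of that stride */
} stride_state;

static void issue_prefetch(uint64_t addr)
{
    /* Stand-in for a real prefetch request to the memory hierarchy. */
    printf("prefetch 0x%llx\n", (unsigned long long)addr);
}

/* Called on every demand load the processor issues. */
static void observe_demand_load(stride_state *s, uint64_t addr)
{
    int64_t delta = (int64_t)(addr - s->last_addr);
    if (delta != 0 && delta == s->stride) {
        s->confidence++;
    } else {
        s->stride = delta;   /* pattern broken: re-train on new delta */
        s->confidence = 0;
    }
    s->last_addr = addr;
    if (s->confidence >= 2)  /* constant stride recognized */
        issue_prefetch(addr + (uint64_t)s->stride);
}

int main(void)
{
    stride_state s = {0};
    /* Regular pattern: a constant stride of 64 bytes is soon detected. */
    for (uint64_t a = 0x1000; a < 0x1400; a += 64)
        observe_demand_load(&s, a);
    return 0;
}

An irregular sequence of addresses, such as one produced by dynamic memory allocation, keeps resetting the confidence counter in this sketch, which is precisely the drawback noted above.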
Another method of data prefetching utilizes a translation look-aside buffer (TLB), which is a cache for virtual-to-physical address translations. According to this method, the “fill contents”—i.e., the requested data—associated with a demand load are examined and, if an address-sized data value matches an address contained in the TLB, the data value likely corresponds to a “pointer load”—i.e., a demand load in which the requested data is an address pointing to a memory location—and is, therefore, deemed to be a candidate address. A prefetch request may then be issued for the candidate address. Because the contents of the requested data—as opposed to addresses thereof—are being examined, this method may be referred to as content-based, or content-aware, prefetching. Such a content-aware prefetching scheme that references the TLB (or, more generally, that references any external source or index of addresses) has a significant limitation: likely addresses are limited to those cached in the TLB, and this constraint significantly reduces the number of prefetch opportunities. Also, this content-aware prefetching scheme requires a large number of accesses to the TLB; thus, additional ports must be added to the TLB to handle the content prefetcher overhead.
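As a rough illustration of the matching step described above, the sketch below scans the fill contents of a demand load for address-sized values whose virtual page is present in a toy TLB. The 32-entry table of page numbers, the names (tlb_contains, scan_fill, issue_prefetch), and the 4 KB page size are assumptions for illustration; the sketch reflects the prior-art scheme as summarized here, not the mechanism claimed by this patent:

/* Content-aware candidate identification sketch. An address-sized
 * value in the fill contents that hits in the TLB is deemed a likely
 * pointer, and a prefetch is issued for it. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SHIFT  12  /* assumed 4 KB pages        */
#define TLB_ENTRIES 32  /* assumed toy TLB capacity  */

static uint64_t tlb_vpn[TLB_ENTRIES]; /* cached virtual page numbers */
static size_t   tlb_used;

static int tlb_contains(uint64_t addr)
{
    uint64_t vpn = addr >> PAGE_SHIFT;
    for (size_t i = 0; i < tlb_used; i++)
        if (tlb_vpn[i] == vpn)
            return 1;
    return 0;
}

static void issue_prefetch(uint64_t addr)
{
    printf("prefetch candidate 0x%llx\n", (unsigned long long)addr);
}

/* Examine each address-sized word of a fill; values matching a TLB
 * entry are treated as candidate addresses. */
static void scan_fill(const uint64_t *line, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        if (tlb_contains(line[i]))
            issue_prefetch(line[i]);
}

int main(void)
{
    tlb_vpn[tlb_used++] = 0x7f000; /* pretend this page's translation is cached */
    uint64_t line[8] = { 42, 0x7f000123, 0, 0, 7, 0x10, 0, 0x7f000ff8 };
    scan_fill(line, 8); /* flags the two pointer-like values */
    return 0;
}

Note that the heuristic only ever fires for pages already cached in the TLB, which is the limitation the text identifies: candidate addresses outside the TLB are never discovered, and every scanned word costs a TLB lookup.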
REFERENCES:
patent: 4980823 (1990-12-01), Liu
patent: 5317718 (1994-05-01), Jouppi
patent: 5357618 (1994-10-01), Mirza et al.
patent: 5423014 (1995-06-01), Hinton et al.
patent: 5500948 (1996-03-01), Hinton et al.
patent: 5664147 (1997-09-01), Mayfield
patent: 5666505 (1997-09-01), Bailey
patent: 5694568 (1997-12-01), Harrison, III et al.
patent: 5701448 (1997-12-01), White
patent: 5724422 (1998-03-01), Shang et al.
patent: 5740399 (1998-04-01), Mayfield et al.
patent: 5752037 (1998-05-01), Gornish et al.
patent: 5758119 (1998-05-01), Mayfield et al.
patent: 5764946 (1998-06-01), Tran et al.
patent: 5765214 (1998-06-01), Sywyk
patent: 5778423 (1998-07-01), Sites et al.
patent: 5991848 (1999-11-01), Koh
patent: 6012135 (2000-01-01), Leedom et al.
patent: 6055622 (2000-04-01), Spillinger
patent: 6076151 (2000-06-01), Meier
patent: 6079005 (2000-06-01), Witt et al.
patent: 6081479 (2000-06-01), Ji et al.
patent: 6085291 (2000-07-01), Hicks et al.
patent: 6092186 (2000-07-01), Betker et al.
patent: 6098154 (2000-08-01), Lopez-Aguado et al.
patent: 6119221 (2000-09-01), Zaiki et al.
patent: 6131145 (2000-10-01), Matsubara et al.
patent: 6138212 (2000-10-01), Chiacchia et al.
patent: 6161166 (2000-12-01), Doing et al.
patent: 6212603 (2001-04-01), McInerney et al.
patent: 6275918 (2001-08-01), Burkey et al.
patent: 6292871 (2001-09-01), Fuente
patent: 6295594 (2001-09-01), Meier
Hans-Juergen Boehm, “Hardware and Operating System Support for Conservative Garbage Collection”, Xerox PARC, Palo Alto, CA, 1991 IEEE, pp. 61-67.
Mark J. Charney, et al., “Generalized Correlation-Based Hardware Prefetching”, School of Electrical Engineering, Cornell University, Ithaca, NY, Technical Report No. EE-CEG-95-1, Feb. 13, 1995, pp. 1-45.
Tien-Fu Chen, et al., “Reducing Memory Latency Via Non-Blocking and Prefetching Caches”, Department of Computer Science and Engineering, University of Washington, Seattle, WA, 1992, pp. 51-61.
Doug Joseph, et al., “Prefetching Using Markov Predictors”, IBM T.J. Watson Research Lab, Yorktown Heights, NY, 1997, pp. 252-263.
Norman P. Jouppi, “Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers”, Digital Equipment Corporation Western Research Lab, Palo Alto, CA, 1990 IEEE, pp. 364-373.
Mikko H. Lipasti, et al., “SPAID: Software Prefetching in Pointer- and Call-Intensive Environments”, IBM Corporation, Rochester, MN, 1995 IEEE, pp. 231-236.
Chi-Keung Luk, et al., “Compiler-Based Prefetching for Recursive Data Structures”, Department of Computer Science, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada, 1996, pp. 222-233.
Todd C. Mowry, et al., “Design and Evaluation of a Compiler Algorithm for Prefetching”, Computer Systems Laboratory, Stanford University, CA, 1992, pp. 62-73.
Toshihiro Ozawa, et al., “Cache Miss Heuristics and Preloading Techniques for General-Purpose Programs”, Fujitsu Laboratories Ltd, Kawasaki, Japan, 1995 IEEE, pp. 243-248.
Subbarao Palacharla, et al., “Evaluating Stream Buffers as a Secondary Cache Replacement”, Computer Sciences Department, University of Wisconsin-Madison, WI, 1994 IEEE, pp. 24-33.
Amir Roth, et al., “Dependence Based Prefetching for Linked Data Structures”, Computer Sciences Department, University of Wisconsin, Madison, WI, 1998, pp. 115-126.
Chia-Lin Yang, et al., “Push vs. Pull: Data Movement for Linked Data Structures”, Department of Computer Science, Duke University, Durham, NC, 2000, pp. 176-1
Inventors: Robert N. Cooksey; Stephan J. Jourdan
Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman LLP
Examiners: Christian P. Chace; Donald Sparks