Programmable agent and method for managing prefetch queues

Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
U.S. classes: C711S122000, C712S207000
Type: Reexamination Certificate
Status: active
Patent number: 06470427
Filed: 1999-11-09
Issued: 2002-10-22
Examiner: Yoo, Do Hyun (Department: 2187)
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to computer systems, and more specifically to an agent and method for managing queued prefetch requests between two levels of a memory hierarchy in a computer system. In particular, the present invention makes more efficient use of a cache hierarchy by providing a separate agent to perform prefetch requests.
2. Description of Related Art
The basic structure of a conventional computer system includes one or more processing units connected to various input/output devices for the user interface (such as a display monitor, keyboard, and graphical pointing device), a permanent memory device (such as a hard disk or a floppy diskette) for storing the computer's operating system and user programs, and a temporary memory device (such as random access memory or RAM) that is used by the processor(s) in carrying out program instructions. Computer processor architectures have evolved from the now widely accepted reduced instruction set computing (RISC) configurations to so-called superscalar computer architectures, wherein multiple and concurrently operable execution units within the processor are integrated through a plurality of registers and control mechanisms.
The objective of superscalar architecture is to employ parallelism to maximize or substantially increase the number of program instructions (or “micro-operations”) simultaneously processed by the multiple execution units during each interval of time (processor cycle), while ensuring that the order of instruction execution as defined by the programmer is reflected in the output. For example, the control mechanism must manage dependencies among the data being concurrently processed by the multiple execution units, and the control mechanism must ensure the integrity of data that may be operated on by multiple processes on multiple processors and potentially contained in multiple cache units. It is desirable to satisfy these objectives consistent with the further commercial objectives of increasing processing throughput, minimizing electronic device area and reducing complexity.
Both multiprocessor and uniprocessor systems usually use multi-level cache memories, where each higher level is typically smaller and has a shorter access time. The cache accessed by the processor, typically contained within the processor component of present systems, is usually the smallest and fastest cache.
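As a rough illustration of such a hierarchy (a minimal sketch in C, with hypothetical sizes and latencies, not taken from the patent), a lookup walks down the levels, paying each level's access time until it finds the block or falls through to main memory:

    #include <stdio.h>

    /* Hypothetical two-level hierarchy: L1 is small and fast, L2 larger and slower. */
    typedef struct {
        const char *name;
        int latency_cycles;   /* illustrative access time */
        unsigned lines;       /* number of direct-mapped lines in use */
        unsigned tags[1024];  /* tag per line; 0 means "empty" */
    } cache_level;

    /* Returns total cycles spent locating the block containing addr. */
    static int lookup(cache_level *levels, int nlevels, unsigned addr, int miss_penalty)
    {
        unsigned block = addr / 64;          /* 64-byte lines */
        int cycles = 0;
        for (int i = 0; i < nlevels; i++) {
            cache_level *c = &levels[i];
            unsigned idx = block % c->lines;
            cycles += c->latency_cycles;
            if (c->tags[idx] == block + 1)   /* +1 so tag 0 can mean "empty" */
                return cycles;               /* hit at this level */
            c->tags[idx] = block + 1;        /* simplified fill: install on a miss */
        }
        return cycles + miss_penalty;        /* missed every level: go to memory */
    }

    int main(void)
    {
        cache_level hier[2] = {
            { "L1", 2, 64, {0} },
            { "L2", 12, 1024, {0} },
        };
        printf("first access: %d cycles\n", lookup(hier, 2, 0x4000, 100)); /* cold miss */
        printf("second access: %d cycles\n", lookup(hier, 2, 0x4000, 100)); /* L1 hit */
        return 0;
    }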
Both data and instructions are cached, and data and instruction cache entries are typically loaded before they are needed by the operation of prefetch units and branch prediction units. Groups of instructions, called “streams”, associated with predicted execution paths can be detected and loaded into cache memory before their actual execution. Likewise, data access patterns can be predicted by stride detection circuitry, and the data loaded before the operations requiring it are executed.
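The stride detection mentioned above can be sketched as follows; the confidence threshold and all names here are illustrative assumptions, not the patent's circuitry. The detector watches successive addresses and, once the same delta repeats, predicts the next address to prefetch:

    #include <stdio.h>

    /* Hypothetical per-stream stride detector: after seeing the same address
     * delta twice in a row, it becomes confident and predicts the next address.
     * For simplicity, address 0 is treated as "no history yet". */
    typedef struct {
        unsigned long last_addr;
        long stride;
        int confidence;       /* 0..2; predict once it reaches 2 */
    } stride_detector;

    /* Feed one observed address; returns 1 and sets *predicted when confident. */
    static int observe(stride_detector *d, unsigned long addr, unsigned long *predicted)
    {
        long delta = (long)(addr - d->last_addr);
        if (d->last_addr != 0 && delta == d->stride) {
            if (d->confidence < 2) d->confidence++;
        } else {
            d->stride = delta;
            d->confidence = 0;
        }
        d->last_addr = addr;
        if (d->confidence >= 2) {
            *predicted = addr + d->stride;   /* next element of the pattern */
            return 1;
        }
        return 0;
    }

    int main(void)
    {
        stride_detector d = {0, 0, 0};
        unsigned long p;
        for (unsigned long a = 0x1000; a <= 0x1400; a += 0x100)
            if (observe(&d, a, &p))
                printf("would prefetch 0x%lx after access 0x%lx\n", p, a);
        return 0;
    }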
Although branch prediction and stride analysis can provide fairly complete availability of values within the cache connected to a processor, cache faults can still occur when a value required by a processor has not been preloaded into the highest level of cache. The resulting requests are labeled demand requests, as the values are immediately needed by a processor. In addition, requests for values that a processor will need in the future are also generated as demand requests, even though those values are not immediately required.
If prefetch requests are attached to demand requests to load values that are predicted to be needed, these requests compete with the demand requests going to a cache controller. In addition, cache controller complexity is increased if the predictions are made within the cache itself or if the cache has to distinguish between prefetch requests and demand requests. Large quantities of predicted prefetch requests can also overload the capabilities of the cache, and the ideal handling of prefetch requests, and of such overload conditions, varies with the type of application being executed.
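To make the contention concrete, here is an illustrative request queue (not derived from the patent) in which demand requests outrank prefetches: a full queue drops arriving prefetches, while an arriving demand request displaces the oldest pending prefetch if one exists:

    #include <stdbool.h>
    #include <stdio.h>

    enum req_kind { DEMAND, PREFETCH };

    typedef struct {
        enum req_kind kind;
        unsigned long addr;
    } request;

    #define QDEPTH 4

    typedef struct {
        request slots[QDEPTH];   /* slot 0 holds the oldest entry */
        int count;
    } req_queue;

    /* Enqueue a request. A full queue rejects prefetches outright, while a
     * demand request displaces the first (oldest) pending prefetch. */
    static bool enqueue(req_queue *q, request r)
    {
        if (q->count < QDEPTH) {
            q->slots[q->count++] = r;
            return true;
        }
        if (r.kind == PREFETCH)
            return false;                      /* drop low-priority work */
        for (int i = 0; i < QDEPTH; i++) {
            if (q->slots[i].kind == PREFETCH) {
                q->slots[i] = r;               /* demand evicts a prefetch */
                return true;
            }
        }
        return false;                          /* all slots hold demands: stall */
    }

    int main(void)
    {
        req_queue q = { .count = 0 };
        for (int i = 0; i < QDEPTH; i++)
            enqueue(&q, (request){ PREFETCH, 0x1000 + 64ul * i });
        bool ok = enqueue(&q, (request){ DEMAND, 0x9000 });
        printf("demand accepted: %s (queue holds %d requests)\n",
               ok ? "yes" : "no", q.count);
        return 0;
    }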
In light of the foregoing, it would be desirable to provide a method of improving prefetch handling in computer systems that speeds up core processing. It would be further advantageous if the method and apparatus allowed dynamic adjustment of prefetch request handling.
SUMMARY OF THE INVENTION
It is therefore one object of the present invention to provide an improved processor for a computer system, having a prefetch unit.
It is another object of the present invention to provide a computer system using such a processor, which also has a prefetch unit.
It is yet another object of the present invention to provide a computer system and processor that provide more efficient loading of prefetches.
The foregoing objects are achieved in a method and apparatus for loading prefetch values into a memory subsystem having associated prefetch queues, wherein when at least one of said prefetch queues is busy, a response to a demand request is selected based upon a programmed value. The programmed response may be one of: ignoring the new demand request, causing the demand request to be retried, or flushing a pending prefetch entry and replacing it with an entry associated with the demand request. The response may vary with the phase of the bus transaction, allowing a programmed ignore, flush, or retry response to be selected for each of three phases delineated by: the receipt of a response to a prior demand request, the receipt of responses to a set of prefetches associated with a prior demand request, and a phase in which prefetches associated with a prior demand request are being retried.
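The programmed selection described above might be sketched as follows. The two-bits-per-phase control-register encoding and all identifiers are assumptions made for illustration; the patent does not specify this implementation:

    #include <stdio.h>

    /* Programmed responses when a demand request arrives while a prefetch
     * queue is busy (terminology follows the summary above). */
    enum response { IGNORE = 0, RETRY = 1, FLUSH_AND_REPLACE = 2 };

    /* The three bus-transaction phases delineated in the summary. */
    enum phase {
        PRIOR_DEMAND_RESPONDED = 0,   /* response to prior demand received */
        PRIOR_PREFETCHES_RESPONDED,   /* responses to its prefetches received */
        PRIOR_PREFETCHES_RETRYING     /* prior prefetches are being retried */
    };

    /* Hypothetical encoding: two bits of a software-programmable control
     * register select the response for each phase. */
    static enum response select_response(unsigned control_reg, enum phase p)
    {
        return (enum response)((control_reg >> (2 * p)) & 0x3);
    }

    int main(void)
    {
        /* Program: ignore in phase 0, retry in phase 1, flush in phase 2. */
        unsigned control_reg = (IGNORE << 0) | (RETRY << 2) | (FLUSH_AND_REPLACE << 4);
        const char *names[] = { "ignore", "retry", "flush-and-replace" };
        for (int p = 0; p < 3; p++)
            printf("phase %d -> %s\n", p, names[select_response(control_reg, p)]);
        return 0;
    }

Keeping the selection in a software-visible register is one way to obtain the dynamic adjustment of prefetch request handling called for above, since software can reprogram the per-phase responses to suit the running application.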
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.
REFERENCES:
patent: 4918587 (1990-04-01), Pechter et al.
patent: 5367656 (1994-11-01), Ryan
patent: 5588128 (1996-12-01), Hicok et al.
patent: 5694568 (1997-12-01), Harrison, III et al.
patent: 5790823 (1998-08-01), Puzak et al.
patent: 5802566 (1998-09-01), Hagersten
patent: 5802569 (1998-09-01), Genduso
patent: 5809320 (1998-09-01), Jain
patent: 5848432 (1998-12-01), Hotta
patent: 5953512 (1999-09-01), Cai et al.
patent: 6047363 (2000-04-01), Lewchuk
patent: 6138212 (2000-10-01), Chiacchia
patent: 6173392 (2001-01-01), Shinozaki
patent: 6272619 (2001-08-01), Nguyen et al.
U.S. patent application Ser. No. 09/436,372, Arimilli et al., filed Nov. 9, 1999.
U.S. patent application Ser. No. 09/052,567, Burky et al., filed Mar. 31, 1998.
Kim, Sunil, et al., “Stride-directed Prefetching for Secondary Caches”, IEEE, 1997, pp. 314-321.
Fu, John W. C., et al., “Stride Directed Prefetching in Scalar Processors”, IEEE, 1992, pp. 102-110.
Dahlgren, Fredrik, et al., “Effectiveness of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors”, IEEE, 1995, pp. 68-77.
Arimilli, Ravi Kumar
Dodson, John Steven
Fields, James Stephen, Jr.
Guthrie, Guy Lynn
McLean, Kimberly N.
Salys, Casimer K.