Title: System and method for pre-fetching for pointer linked data...
Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
US Class: C712S237000
Type: Reexamination Certificate
Patent Number: 06782454
Filed: 2000-09-29
Issued: 2004-08-24
Examiner: Lane, Jack A. (Department: 2186)
Status: active
ABSTRACT:
FIELD
The present invention relates generally to information and data storage communication and management, and more particularly to a system and method for prefetching data stored in a pointer linked data structure.
BACKGROUND
Modern computer systems generally include a central processing unit (CPU) or processor for processing data and a memory system for storing operating instructions and data. Typically, the speed at which the processor can decode and execute instructions exceeds the speed at which instructions and data are transferred between the memory system and the processor, so the processor is often forced to wait for the memory system to respond. This delay is commonly known as memory latency. To reduce, if not eliminate, this latency, many computer systems now include a faster memory known as a cache memory between the processor and main-memory.
A cache memory reduces the memory latency period by temporarily storing a small subset of data from a lower-level memory such as a main-memory or mass-storage device. When the processor needs information for an application, it first checks the cache. If the information is found in the cache (known as a cache-hit), the information is retrieved from the cache and execution of the application resumes. If the information is not found in the cache (known as a cache-miss), the processor proceeds to access the lower-level memories. Information accessed in the lower-level memories is simultaneously written to the cache so that, should the information be required again in the near future, it can be obtained directly from the cache, thereby reducing or eliminating any memory latency on subsequent read operations.
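By way of illustration only, the hit/miss behavior described above can be modeled with a minimal direct-mapped cache in C. The organization, sizes, and all names here (cache_access, NUM_LINES, LINE_SIZE) are assumptions made for this sketch, not the patent's design.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define NUM_LINES 256
    #define LINE_SIZE 64            /* bytes per cache line */

    typedef struct {
        bool     valid;
        uint64_t tag;
        uint8_t  data[LINE_SIZE];
    } CacheLine;

    static CacheLine cache[NUM_LINES];

    /* Returns true on a cache-hit. On a cache-miss, the whole line is
     * filled from lower-level memory (simulated here by memcpy) so that
     * a subsequent access to the same line hits. */
    bool cache_access(const uint8_t *memory, uint64_t addr)
    {
        uint64_t line  = addr / LINE_SIZE;
        uint64_t index = line % NUM_LINES;
        uint64_t tag   = line / NUM_LINES;

        if (cache[index].valid && cache[index].tag == tag)
            return true;                                /* cache-hit */

        /* cache-miss: fetch the whole line from lower-level memory */
        memcpy(cache[index].data, memory + line * LINE_SIZE, LINE_SIZE);
        cache[index].valid = true;
        cache[index].tag   = tag;
        return false;
    }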
Use of a cache also reduces the memory latency period during write operations by writing to the cache. This reduces memory latency in two ways. First, it enables the processor to write at the much greater speed of the cache, and second, storing or loading the data into enables it to be obtained directly from the cache should the processor need the data again in the near future.
Caches rely on the principles of temporal and spatial locality to speed up memory access operations. These principles of locality are based on the assumption that, in general, a computer program accesses only a relatively small portion of the information available in computer memory in a given period of time. In particular, temporal locality holds that if some information is accessed once, it is likely to be accessed again soon, and spatial locality holds that if one memory location is accessed, then other nearby memory locations are also likely to be accessed. Thus, to exploit temporal locality, caches temporarily store information from a lower-level memory the first time it is accessed so that if it is accessed again soon it need not be retrieved from the lower-level memory. To exploit spatial locality, caches transfer several blocks of data from contiguous addresses in lower-level memory, besides the requested block of data, each time data is written to the cache from lower-level memory.
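Continuing the illustrative model above, the following sketch shows spatial locality paying off: a sequential byte scan misses only on the first access to each 64-byte line and hits on all of that line's neighbors. The scan function is hypothetical and reuses the cache_access model from the previous sketch.

    #include <stdio.h>

    /* For n = 4096 this counts 64 misses (one per line) and 4032 hits. */
    void scan(const uint8_t *memory, size_t n)
    {
        size_t hits = 0, misses = 0;
        for (size_t i = 0; i < n; i++) {
            if (cache_access(memory, i))
                hits++;
            else
                misses++;
        }
        printf("hits=%zu misses=%zu\n", hits, misses);
    }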
Another method of reducing memory latency is commonly known as prefetching. Prefetching involves identifying memory access operations likely to result in a cache-miss, and generating and executing prefetch instructions that store data from main-memory into the cache before the processor actually needs it. Various approaches and mechanisms have been tried in an attempt to predict the processor's needs ahead of time. One simple approach exploits the principle of spatial locality by prefetching the block of data immediately following the block last referenced by the processor.
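That one-block-lookahead approach can be sketched, again with the hypothetical cache_access model from above, as a demand access followed immediately by a fetch of the next block:

    /* Demand access plus prefetch of the immediately following block. */
    void access_with_prefetch(const uint8_t *memory, uint64_t addr)
    {
        cache_access(memory, addr);              /* demand access */
        cache_access(memory, addr + LINE_SIZE);  /* prefetch next block */
    }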
While a significant improvement over cache systems without prefetching, all of the prior-art prefetching mechanisms suffer from a common shortcoming: they are embedded in the hardware or firmware of the cache controller and cannot adapt to the program being executed or to the data structure being accessed; they simply retrieve a set amount of data from a set range in memory. Thus, while conventional prefetching mechanisms may work well in prefetching data stored sequentially, such as in an array whose elements are stored in contiguous portions of memory, they do not work with pointer linked data structures, such as lists or trees. Pointer linked data structures consist of a number of nodes, each containing data and a link, which is the address of the next node. Typically, the individual nodes are allocated separately and are widely scattered throughout main-memory. Therefore, there is a need for a system and method for prefetching data in a pointer linked data structure.
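A pointer-linked node of the kind described can be modeled in C as follows; the field names (payload, next) and the push helper are illustrative only. Because each node is allocated separately, consecutive nodes need not be adjacent in memory, which is exactly why a prefetcher that fetches a fixed range of contiguous addresses gains nothing here.

    #include <stdlib.h>

    /* A node holds data plus a link: the address of the next node. */
    typedef struct Node {
        int          payload;   /* the node's data */
        struct Node *next;      /* link: address of the next node */
    } Node;

    /* Each call allocates a node separately, so successive nodes can
     * land far apart in main-memory. */
    Node *push(Node *head, int value)
    {
        Node *n = malloc(sizeof *n);
        n->payload = value;
        n->next    = head;
        return n;
    }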
SUMMARY
The present invention provides a system and method for efficiently prefetching data in a pointer linked data structure.
In one aspect, the present invention provides a data processing system including a processor capable of executing a program, a main-memory, and a prefetch engine configured to prefetch data from a plurality of locations in main-memory in response to a prefetch request from the processor. When the data in main-memory is organized as a linked data structure having a plurality of nodes, each with data stored therein, the prefetch engine is configured to traverse the linked data structure and prefetch data from the nodes. Generally, the prefetch request includes a starting address of a first node to be prefetched, an offset value, and a termination value. The prefetch engine is configured to determine, from data contained in the prefetched first node at the offset value from the address of the first node, an address for a second node to be prefetched.
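A hedged sketch of that request format and next-address rule: the three fields and the rule (read the word at node address plus offset to obtain the next node's address) come from the text above, while the struct layout and the names PrefetchRequest and next_node are assumptions for illustration.

    #include <stdint.h>

    typedef struct {
        uintptr_t start;        /* address of the first node to prefetch */
        uintptr_t offset;       /* byte offset of the link within a node */
        uintptr_t termination;  /* end address or node count */
    } PrefetchRequest;

    /* The word stored at (node address + offset) inside a prefetched
     * node is the address of the next node to prefetch. */
    static uintptr_t next_node(uintptr_t node, uintptr_t offset)
    {
        return *(const uintptr_t *)(node + offset);
    }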
In one embodiment the termination value is, for example, an end address, and the prefetch engine is configured to compare the address of the last node to be prefetched to the termination value to determine whether the prefetch request has been satisfied. Alternatively, where the termination value is a number of nodes to be prefetched, the prefetch engine is configured to count the number of nodes prefetched as they are prefetched and to compare the number of nodes prefetched to the termination value.
In another embodiment, the prefetch engine includes a number of sets of prefetch registers, one set of prefetch registers for each prefetch request from the processor that is yet to be completed. Each set of prefetch registers includes (i) a prefetch address register; (ii) an offset register; (iii) a termination register; (iv) a status register; and (v) a returned data register.
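The five registers enumerated above might be modeled in software as follows; the 64-bit widths, the field names, and the number of outstanding requests are all assumed for illustration.

    #include <stdint.h>

    /* One register set per prefetch request yet to be completed. */
    typedef struct {
        uint64_t prefetch_addr;  /* (i)   address of the node being fetched */
        uint64_t offset;         /* (ii)  offset of the link within a node */
        uint64_t termination;    /* (iii) end address or remaining count */
        uint64_t status;         /* (iv)  e.g. idle / active / done */
        uint64_t returned_data;  /* (v)   data returned from main-memory */
    } PrefetchRegisterSet;

    #define MAX_OUTSTANDING 4
    static PrefetchRegisterSet prefetch_regs[MAX_OUTSTANDING];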
In yet another embodiment, the data processing system includes a cache capable of storing data transferred between the processor and a main-memory, and the prefetch engine is configured to write the prefetched data from the node to the cache. In one version of this embodiment, the data processing system further includes a cache controller configured to store data in the cache, and the prefetch engine is configured to issue a prefetch instruction to the cache controller.
In another aspect, the present invention is directed to a method of prefetching data in a data processing system having a pointer linked data structure with a number of nodes, each with data stored therein. In the method, a prefetch request from a processor is received in a prefetch engine, the prefetch request including a starting address of a node, an offset value, and a termination value. Data in the node is prefetched. It is then determined whether a termination condition has been met and the prefetch request satisfied. If not, the offset and the starting address are used to load a new starting address from the prefetched data; that is, the address indicated by the sum of the starting address and the offset holds the new starting address for the next node to be prefetched. The process is repeated until the termination condition is met.
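Putting the pieces together, here is a software model of the described loop, reusing the hypothetical PrefetchRequest type from the earlier sketch and assuming a count-based termination value; prefetch_into_cache is only a placeholder name for the hardware's actual fetch.

    /* Stand-in for the hardware's actual fetch into the cache. */
    static void prefetch_into_cache(uintptr_t node_addr)
    {
        (void)node_addr;
    }

    /* Prefetch the node, check termination, then load the new starting
     * address from the prefetched data at (starting address + offset). */
    void run_prefetch(PrefetchRequest req)
    {
        uintptr_t node = req.start;
        uintptr_t fetched = 0;
        while (node != 0) {
            prefetch_into_cache(node);            /* prefetch this node */
            if (++fetched == req.termination)     /* request satisfied */
                break;
            node = *(const uintptr_t *)(node + req.offset);
        }
    }

In a test harness one might build a list of the Node structures sketched earlier and call run_prefetch with start set to the head node's address and offset set to offsetof(Node, next); with an end-address termination value, the count check would instead compare node against req.termination.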
In one embodiment, the termination value is an end address, in which case the step of determining whether the termination condition has been met involves comparing the address of the last node from which data was prefetched to the end address.
Inventor: Baker, Paul
Examiner: Lane, Jack A.
Attorney/Agent: Martine & Penilla LLP
Assignee: Sun Microsystems Inc.