Pipelined cache memory deallocation and storeback

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S118000, C711S133000, C711S140000, C711S159000

active

06298417

ABSTRACT:

BACKGROUND
1. Field of the Present Invention
The present invention generally relates to cache memory systems and more particularly to a method and circuit for reducing latencies associated with copyback transactions in cache memory subsystems that employ multiple byte cache lines.
2. History of Related Art
Microprocessor based computer systems are typically implemented with a hierarchy of memory subsystems designed to provide an appropriate balance between the relatively high cost of fast memory subsystems and the relatively low speed of economical subsystems. Typically, the fastest memory subsystem associated with a computer system is also the smallest and most expensive. Because the hit rate of any given cache memory subsystem is a function of the size of the subsystem, the smallest and fastest memory subsystems typically have the highest miss rate. To achieve optimal performance, many computer systems implement a copyback policy in which data written by the system's microprocessor is initially stored in the cache. The cache data is then typically written back to system memory at a later time by a memory control unit. In this manner, the number of time-consuming accesses to system memory that must be made by the processor is greatly reduced. The performance enhancement achieved by a copyback cache policy comes at the cost of increased bus bandwidth required to maintain cache/system memory coherency. In addition, microprocessors are increasingly utilized in multi-tasking systems to carry out processor intensive applications that result in unprecedented cache traffic and the generation of relatively frequent cache miss transactions. Thus, performance problems arising from multiple pending cache miss events are becoming increasingly common.
A cache miss occurs when a bus master such as the microprocessor is required to read information from or write information to a location in system memory that is not presently reproduced in the cache memory subsystem. Cache miss transactions in copyback cache architectures can have a greater latency due to the system overhead required to transfer the contents of the cache line associated with the cache miss event to system memory prior to completing the pending transaction. This overhead can increase as the line size of the cache memory subsystem increases because more clock cycles will be required to fully transfer the contents of a dirty or modified cache line to an appropriate storage location before filling the cache line with the data associated with the cache miss. Unfortunately, long cache lines are frequently employed to reduce the circuitry required to implement a cache tag RAM, to take advantage of memory reference locality, and to take advantage of special multiple byte transfer cycles, such as burst write and burst read cycles, designed into many modern memory devices. Accordingly, it would be advantageous to provide a method and circuit to improve the efficiency with which multiple pending cache miss transactions are handled in a copyback cache architecture.
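The relationship between line size and storeback overhead can be illustrated with a rough calculation. The sketch below is not taken from the patent; the function name `storeback_beats` and the line and bus widths are hypothetical, and it assumes one bus-width transfer (beat) per clock with no wait states.

```python
def storeback_beats(line_bytes, bus_bytes):
    """Number of bus-width transfers needed to write back one cache line.

    Assumption (illustrative only): each beat moves `bus_bytes` bytes and
    a partial final beat still costs a full beat.
    """
    return -(-line_bytes // bus_bytes)  # ceiling division


# Doubling the (hypothetical) line size doubles the beats a miss must wait for.
for line_bytes in (32, 64, 128):
    print(line_bytes, storeback_beats(line_bytes, bus_bytes=8))
```

Under these assumptions the writeback cost grows linearly with line size, which is the overhead the text attributes to long cache lines.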
SUMMARY OF THE INVENTION
The problems identified above are in large part addressed by a deallocation pipelining circuit for use with cache memory systems incorporating multiple byte cache lines. By pipelining the transfer of modified cache data to a storeback buffer and by pipelining the transfer from the storeback buffer to a bus interface unit, the present invention introduces an efficient and practical circuit and method for reducing latency caused by multiple cycle storeback transactions.
Broadly speaking, the present invention contemplates a method of deallocating a cache memory line. In a first line transfer, first line data is copied from a first line of the cache to a buffer, such as a storeback buffer, in response to a cache miss that initiates a deallocation of the first line. The first line data is then copied, in a first storeback transfer that is responsive to the first line transfer, from the buffer to backing memory, such as a system memory or a higher level cache. In response to the first storeback transfer, the first cache line is deallocated before the first storeback transfer completes. In this manner, a pending fill of the first line begins before the first line data is fully transferred to the backing memory.
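The ordering of events in this method can be sketched in software. The following is a minimal behavioral model, not the patented circuit: the dictionary layout, the function name `deallocate_and_fill`, and the event labels are all hypothetical, and real hardware would overlap these steps in time rather than run them sequentially.

```python
def deallocate_and_fill(line, backing, new_tag, new_data):
    """Model one dirty-line deallocation and return the ordered event log.

    `line` is a dict with keys tag, data, modified, valid (hypothetical layout).
    `backing` is a dict standing in for system memory or a higher level cache.
    """
    log = []
    buffer = None  # stands in for the storeback buffer

    if line["valid"] and line["modified"]:
        # First line transfer: copy the modified line into the buffer.
        buffer = (line["tag"], list(line["data"]))
        log.append("line_transfer")
        # First storeback transfer begins (buffer -> backing memory).
        log.append("storeback_start")
        # The line is deallocated while the storeback is still in flight.
        line["valid"] = False
        log.append("deallocate")

    # The pending fill can start now: the old data is held in the buffer.
    line.update(tag=new_tag, data=list(new_data), modified=False, valid=True)
    log.append("fill_start")

    if buffer is not None:
        # In this model the storeback completes after the fill has started.
        old_tag, old_data = buffer
        backing[old_tag] = old_data
        log.append("storeback_done")
    return log


line = {"tag": 0x10, "data": [1, 2, 3, 4], "modified": True, "valid": True}
backing = {}
log = deallocate_and_fill(line, backing, 0x20, [9, 9, 9, 9])
print(log)
```

The point of the model is only the ordering: `fill_start` appears before `storeback_done`, and the old data still reaches the backing store.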
Preferably, the cache miss initiates the deallocation of a cache line corresponding to the cache miss if the corresponding cache line includes modified data, which is preferably indicated by at least one status bit corresponding to the cache line. In one embodiment, the storeback transfer includes an interim transfer to a bus interface unit where the storeback data resides until the bus interface unit transfers the data to the backing memory. This embodiment may be suitably implemented by issuing a storeback request signal to the bus interface unit as part of the first storeback transfer. In one embodiment, the first storeback transfer further includes the bus interface unit sending a data acknowledge signal and a request acknowledge signal responsive to the storeback request.
In one embodiment, the storeback buffer includes first and second segments. In this embodiment, the first storeback transfer includes a first portion during which the first segment is copied and a second portion during which the second segment is copied. The first portion may precede or follow the second portion. Data from a second line in the cache is then copied during a second line transfer. This second line transfer is responsive to the earlier portion of the first storeback transfer completing. In this manner, the second line transfer begins before the first storeback transfer completes, thereby reducing the latency that results from waiting for the entire storeback transfer to complete.
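The benefit of the segmented buffer can be shown with a simple timing estimate. This is an illustrative sketch under stated assumptions, not a description of the actual circuit: the function names, the fixed `drain_cycles` per segment, and the rule that segments drain back to back are all hypothetical.

```python
def segment_free_times(num_segments=2, drain_cycles=4):
    """Cycle (counted from storeback start) at which each segment is drained.

    Assumption: segments drain one after another, each taking `drain_cycles`.
    """
    return [(i + 1) * drain_cycles for i in range(num_segments)]


def second_line_transfer_start(num_segments=2, drain_cycles=4, pipelined=True):
    """Earliest cycle at which the second line transfer may begin."""
    freed = segment_free_times(num_segments, drain_cycles)
    # Pipelined: start once the earlier portion of the storeback completes.
    # Unpipelined: wait until the whole storeback transfer has completed.
    return freed[0] if pipelined else freed[-1]


print(second_line_transfer_start(pipelined=True))   # earlier portion only
print(second_line_transfer_start(pipelined=False))  # entire storeback
```

With two segments and these assumed timings, the second line transfer starts after half of the storeback time instead of all of it.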
The present invention further contemplates a deallocation pipelining circuit of a cache memory subsystem. The deallocation pipelining circuit is configured to initiate copying of first line data stored in a first cache line of a cache memory array to a buffer during a first line transfer. The first line transfer is suitably responsive to a cache miss that initiates a deallocation of the first line. The circuit is further configured to initiate the copying of the first line data from the buffer to a backing memory in a first storeback transfer that is responsive to the first line transfer. The circuit is configured to deallocate the first cache line in response to the first storeback transfer, such that a pending fill of the first cache line may begin before the first storeback transfer completes.
In one embodiment, the circuit is configured to detect the cache miss and a modification status of the first line. In this embodiment, the circuit is preferably configured to initiate the first line transfer if the cache miss corresponds to the first line and the first line is modified. In one embodiment, the circuit is configured to issue a storeback request signal to a bus interface unit in response to the first line transfer, receive a data acknowledge signal from the bus interface unit in response to the storeback request signal, and initiate the first storeback transfer in response to the data acknowledge signal. In one embodiment, the circuit may be further configured to receive a request acknowledge signal from the bus interface unit initiating the deallocation of the first line.
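The handshake with the bus interface unit described above can be sketched as a small event-driven model. The class name `DeallocationController`, its method names, and the event labels are hypothetical; the model only captures which signal triggers which action and assumes the two acknowledge signals may arrive in either order.

```python
class DeallocationController:
    """Behavioral sketch of the storeback request/acknowledge handshake."""

    def __init__(self):
        self.events = []
        self.awaiting_data_ack = False
        self.awaiting_request_ack = False

    def cache_miss(self, maps_to_line, line_modified):
        # The line transfer starts only for a miss on a modified line.
        if not (maps_to_line and line_modified):
            return False
        self.events.append("line_transfer")
        # The storeback request is issued in response to the line transfer.
        self.events.append("storeback_request")
        self.awaiting_data_ack = True
        self.awaiting_request_ack = True
        return True

    def data_acknowledge(self):
        # Data acknowledge from the bus interface unit starts the storeback.
        if self.awaiting_data_ack:
            self.awaiting_data_ack = False
            self.events.append("storeback_transfer")

    def request_acknowledge(self):
        # Request acknowledge initiates deallocation of the line.
        if self.awaiting_request_ack:
            self.awaiting_request_ack = False
            self.events.append("deallocate_line")


ctrl = DeallocationController()
ctrl.cache_miss(maps_to_line=True, line_modified=True)
ctrl.data_acknowledge()
ctrl.request_acknowledge()
print(ctrl.events)
```

Acknowledge signals that arrive with no request outstanding are ignored in this sketch, which is a design choice of the model rather than something the text specifies.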
The buffer might suitably include first and second (or more) segments, and the first storeback transfer might suitably include a first portion copying the first segment and a second portion copying the second segment. The first portion may either precede or follow the second portion. In such an embodiment, the circuit is preferably configured to initiate a second line transfer, comprising copying data from a second cache line to the buffer, in response to the earlier portion of the first storeback transfer completing. Accordingly, the second line transfer may begin before the later portion of the first storeback transfer completes. This embodiment is suitably implemented wherein the deallocation pipelining cir
