Method and apparatus for assuring cache coherency

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

Classifications: C711S147000, C710S053000, C710S056000, C710S052000
Type: Reexamination Certificate
Status: active
Patent number: 06295581

ABSTRACT:

TECHNICAL FIELD OF THE INVENTION
This invention relates generally to the use of cache memory in computer systems and more particularly to cache memory for image and video processing systems, database access systems, Computer Aided Design systems, and the like.
BACKGROUND OF THE INVENTION
Cache memory is used to optimize computer system performance by temporarily storing data in memory devices that allow for high speed access, in comparison to data retrieval from low speed memory such as disks or tapes. Cache memory is used to mirror the data in the low speed memory so that each access to the data is effected as an access to the high speed cache memory, thereby avoiding the latency associated with an access to the low speed memory. The initial access to the data incurs the latency time loss to access the data from the low speed memory, but once the data is stored in the cache, subsequent accesses to the data are via the high speed cache access. The cache is structured to mirror a block of memory, so that subsequent access to data in proximity to the initially accessed data is also via the high speed cache access. Cache memory is conventionally structured to provide access to multiple blocks of memory. As shown in FIG. 1, blocks C0, C1, C2, and C3 form cache memory areas within a cache memory 130. FIG. 1 represents a conventional processing system with indexed cache memory. The client process 110 accesses data contained in the memory 100 via the memory access system 120. The client process 110 communicates a stream of data commands 115 via the command bus 112, and the data associated with the command stream 115 is communicated via the data bus 111.
The memory access system 120 contains a cache memory 130 partitioned into cache locations C0, C1, C2, and C3. Each of these cache locations is capable of storing a copy of a block of memory A, B, C, etc. of memory 100. The cache memory 130 has a speed of access which is substantially greater than the speed of access of memory 100. By storing copies of the blocks of memory 100 in the cache memory 130, substantial access speed improvements can be achieved when multiple accesses to the data within a block occur.
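By way of illustration only (the patent gives no source code), the structure just described can be sketched in C++. The block size, the four-location cache, and every identifier below are assumptions chosen for the sketch, not elements of the patent:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t BLOCK_SIZE = 256;   // assumed bytes mirrored per cache location
constexpr std::size_t NUM_LOCATIONS = 4;  // cache locations C0..C3 of FIG. 1

struct CacheLocation {
    bool valid = false;        // does this location currently hold a copy of a block?
    bool dirty = false;        // has the copy been modified since it was filled?
    std::size_t blockId = 0;   // which memory block (A, B, C, ...) is mirrored
    std::array<std::uint8_t, BLOCK_SIZE> data{};  // the high speed copy itself
};

struct CacheMemory {
    std::array<CacheLocation, NUM_LOCATIONS> locations;  // cache memory 130
};

// Memory 100 viewed as a sequence of fixed-size blocks: the block that
// contains a given address is found by simple division.
inline std::size_t blockOf(std::size_t address) { return address / BLOCK_SIZE; }
```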
The data commands from the client process 110 are received by the operation generator 140 within the memory access system 120. The client data commands direct a transfer of data to or from a memory address, such as a read or write of data, or a combination, such as a read-modify-write of data. The operation generator 140 generates a series of commands applicable to the memory control 160 and the memory 100 to accomplish each client data command. The operation generator interprets the data command to determine which memory block A, B, C, etc. of memory 100 includes the requested memory address. It also determines whether a copy of the identified memory block is already contained in the cache memory 130. If the memory block is in the cache memory, the operation generator identifies which cache location C0, C1, etc. contains the copy of the memory block, and formulates a command to effect the data command with this identified cache location.
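A minimal sketch of this hit/miss decision follows; the patent does not specify the search mechanism, so the linear scan and the names `findLocation` and `Location` are illustrative assumptions:

```cpp
#include <cstddef>
#include <optional>

constexpr std::size_t BLOCK_SIZE = 256;
constexpr std::size_t NUM_LOCATIONS = 4;

struct Location { bool valid = false; std::size_t blockId = 0; };

// Returns the index of the cache location holding the block that contains
// `address`, or nothing on a miss (a location must then be allocated).
std::optional<std::size_t> findLocation(const Location (&cache)[NUM_LOCATIONS],
                                        std::size_t address) {
    const std::size_t block = address / BLOCK_SIZE;  // which block A, B, C, ...
    for (std::size_t i = 0; i < NUM_LOCATIONS; ++i)
        if (cache[i].valid && cache[i].blockId == block)
            return i;         // hit: effect the command against location Ci
    return std::nullopt;      // miss: the operation generator must allocate
}
```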
If the memory block is not contained in the cache memory, the operation generator allocates one of the cache locations to this memory block. Typically, the allocated cache location will have been allocated to another memory block prior to this data command. Therefore, the operation generator must determine whether any action need be taken with regard to the data currently stored in the identified cache location. If, for example, the copy of the data in the cache location had only been used for reading the data contained in a memory block, no action need be taken, and the new memory block data will merely overwrite the prior data. If, however, new data had been written to this cache location, intended to be written to the associated memory block, the copy of the data in the cache location must be written to the memory block before the new memory block data is read into this cache location. Thus, in this case, the operation generator will formulate a command to write the data in the cache location to its previously associated memory block, followed by the command to read the new memory block into this cache location. The command to write data from the cache location to the memory is termed a “flush” of the cache location; the command to read data into the cache location from the memory is termed a “fill” of the cache location.
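The flush-then-fill sequence the operation generator formulates can be sketched as below; the `Command` encoding and all names are assumptions made for illustration:

```cpp
#include <cstddef>
#include <vector>

enum class Op { Flush, Fill };
struct Command { Op op; std::size_t location; std::size_t blockId; };

struct Victim { bool dirty; std::size_t location; std::size_t oldBlock; };

// On a miss, emit at most two memory commands for the chosen victim
// location: a Flush only if its copy was modified, always followed by a
// Fill of the newly requested block.
std::vector<Command> formulate(const Victim& v, std::size_t newBlock) {
    std::vector<Command> cmds;
    if (v.dirty)  // the modified copy must reach memory before being overwritten
        cmds.push_back({Op::Flush, v.location, v.oldBlock});
    cmds.push_back({Op::Fill, v.location, newBlock});
    return cmds;
}
```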
When the cache memory is full and another request arrives, the operation generator allocates one of the cache locations to the new request. A variety of allocation algorithms can be applied to determine which cache location is to be reallocated, such as least recently used algorithms, indexed algorithms, and others. Before the operation generator reallocates one of the cache locations, it first determines whether the data contained in that cache location is still needed. Typically, the data will still be needed if it has been modified and the modifications have not been written back to memory. In that case, the new data request cannot be processed in the cache location until the modified data has been written back to the memory. While this writing occurs, the processing of the data request is halted, which, depending on the nature of the data, may completely halt the processing of the computer system.
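As one example of the allocation algorithms mentioned, a least-recently-used victim selection might look like the following sketch; the timestamp scheme and field names are illustrative assumptions:

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t NUM_LOCATIONS = 4;

struct Loc { bool dirty = false; std::uint64_t lastUse = 0; };

// Pick the location whose last use is oldest. The caller must still flush
// the victim first if `dirty` is set, stalling the new request until the
// write-back completes.
std::size_t chooseVictim(const Loc (&cache)[NUM_LOCATIONS]) {
    std::size_t victim = 0;
    for (std::size_t i = 1; i < NUM_LOCATIONS; ++i)
        if (cache[i].lastUse < cache[victim].lastUse)
            victim = i;
    return victim;
}
```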
There are several techniques to minimize the occurrence of a processing halt. For example, in a pipeline process, memory access requests are provided a few stages ahead of when the data is needed. If the data is not available when it is to be processed, however, the process is halted until the data is available. By providing stages between the request and the data availability, the memory access system is given time to obtain the data from the slower memory, and the likelihood of the client process having to be halted is therefore reduced.
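A rough sketch of this lookahead arrangement follows; the three-stage separation between request and use, and the `Pipeline` structure itself, are assumptions for illustration:

```cpp
#include <cstddef>
#include <deque>
#include <optional>

constexpr std::size_t LOOKAHEAD = 3;  // assumed stages between request and use

struct Pipeline {
    std::deque<std::size_t> inFlight;  // addresses requested but not yet used

    // Called once per stage: request the address needed LOOKAHEAD stages from
    // now, and return the address whose data should be ready this stage.
    std::optional<std::size_t> step(std::size_t futureAddress) {
        inFlight.push_back(futureAddress);             // issue the request early
        if (inFlight.size() <= LOOKAHEAD) return std::nullopt;  // pipeline filling
        std::size_t ready = inFlight.front();          // requested 3 stages ago
        inFlight.pop_front();
        return ready;  // if its fill has not completed, the client stalls here
    }
};
```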
Another technique is to “spawn” the current and subsequent commands as sub-processes, so that a command may be issued before the preceding commands have completed. The asynchronous nature of spawned processes, however, requires control over the sequencing of the spawned commands. Consider, for example, a command to flush modified data, followed by a command to fill from the same memory block. If the fill and flush commands are processed asynchronously and in parallel, the fill may occur before the flush. If the fill occurs before the flush, the modified data in the cache location will be overwritten by the data filled from memory, and the modifications will be lost. To avoid the potential errors caused by spawned processes, the commands and data must be processed in a coherent manner.
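The required flush-before-fill ordering can be sketched as follows; the use of std::async and a blocking wait is merely one illustrative way to express the constraint, not the patent's mechanism:

```cpp
#include <cstddef>
#include <functional>
#include <future>

struct Location { std::size_t blockId = 0; bool dirty = false; };

void flush(Location& loc) { loc.dirty = false; /* write the copy back to memory */ }
void fill(Location& loc, std::size_t newBlock) { loc.blockId = newBlock; }

void replaceBlock(Location& loc, std::size_t newBlock) {
    if (loc.dirty) {
        // Spawn the flush asynchronously, but make the fill wait on it:
        // without the wait, the fill could overwrite the modified data
        // before it reaches memory, exactly the hazard described above.
        std::future<void> pending =
            std::async(std::launch::async, flush, std::ref(loc));
        pending.wait();  // enforce flush-before-fill for the same location
    }
    fill(loc, newBlock);
}
```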
A direct means of assuring data consistency is to force a strict ordering on the sequencing of commands, and to preclude the execution of a command until the preceding command has completed. This purely sequential processing, however, is inefficient, because not all commands are dependent upon one another. For example, if a read command follows a flush command, there is no need to delay the execution of the read command until the flush command completes.
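A dependency check of this kind might be sketched as a simple predicate; the operation set and the conflict rule below are assumptions chosen to be consistent with the example in the text:

```cpp
#include <cstddef>

enum class Op { Read, Write, Fill, Flush };
struct Cmd { Op op; std::size_t location; };

// A command modifies the cached copy only if it is a Write or a Fill;
// Read and Flush merely consume the copy.
bool modifiesCopy(Op op) { return op == Op::Write || op == Op::Fill; }

// `later` must wait for `earlier` only when both touch the same cache
// location and at least one side modifies the copy. A read following a
// flush therefore need not wait, matching the example above, while a fill
// following a flush of the same location must.
bool mustWait(const Cmd& earlier, const Cmd& later) {
    if (earlier.location != later.location) return false;  // independent
    return modifiesCopy(earlier.op) || modifiesCopy(later.op);
}
```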
The processing of commands, even with dependency checks, must still occur sequentially to avoid memory deadlocks. That is, for example, when all the cache locations are allocated and a new read request arrives, the dependency check will hold the read pending until one of the cache locations is flushed. The flushing of a cache location is held pending until the completion of the read or write requests to that cache location. Unless tight controls are placed on the ordering and processing of read, write, fill, and flush operations, the flushing of a cache location can become dependent upon the completion of a read request that is itself pending upon the completion of this flushing, thereby resulting in a deadlock situation, precluding subsequent processing.
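The circular wait described here is, in effect, a cycle in a wait-for graph (the read waits on the flush, which waits on the read). A minimal cycle-detection sketch follows; the graph encoding is an assumption, and the patent does not describe detection of this kind:

```cpp
#include <cstddef>
#include <vector>

// waitsFor[i] lists the operations that operation i is waiting on.
bool hasCycle(const std::vector<std::vector<std::size_t>>& waitsFor,
              std::size_t node, std::vector<int>& state) {
    state[node] = 1;  // visit in progress
    for (std::size_t next : waitsFor[node]) {
        if (state[next] == 1) return true;  // back edge: a circular wait
        if (state[next] == 0 && hasCycle(waitsFor, next, state)) return true;
    }
    state[node] = 2;  // fully explored
    return false;
}

bool deadlocked(const std::vector<std::vector<std::size_t>>& waitsFor) {
    std::vector<int> state(waitsFor.size(), 0);
    for (std::size_t i = 0; i < waitsFor.size(); ++i)
        if (state[i] == 0 && hasCycle(waitsFor, i, state)) return true;
    return false;
}
```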
In a conventional cache system of FIG. 1, the command buffer 170 is a first-in first-out (FIFO) buffer, the
