Method and architecture for data coherency in...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate (active)
U.S. Class: C711S123000
Patent number: 06243791


BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to caches in computer architectures and more specifically to multiple cache sets with specialized functionality.
2. Description of the Related Art
Microprocessor systems include various types of memory which store the instructions and data by which the microprocessor operates. The memory is organized along the lines of a general hierarchy which is illustrated in FIG. 1. The hierarchy is organized in order of increasing memory access time, with the memory level having the fastest access time positioned relatively closer to the central processing unit (CPU) of the microprocessor system. Registers are the fastest memory devices and are generally internal architectural units within the microprocessor. Toward the middle level is main memory, which is typically constructed using semiconductor memory devices, such as random access memory (RAM) chips, and which is directly accessed by the microprocessor through an external bus. Mass storage represents relatively large amounts of memory that are not directly accessed by the microprocessor, such as magnetic disks or CD-ROM, and which are typically much slower to access than main memory. Archival storage represents long-term memory which typically requires human intervention for access, such as the loading of a magnetic tape.
In addition, microprocessor systems typically include cache memory at a level in between the registers and main memory which contains copies of frequently used locations in main memory. For each entry in a cache memory, there is a location to store the data and a tag location that identifies the corresponding location in main memory with which the data is associated. When the microprocessor outputs an address value on the memory bus at the beginning of a data access cycle, the address value is compared to the tags in cache memory to determine whether a match exists. A match of an address value to a cache tag is called a cache hit and the data is accessed in cache rather than main memory.
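The tag-match step described above can be sketched in code. The following is a minimal, hypothetical model of a direct-mapped cache lookup; the sizes, field names, and address decomposition are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the tag-match step: a direct-mapped cache with
# 16-byte lines. Sizes and names are illustrative assumptions.
LINE_SIZE = 16      # bytes per cache line
NUM_LINES = 1024    # entries in the cache

# Each entry holds a valid bit, a tag, and the cached data.
cache = [{"valid": False, "tag": None, "data": None} for _ in range(NUM_LINES)]

def split_address(addr):
    """Decompose an address into tag, index, and byte offset."""
    offset = addr % LINE_SIZE
    index = (addr // LINE_SIZE) % NUM_LINES
    tag = addr // (LINE_SIZE * NUM_LINES)
    return tag, index, offset

def lookup(addr):
    """Return True on a cache hit: the indexed entry is valid and tags match."""
    tag, index, _ = split_address(addr)
    entry = cache[index]
    return entry["valid"] and entry["tag"] == tag
```

A hit means the comparison of the address's tag bits against the stored tag succeeded, so the data is served from the cache rather than from main memory.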
Cache memory is relatively small and fast as compared to main memory, but is also more expensive on a per bit basis. When a microprocessor can operate at higher speeds than main memory, then processor cycles can be saved and performance improved by including cache in the memory hierarchy of the microprocessor subsystem. To improve performance and reduce cost, the local memory in a microprocessor typically includes one or more cache devices.
FIG. 2A illustrates an example of a conventional microprocessor 10 whose local memory includes a cache 50 and main memory 80. In the course of operation of microprocessor 10, small portions of data from main memory 80 are moved into cache 50 for fast access by CPU 20 via CPU data bus 22. Subsequent accesses by CPU 20 to the same data are made to the cache 50 rather than main memory 80. A cache controller 30 monitors the data accesses made by CPU 20 and determines whether the desired data is resident in cache 50, in main memory 80, or in other storage devices such as CD-ROMs or mass storage disks. The cache controller 30 also moves data between cache 50 and main memory 80 based upon the data accesses requested by CPU 20 and the cache replacement policy designed into the cache controller. There is overhead time associated with the data management activities of cache controller 30, but, ideally, the cache overhead is outweighed by the advantage gained from the lower access time of the cache devices.
Typically, cache controller 30 is connected to main memory 80 via a main memory data bus 82 and to cache 50 via a separate cache data bus 32. In response to a data access from CPU 20, the cache controller 30 will generally attempt to find the data in cache 50. If the data is not found in cache 50, i.e. a cache miss occurs in cache 50 and is communicated back to cache controller 30, then cache controller 30 will attempt to find the data in main memory 80. CPU 20 can also be configured to perform a cache bypass memory access, wherein the CPU sends a bypass control directive to cache controller 30 which causes the data access to go directly to main memory 80, thereby bypassing cache 50.
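The controller behavior just described (try the cache first, fall back to main memory on a miss, honor a bypass directive) can be summarized in a short sketch. The function and data-structure names below are hypothetical; this is an illustration of the flow, not the patented implementation.

```python
# Illustrative sketch of the access flow: cache first, main memory on a
# miss, with an optional CPU-issued bypass directive. Names are hypothetical.
def controller_read(addr, cache, main_memory, bypass=False):
    if bypass:
        # Cache bypass access: go straight to main memory, leaving cache alone.
        return main_memory[addr]
    if addr in cache:                 # cache hit: serve from the cache
        return cache[addr]
    data = main_memory[addr]          # cache miss: fetch from main memory
    cache[addr] = data                # populate the cache for later accesses
    return data
```

Note that a bypass access neither consults nor populates the cache, which matches the directive's purpose of avoiding cache disturbance.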
Microprocessors are sometimes designed with multiple sub-layers of cache, as is also illustrated in FIG. 2A. Cache 50 is divided into a first level cache 52 and a second level cache 56. The first level cache 52 will typically be a smaller, faster and more expensive per-bit device than the larger second level cache 56. The first level cache will also typically maintain data at a finer level of granularity than second level cache 56. Cache devices are typically arranged in terms of lines, where a line is one or more data words and is the unit of data brought in on a miss. Thus, the first level cache may have a line length of just one or two words, while the second level cache will have a line length on the order of eight or sixteen words. In the multiple level cache structure, cache controller 30 controls the population and replacement of data between the two levels of cache and main memory.
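The line-granularity difference between the two levels can be made concrete: on a miss, each level brings in one whole line, so the second level fetches a larger block than the first. The line lengths (two and sixteen words) follow the text above; the helper itself is an illustrative sketch.

```python
# Hedged sketch of miss-fill granularity: each cache level fetches one whole
# line on a miss. Line lengths of 2 and 16 words follow the text; the helper
# is illustrative.
L1_LINE = 2    # words per first-level cache line
L2_LINE = 16   # words per second-level cache line

def line_addresses(addr, line_words):
    """Word addresses brought in when `addr` misses in a cache with this line size."""
    base = (addr // line_words) * line_words
    return list(range(base, base + line_words))
```

A miss on word 21 thus fills words 20-21 in the first level but words 16-31 in the second level, which is why the coarser level exploits more spatial locality per miss.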
Caches are typically designed to exploit temporal and spatial locality in the program under execution by the microprocessor 10. Temporal locality is the tendency of programs to access a piece of data multiple times within a relatively short interval of time. By moving the piece of data from main memory 80 to cache 50, the microprocessor can take advantage of temporal locality to reduce the time required to access the piece of data on later accesses. Spatial locality is the tendency of programs to make subsequent accesses to data located near data which has recently been accessed, i.e. an access to one portion of a block or line of data will likely be followed by accesses to other portions of the same block or line of data.
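Temporal locality is easy to demonstrate with a toy least-recently-used (LRU) cache: a second access to the same line within a short window hits. The cache size and the access trace below are illustrative assumptions.

```python
from collections import OrderedDict

# Toy LRU cache illustrating temporal locality: a repeated access to the
# same line hits after the first miss. Capacity and trace are illustrative.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, line):
        """Return True on a hit; refresh recency, evicting the LRU line on a miss."""
        if line in self.lines:
            self.lines.move_to_end(line)    # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict least recently used line
        self.lines[line] = True
        return False
```

Running the trace 1, 2, 1, 3, 1, 2 through a 4-line cache, every repeated line hits, which is exactly the behavior an LRU replacement policy is designed to reward.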
However, different types of data can exhibit highly divergent access characteristics. For instance, some types of data, such as image or audio data, are processed by walking through the data once without repetitive access. This highly spatial data also tends to be in the form of blocks or pages of relatively large size. As the spatial data is sequentially accessed by CPU 20, cache controller 30 will stream the spatial data into cache 50, thereby replacing the data already present in the cache. Streaming in a block of this spatial data tends to occupy the entire cache space with data which will not be subsequently accessed, or will not be accessed for a significant period of time. Other data which would have been beneficial to keep in cache is thus flushed out, and the efficiency and efficacy of the cache function are undermined.
As a simple example, consider the case where the size of cache 50 is 32 Kbytes and the block size of some highly spatial data is 16 Kbytes. Access to a first block of spatial data will overwrite ½ (i.e. 16 Kbytes divided by 32 Kbytes) of the contents of cache 50. The first block of spatial data is likely to be retained based upon a cache replacement policy which assumes temporal locality, even though the first block may not be accessed again, or may not be accessed for a significant period of time. Access to a second block of spatial data then causes the remaining ½ of the contents of cache 50 to be overwritten. Thus, by accessing two blocks of spatial data, cache 50 is completely flushed of its previous contents.
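This flush arithmetic can be checked with a small simulation: streaming two 16-Kbyte blocks through a 32-Kbyte cache with oldest-first replacement evicts every previously resident line. The line size, addresses, and replacement loop below are illustrative assumptions.

```python
# Illustrative check of the example above: two 16-Kbyte spatial blocks fully
# flush a 32-Kbyte cache. Line size and addresses are assumptions.
LINE = 32                               # bytes per line
CACHE_BYTES = 32 * 1024
BLOCK_BYTES = 16 * 1024
CAPACITY = CACHE_BYTES // LINE          # 1024 lines

def stream(cache, start, nbytes):
    """Touch every line of a block once, evicting the oldest line when full."""
    for addr in range(start, start + nbytes, LINE):
        line = addr // LINE
        if len(cache) >= CAPACITY:
            cache.pop(next(iter(cache)))  # evict oldest resident line
        cache[line] = True

# Fill the cache with "useful" lines, then stream two spatial blocks.
cache = {line: True for line in range(CAPACITY)}
useful = set(cache)
stream(cache, 0x100000, BLOCK_BYTES)    # first block evicts half the contents
stream(cache, 0x200000, BLOCK_BYTES)    # second block evicts the rest
```

After the two streams, none of the original "useful" lines remain resident, matching the ½ + ½ arithmetic in the text.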
The cache flushing problem is quite pronounced for specialized data types having very large block sizes. For instance, image data commonly has block sizes of 512 Kbytes to 100 Mbytes. Each block not only flushes other data from the cache, but also flushes its own lines of data when the block size is larger than the cache size. Another example of the cache flushing problem arises with regard to the tables that are used in processing an image stream. The tables will typically be replaced by the data of the image stream unless separa
