System and method for a software controlled cache

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

active

06668307

ABSTRACT:

FIELD
The present invention relates generally to memory systems, and more particularly to cache memory systems and a method of operating the same that provides efficient handling of data.
BACKGROUND
Modern computer systems generally include a central processing unit (CPU) or processor for processing data and a memory system for storing operating instructions and data. Typically, the speed at which the processor can decode and execute instructions exceeds the speed at which instructions and data can be transferred between the memory system and the processor. Thus, the processor is often forced to wait for the memory system to respond. This delay is commonly known as memory latency. To reduce, if not eliminate, this delay, many computer systems now include a faster memory, known as a cache memory, between the processor and main-memory.
A cache memory reduces the memory latency period by temporarily storing a small subset of data from a lower-level memory such as a main-memory or mass-storage-device. When the processor needs information for an application, it first checks the cache. If the information is found in the cache (known as a cache-hit), the information will be retrieved from the cache and execution of the application will resume. If the information is not found in the cache (known as a cache-miss) then the processor will proceed to access the lower-level memories. Information accessed in the lower-level memories is simultaneously stored or written to the cache so that should the information be required again in the near future it can be obtained directly from the cache, thereby reducing or eliminating any memory latency on subsequent read operations.
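The read path described above can be sketched as follows. This is a minimal illustration, not the patented method: the class name `SimpleCache`, the dictionary standing in for lower-level memory, and the unlimited cache capacity are all assumptions made for brevity.

```python
class SimpleCache:
    """Toy read cache in front of a slower backing store (hypothetical names)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for lower-level memory
        self.lines = {}                     # address -> cached value (no capacity limit, by assumption)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:           # cache-hit: serve directly from the cache
            self.hits += 1
            return self.lines[address]
        self.misses += 1                    # cache-miss: go to lower-level memory
        value = self.backing_store[address]
        self.lines[address] = value         # store in the cache for later reads
        return value


memory = {0x100: "a", 0x104: "b"}
cache = SimpleCache(memory)
first = cache.read(0x100)    # miss: fetched from memory and cached
second = cache.read(0x100)   # hit: served from the cache
```

The second read of the same address never touches `memory`, which is the latency saving the paragraph describes.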
Use of a cache can also reduce the memory latency period during write operations by writing to the cache. This reduces memory latency in two ways. First, it enables the processor to write at the much greater speed of the cache, and second, storing or loading the data into the cache enables it to be obtained directly from the cache should the processor need the data again in the near future.
Typically, the cache is divided logically into two main components or functional units: a data-store, where the cached information is actually stored, and a tag-field, a small area of memory used by the cache to keep track of the location in memory where the associated data can be found. The data-store is structured or organized as a number of cache-lines, each having a tag-field associated therewith, and each capable of storing multiple blocks of data. Typically, in modern computers each cache-line stores 32 or 64 blocks or bytes of data. The tag-field for each cache-line includes an index that uniquely identifies each cache-line in the cache, and a tag that is used in combination with the index to identify the address in lower-level memory from which data stored in the cache-line has been read or to which it has been written. The tag-field for each cache-line also includes one or more bits, commonly known as a validity-bit, to indicate whether the cache-line contains valid data. In addition, the tag-field may contain other bits, for example, for indicating whether data at the location is dirty, that is, has been modified but not written back to lower-level memory.
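One way to picture a cache-line and its tag-field is the sketch below. The field names (`tag`, `valid`, `dirty`, `data`), the helper methods, and the 32-byte line size are illustrative assumptions; the index is left implicit as the position of the line within the cache.

```python
from dataclasses import dataclass, field

LINE_SIZE = 32  # bytes per cache-line; one of the sizes mentioned in the text


@dataclass
class CacheLine:
    tag: int = 0          # identifies which memory address the line holds
    valid: bool = False   # validity-bit: does the line contain valid data?
    dirty: bool = False   # modified but not yet written back to lower-level memory
    data: bytearray = field(default_factory=lambda: bytearray(LINE_SIZE))

    def matches(self, tag):
        # A line only counts as a hit if it is valid AND the tags agree.
        return self.valid and self.tag == tag

    def fill(self, tag, block):
        # Load a whole block from lower-level memory; the line starts out clean.
        if len(block) != LINE_SIZE:
            raise ValueError("block must be exactly one cache-line long")
        self.tag, self.valid, self.dirty = tag, True, False
        self.data[:] = block

    def write_byte(self, offset, value):
        # A write through the cache marks the line dirty.
        self.data[offset] = value
        self.dirty = True


line = CacheLine()
fresh_hit = line.matches(0)   # False: tag is 0 but the line is not yet valid
line.fill(3, bytes(LINE_SIZE))
line.write_byte(1, 0xFF)
```

The `fresh_hit` check shows why the validity-bit is needed: an empty line can have a tag value that happens to equal the requested tag.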
To speed up memory access operations, caches rely on principles of temporal and spatial-locality. These principles of locality are based on the assumption that, in general, a computer program accesses only a relatively small portion of the information available in computer memory in a given period of time. In particular, temporal-locality holds that if some information is accessed once, it is likely to be accessed again soon, and spatial-locality holds that if one memory location is accessed then other nearby memory locations are also likely to be accessed. Thus, in order to exploit temporal-locality, caches temporarily store information from a lower-level memory the first time it is accessed so that if it is accessed again soon it need not be retrieved from the lower-level memory. To exploit spatial-locality, caches transfer several blocks of data from contiguous addresses in lower-level memory, besides the requested block of data, each time data is written to the cache from lower-level memory.
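As a rough illustration of how spatial-locality is exploited, the sketch below computes which contiguous addresses would accompany a requested address into the cache. It assumes 32-byte cache-lines aligned on 32-byte boundaries; the function names are hypothetical, and the text itself does not specify the alignment.

```python
LINE_SIZE = 32  # assumed cache-line size in bytes (must be a power of two)


def line_base(address, line_size=LINE_SIZE):
    # Clear the low-order offset bits to find the start of the containing line.
    return address & ~(line_size - 1)


def addresses_fetched(address, line_size=LINE_SIZE):
    # All contiguous addresses brought into the cache along with `address`.
    base = line_base(address, line_size)
    return range(base, base + line_size)


fetched = addresses_fetched(0x1234)   # covers 0x1220 through 0x123F
```

A later access to any address in `fetched`, such as `0x1235`, would then be a cache-hit even though it was never requested explicitly.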
The most important characteristic of a cache is its hit rate, that is, the fraction of all memory accesses that are satisfied from the cache over a given period of time. This in turn depends in large part on how the cache is mapped to addresses in the lower-level memory. The choice of mapping technique is so critical to the design of the cache that the cache is often named after this choice. There are generally three different ways to map the cache to the addresses in memory: direct mapping, fully-associative mapping and set-associative mapping.
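The hit rate defined above is simply hits divided by total accesses. A small sketch, with hypothetical function names and an access trace represented as a list of booleans (True for a hit):

```python
def hit_rate(hits, accesses):
    """Fraction of memory accesses satisfied from the cache."""
    if accesses == 0:
        raise ValueError("hit rate is undefined for zero accesses")
    return hits / accesses


def hit_rate_from_trace(outcomes):
    # `outcomes` holds one boolean per memory access: True = hit, False = miss.
    outcomes = list(outcomes)
    return hit_rate(sum(outcomes), len(outcomes))


rate = hit_rate_from_trace([False, True, True, True])   # 3 hits out of 4 accesses
```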
Direct-mapping is the simplest way to map a cache to addresses in main-memory. In the direct-mapping method the number of cache-lines is determined, the addresses in memory are divided into the same number of groups of addresses, and the addresses in each group are associated with one cache-line. For example, for a cache having $2^{n}$ cache-lines, the addresses in memory are divided into $2^{n}$ groups and each address in a group is mapped to a single cache-line. The lowest n address bits of an address (above any bits that select a block within the cache-line) correspond to the index of the cache-line in which data from the address can be stored. The remaining top address bits are stored as a tag that identifies from which of the several possible addresses in the group the data in the cache-line originated. For example, to map a 64 megabyte (MB) main-memory to a 512 kilobyte (KB) direct-mapped cache having 16,384 cache-lines, each cache-line is shared by a group of 4,096 addresses in main-memory. To address 64-MB of memory requires 26 address bits since 64-MB is $2^{26}$ bytes. The lowest five of these address bits, $A_{0}$ to $A_{4}$, are ignored in the mapping process, although the processor will use them later to determine which of the 32 blocks of data in the cache-line to access. The next 14 address bits, $A_{5}$ to $A_{18}$, provide the index of the cache-line to which the address is mapped. Because any cache-line can hold data from any one of 4,096 possible addresses in main-memory, the next seven highest address bits, $A_{19}$ to $A_{25}$, are used as a tag to identify to the processor which of the addresses the cache-line holds data from. This scheme, while simple, can result in a cache-conflict or thrashing in which a sequence of accesses to memory repeatedly overwrites the same cache entry, resulting in a cache-miss on every access. This can happen, for example, if two blocks of data, which are mapped to the same set of cache locations, are needed simultaneously.
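The 64-MB / 512-KB example can be worked through in code. The sketch below splits a 26-bit address into the tag, index and offset fields described above, then replays an access pattern that thrashes. The names `split_address` and `DirectMappedCache` are hypothetical, and only tags are tracked (no data) to keep the example short.

```python
OFFSET_BITS = 5    # A0..A4: block within a 32-byte cache-line
INDEX_BITS = 14    # A5..A18: selects one of 16,384 cache-lines
TAG_BITS = 7       # A19..A25: which address in the group the line holds


def split_address(address):
    """Return (tag, index, offset) for a 26-bit main-memory address."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    """Tracks only the tag stored in each cache-line (data omitted)."""

    def __init__(self):
        self.tags = [None] * (1 << INDEX_BITS)   # None = line not valid
        self.misses = 0

    def access(self, address):
        tag, index, _ = split_address(address)
        if self.tags[index] == tag:
            return True              # cache-hit
        self.tags[index] = tag       # cache-miss: overwrite the single candidate line
        self.misses += 1
        return False


a = 0x40
b = a + 512 * 1024    # exactly one cache size away: same index, different tag
cache = DirectMappedCache()
results = [cache.access(addr) for addr in (a, b, a, b)]
# every access misses: [False, False, False, False]
```

Because `a` and `b` share index 2 but carry different tags, each access evicts the other's data, which is the cache-conflict the paragraph describes.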
A fully-associative mapped cache avoids the cache-conflict of the direct-mapped cache by allowing blocks of data from any address in main-memory to be stored anywhere in the cache. However, one problem with fully-associative caches is that the whole main-memory address must be used as a tag, thereby increasing the size of the tag-field and reducing cache capacity for storing data. Also, because the requested address must be compared simultaneously (associatively) with all tags in the cache, the access time for the cache is increased.
A set-associative cache is a compromise between the direct-mapped and fully-associative designs. In this design, the cache is broken into sets, each having a number (2, 4, 8, etc.) of cache-lines, and each address in main-memory is assigned to a set and can be stored in any one of the cache-lines within the set. Typically, such a cache is referred to as an n-way set-associative cache, where n is the number of cache-lines in each set.
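A minimal sketch of an n-way set-associative lookup follows. The text does not say which cache-line within a set is replaced when the set is full, so least-recently-used (LRU) replacement is assumed here purely for illustration; the class name and parameters are likewise hypothetical, and only tags are tracked.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """n-way set-associative cache tracking tags only (LRU is an assumption)."""

    def __init__(self, num_sets, ways, line_size=32):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One ordered mapping per set: keys are tags, oldest use first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        block = address // self.line_size    # drop the offset within the line
        set_index = block % self.num_sets    # each address maps to exactly one set
        tag = block // self.num_sets
        lines = self.sets[set_index]
        if tag in lines:
            lines.move_to_end(tag)           # hit: mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)        # set full: evict least recently used
        lines[tag] = True                    # may occupy any free line in the set
        return False


cache = SetAssociativeCache(num_sets=4096, ways=4)
a = 0x40
b = a + 4096 * 32    # maps to the same set as `a`, with a different tag
results = [cache.access(addr) for addr in (a, b, a, b)]
# [False, False, True, True]
```

Two addresses that compete for the same set can now coexist, so the alternating pattern misses only on first use rather than on every access.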
Memory addresses are mapped to the set-associative cache in a manner similar to the direct-mapped cache. For example, to map a 64-MB main-memory having 26 address bits to a 512-KB 4-way set-associative cache, the cache is divided into 4,096 sets of 4 cache-lines each, and 16,384 addresses in main-memory are associated with each set. Address bits $A_{5}$ to $A_{16}$ of a memory address represent the index of the set to which the address maps. The me
