Method and apparatus for reducing power consumption by...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Type: Reexamination Certificate
Filed: 2000-12-20
Issued: 2003-05-06
Examiner: Bragdon, Reginald G. (Department: 2188)
Other classes: C711S137000, C711S144000, C711S128000
Status: active
Patent number: 06560679
ABSTRACT:
This application relies for priority upon Korean Patent Application No. 2000-30879, filed on Jun. 5, 2000, the contents of which are herein incorporated by reference in their entirety.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to digital data processing systems such as computer systems. More particularly, the invention relates to cache memories in digital data processing systems and methods of operating the cache memories.
2. Description of the Related Art
A computer system generally comprises a central processing unit (CPU), a system bus, a memory subsystem, and other peripherals. The CPU executes instructions stored in the memory subsystem, and the bus serves as a communication pathway between the CPU and other devices in the computer system. The memory subsystem typically includes a slow and inexpensive primary, or “main”, memory, such as Dynamic Random Access Memory (DRAM), and fast and expensive cache memories, such as Static Random Access Memories (SRAMs).
Cache subsystems of a computer system are the result of the discrepancy in speed and price between SRAMs and DRAMs. This discrepancy led to an architectural split of memory into a hierarchy in which a small, relatively fast SRAM cache is inserted between the CPU and a relatively slow, larger-capacity, but less expensive, DRAM main memory.
A cache memory holds instructions and data which have a high probability of being needed for imminent processing by the CPU. By retaining the most frequently accessed instructions and data in the high-speed cache memory, the average memory-access time approaches the access time of the cache. The use of caches can therefore significantly improve the performance of computer systems.
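A standard way to quantify this claim (textbook material, not language from the patent) is the average memory-access time:

    \mathrm{AMAT} = t_{\text{hit}} + m \cdot t_{\text{penalty}}

where $t_{\text{hit}}$ is the cache access time, $m$ is the miss ratio, and $t_{\text{penalty}}$ is the extra time required to reach main memory; as $m$ approaches zero, AMAT approaches the access time of the cache.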
Active program instructions and data may be kept in a cache by exploiting a phenomenon known as "locality of reference". This phenomenon reflects the fact that most computer program execution proceeds sequentially through multiple loops, with the CPU repeatedly referencing a set of instructions in a particular localized area of memory. Loops and subroutines thus tend to localize the references to memory for fetching instructions. Similarly, memory references to data also tend to be localized, because table lookup routines and other iterative routines repeatedly refer to a relatively small portion of memory.
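To make the idea concrete, here is a small C fragment (illustrative only, not taken from the patent) that exhibits both kinds of locality:

    /* Illustrative only: a loop showing the two kinds of locality
       that caches exploit. */
    int sum_table(const int *table, int n) {
        int sum = 0;              /* sum and i are re-referenced on every
                                     iteration: temporal locality        */
        for (int i = 0; i < n; i++)
            sum += table[i];      /* consecutive addresses are touched in
                                     order: spatial locality             */
        return sum;
    }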
In a computer system, the CPU examines the cache before main memory when a memory access instruction is processed. If the desired word (data or program instruction) is found in the cache, the CPU reads it from the cache. If the word is not found there, main memory is accessed to read the word, and a block of words containing it is transferred from main memory to the cache according to a replacement algorithm. If the cache holds the word wanted by the CPU, the access is called a "hit"; if not, it is called a "miss."
A line of a simple cache memory usually consists of an address and one or more data words corresponding to that address. A line is also the minimum unit of information that can be moved between a main memory and a cache memory.
Data from a location in main memory is stored on one line in the cache, and cache lines are identified by a portion of the main memory address. Also, because there are fewer cache lines than main memory blocks, an algorithm is needed to determine which main memory blocks are read into which cache lines.
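As an illustration (the field widths are invented for the example, not specified by the patent), a 32-bit byte address for an 8 KB cache with 32-byte lines and 256 lines splits into tag, line index, and word offset fields as follows:

    #include <stdint.h>

    #define OFFSET_BITS 5u    /* log2(32-byte line)  */
    #define INDEX_BITS  8u    /* log2(256 lines)     */

    static uint32_t word_offset(uint32_t addr) {
        return addr & ((1u << OFFSET_BITS) - 1);                 /* low 5 bits   */
    }
    static uint32_t line_index(uint32_t addr) {
        return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1); /* next 8 bits  */
    }
    static uint32_t tag_field(uint32_t addr) {
        return addr >> (OFFSET_BITS + INDEX_BITS);               /* high 19 bits */
    }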
Various techniques are known for mapping blocks of a main memory into a cache memory. Typical forms of mapping include direct mapping, fully associative mapping, and set associative mapping.
The direct mapping technique maps each block of main memory into only one possible cache line. This technique is simple and inexpensive to implement, but its primary disadvantage is that every block has a fixed location. Thus, if a program happens to reference words repeatedly from two different blocks that map into the same line, the blocks will be continuously swapped in the cache and the hit ratio will be low.
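For example (a sketch under the invented 8 KB geometry above, not the patent's circuitry), two addresses that differ by exactly the cache size share a line index and therefore evict each other on alternating references:

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        /* Index bits repeat every 8192 bytes, so 0x0000 and 0x2000
           both map to line 0.                                      */
        uint32_t a = 0x0000u, b = 0x2000u;
        assert(((a >> 5) & 0xFFu) == ((b >> 5) & 0xFFu));
        /* A loop alternating between blocks at a and b would replace
           line 0 on every reference: the continuous swapping noted
           above.                                                   */
        return 0;
    }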
Fully associative mapping overcomes the drawback of direct mapping by permitting each main memory block to be loaded into any line of the cache. This technique offers flexibility as to which block to replace when a new block is read into the cache. Its principal disadvantage is the complex circuitry required to examine the tags of all cache lines in parallel.
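The cost can be seen in a software analogy (illustrative only; hardware performs these comparisons in parallel with one comparator per line, which is precisely the complex circuitry at issue):

    #include <stdbool.h>
    #include <stdint.h>

    #define NUM_LINES 256

    struct line { bool valid; uint32_t tag; };

    /* Fully associative lookup: every line's tag must be considered. */
    static int find_line(const struct line cache[NUM_LINES], uint32_t tag) {
        for (int i = 0; i < NUM_LINES; i++)
            if (cache[i].valid && cache[i].tag == tag)
                return i;    /* hit: the block may sit in any line */
        return -1;           /* miss */
    }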
Set associative mapping (usually referred to as "N-way set associative mapping") is a compromise that exhibits the strengths of both the direct and fully associative approaches. In this technique, the cache is divided into a plurality of sets, each of which consists of several lines. A block of main memory may map into any of the lines of a set, permitting the storage of two or more data words in the cache at the same set address (i.e., in the lines of one set). Cache control logic interprets a main memory address as three fields: a tag, a set, and a word. With set associative mapping, the tag field of a main memory address is relatively small and is compared only with the tags within a single set, unlike fully associative mapping, wherein the tag field is quite large and must be compared with the tag of every line in the cache.
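A minimal sketch of an N-way lookup (the 4-way geometry is invented for the example) shows how only the tags of one set are compared per access:

    #include <stdbool.h>
    #include <stdint.h>

    #define WAYS        4     /* 4-way set associative */
    #define SETS        64
    #define OFFSET_BITS 5u    /* 32-byte lines         */
    #define SET_BITS    6u    /* log2(SETS)            */

    struct way { bool valid; uint32_t tag; };

    static int find_way(const struct way sets[SETS][WAYS], uint32_t addr) {
        uint32_t set = (addr >> OFFSET_BITS) & ((1u << SET_BITS) - 1);
        uint32_t tag = addr >> (OFFSET_BITS + SET_BITS);
        for (int w = 0; w < WAYS; w++)     /* WAYS comparisons, not   */
            if (sets[set][w].valid &&      /* one per line as in the  */
                sets[set][w].tag == tag)   /* fully associative case  */
                return w;                  /* hit in way w of the set */
        return -1;                         /* miss */
    }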
The performance of cache memories is frequently measured in terms of a "hit ratio." When the CPU references the cache and finds the desired instruction or data word there, it produces a hit; if the word is not found in the cache, it must be fetched from main memory, and the access counts as a miss. The hit ratio is the number of hits divided by the total number of CPU references to memory (i.e., hits plus misses).
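For example (the figures are invented purely for arithmetic), a program run producing 950 hits and 50 misses gives

    \text{hit ratio} = \frac{950}{950 + 50} = 0.95,

i.e., 95% of memory references are served at cache speed.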
To maximize the hit ratio, many computer system organizations and architectures allow system control over the use of caches. For example, a cache may be used to store instructions only, data only, or both instructions and data. The design and operating principles of cache memories are described in detail in several handbooks, for example "Advanced Microprocessors" by Daniel Tabak, McGraw-Hill Book Co., Second Edition (1995), Chap. 4, pp. 43-65; "Computer Organization and Architecture" by William Stallings, Prentice-Hall, Inc., Fifth Edition (1996), Chap. 4, pp. 117-151; and "High Performance Memories" by Betty Prince, John Wiley & Sons, Inc. (1996), Chap. 4, pp. 65-94, which are hereby incorporated herein by reference.
To determine whether a cache hit or a cache miss occurs, that is, whether a desired word is found in the cache, the tag stored in the cache must always be accessed. Given the current trend toward larger caches to meet high performance requirements (the hit ratio of a simple cache is known to rise as the cache size grows), the number of repetitive tag accesses in memory reference cycles increases. This results in greater power consumption in caches and thus hampers their use in low-power applications.
SUMMARY OF THE INVENTION
Accordingly, an object of the present invention is to provide methods and apparatus for reducing the power consumption and improving the performance of cache integrated circuit memory devices.
To attain this object, the present invention recognizes that a cache hit always occurs when the current access is to instructions and/or data on the same cache line that was accessed and hit in the most recent access, and that if a miss occurred during the preceding access, a hit or miss of the current access to the same line depends on whether a "cache line fill" (in which a complete cache line is read from main memory into the cache) has been performed for that line.
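A minimal software sketch of that observation (all names are invented; the patent describes a hardware controller, not this code) might read:

    #include <stdbool.h>
    #include <stdint.h>

    /* If the current access falls on the same cache line as the
       previous one, and that line hit (or has since been filled by a
       cache line fill), the access is known in advance to hit, so the
       tag RAM read -- and its power cost -- can be skipped.          */
    static uint32_t prev_line  = 0xFFFFFFFFu; /* line of last access  */
    static bool     prev_valid = false;       /* hit, or since filled */

    static bool guaranteed_hit(uint32_t addr) {
        uint32_t line = addr >> 5;            /* 32-byte lines, as above */
        bool skip_tag_read = (line == prev_line) && prev_valid;
        prev_line = line;
        /* On a different line the normal tag compare runs, and a full
           controller would update prev_valid from its hit/fill result. */
        return skip_tag_read;
    }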
According to an aspect of the present invention, a digital data processing system is provided which includes a digital data processor, a cache memory having a tag RAM and a data RAM, and a controller for controlling accesses to the cache memory. The controller stores state information on access type, operation mode and cache
Inventors: Choi Hoon; Yim Myung-Kyoon
Examiner: Bragdon Reginald G.
Assistant Examiner: Inoa Midys
Attorney: Mills & Onello LLP
Assignee: Samsung Electronics Co., Ltd.