Unified cache port consolidation

Electrical computers and digital data processing systems: input/output – Access arbitrating – Decentralized arbitrating

Reexamination Certificate


Details

Classification: C710S119000
Type: Reexamination Certificate
Status: active
Patent number: 06704820

ABSTRACT:

TECHNICAL FIELD
The invention relates to computer memory systems. More particularly, the invention relates to accessing cache memories.
BACKGROUND ART
In a computer system, the interface between a processor and memory is critically important to the performance of the system. Because fast memory is very expensive, memory in the amount needed to support a processor is generally much slower than the processor. In order to bridge the gap between fast processor cycle times and slow memory access times, cache memory is utilized. A cache is a small amount of very fast memory that is used to store a copy of frequently accessed data and instructions from main memory. A processor can operate out of this very fast memory and thereby reduce the number of wait states that must be interposed during memory accesses. When the processor requests data from memory and the data resides in the cache, then a cache read hit takes place, and the data from the memory access can be returned to the processor from the cache without incurring the latency penalty of accessing main memory. If the data is not in the cache, then a cache read miss takes place, and the memory request is forwarded to the main memory, as would normally be done if the cache did not exist. On a cache miss, the data that is retrieved from the main memory is provided to the processor and is also written into the cache due to the statistical likelihood that this data will be requested again by the processor in the near future.
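
As a rough illustration of the hit/miss flow described above, the following sketch models a read from a small direct-mapped cache. The names here (cache_line_t, NUM_LINES, main_memory) are illustrative stand-ins, not structures from the patent.

```c
/* Minimal sketch of a cache read in a direct-mapped cache.
 * All names are illustrative, not from the patent. */
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define NUM_LINES  256          /* lines in the cache   */
#define LINE_BYTES 64           /* bytes per cache line */

typedef struct {
    bool     valid;
    uint32_t tag;               /* high-order address bits */
    uint8_t  data[LINE_BYTES];
} cache_line_t;

static cache_line_t cache[NUM_LINES];
extern uint8_t main_memory[];   /* stand-in for slow main memory */

/* Read one byte; fills the line from main memory on a miss. */
uint8_t cache_read(uint32_t addr)
{
    uint32_t offset = addr % LINE_BYTES;
    uint32_t index  = (addr / LINE_BYTES) % NUM_LINES;
    uint32_t tag    = addr / (LINE_BYTES * NUM_LINES);
    cache_line_t *line = &cache[index];

    if (line->valid && line->tag == tag)        /* cache read hit */
        return line->data[offset];

    /* Cache read miss: fetch the whole line and keep a copy, on the
     * statistical likelihood that nearby data is requested again soon. */
    memcpy(line->data, &main_memory[addr - offset], LINE_BYTES);
    line->valid = true;
    line->tag   = tag;
    return line->data[offset];
}
```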
The individual data elements stored in a cache memory are referred to as lines. Each line of a cache is meant to correspond to one addressable unit of data in the main memory. A cache line thus comprises data and is associated with a main memory address in some way. Schemes for associating a main memory address with a line of cache data include direct mapping, full association and set association, all of which are well known in the art.
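
A minimal sketch of the set-associative case follows, using illustrative parameters (NUM_SETS, WAYS) chosen for the example; direct mapping and full association fall out as the one-way and one-set special cases.

```c
#include <stdint.h>
#include <stdbool.h>

#define LINE_BYTES 64           /* bytes per cache line          */
#define NUM_SETS   64           /* sets in the cache             */
#define WAYS        4           /* lines per set (associativity) */

typedef struct { bool valid; uint32_t tag; } way_t;
static way_t sets[NUM_SETS][WAYS];

/* Returns the way that hits, or -1 on a miss. Direct mapping is the
 * WAYS == 1 special case; full association is the NUM_SETS == 1 case,
 * where the tag must be compared against every line in the cache. */
int lookup_way(uint32_t addr)
{
    uint32_t index = (addr / LINE_BYTES) % NUM_SETS;
    uint32_t tag   = addr / (LINE_BYTES * NUM_SETS);

    for (int w = 0; w < WAYS; w++)
        if (sets[index][w].valid && sets[index][w].tag == tag)
            return w;
    return -1;
}
```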
The presence of a cache should be transparent to the overall system, and various protocols are implemented to achieve such transparency, including write-through and write-back protocols. In a write-through action, data to be stored is written to a cache line and to the main memory at the same time. In a write-back action, data to be stored is written to the cache and only written to the main memory later when the line in the cache needs to be displaced for a more recent line of data or when another processor requires the cached line. Because lines may be written to a cache exclusively in a write-back protocol, precautions must be taken to manage the status of data in a write-back cache so as to preserve coherency between the cache and the main memory. The preservation of cache coherency is especially challenging when there are several bus masters that can access memory independently. In such a case, well known techniques for maintaining cache coherency include snooping.
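
The difference between the two write policies can be sketched as follows; the dirty flag and flush_line helper are illustrative names for the bookkeeping a write-back cache needs, not terms from the patent.

```c
#include <stdint.h>
#include <stdbool.h>

#define LINE_BYTES 64
typedef struct {
    bool    dirty;              /* set by write-back, cleared on flush */
    uint8_t data[LINE_BYTES];
} wb_line_t;

extern uint8_t main_memory[];   /* stand-in for main memory */

/* Write-through: cache and main memory are updated together. */
void write_through(wb_line_t *line, uint32_t addr, uint8_t v)
{
    line->data[addr % LINE_BYTES] = v;
    main_memory[addr] = v;
}

/* Write-back: only the cache is updated; the dirty flag defers the
 * memory update until the line is displaced or another master needs it. */
void write_back(wb_line_t *line, uint32_t addr, uint8_t v)
{
    line->data[addr % LINE_BYTES] = v;
    line->dirty = true;
}

/* Flush on displacement: propagate a dirty line to main memory. */
void flush_line(wb_line_t *line, uint32_t line_base_addr)
{
    if (!line->dirty) return;
    for (uint32_t i = 0; i < LINE_BYTES; i++)
        main_memory[line_base_addr + i] = line->data[i];
    line->dirty = false;
}
```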
A cache may be designed independently of the microprocessor, in which case the cache is placed on the local bus of the microprocessor and interfaced between the processor and the system bus during the design of the computer system. However, as the density of transistors on a processor chip has increased, processors may be designed with one or more internal caches in order to further decrease memory access times. An internal cache is generally small, an exemplary size being 256 KB (262,144 bytes). In computer systems that utilize processors with one or more internal caches, an external cache is often added to the system to further improve memory access time. The external cache is generally much larger than the internal cache(s), and, when used in conjunction with the internal cache(s), provides a greater overall hit rate than the internal cache(s) would provide alone.
In systems that incorporate multiple levels of caches, when the processor requests data from memory, the internal or first level cache is first checked to see if a copy of the data resides there. If so, then a first level cache hit occurs, and the first level cache provides the appropriate data to the processor. If a first level cache miss occurs, the second level cache is checked next. If a second level cache hit occurs, then the data is provided from the second level cache to the processor. If a second level cache miss occurs, then the data is retrieved from main memory (or from higher levels of caches, if present). Write operations are similar, and the policies discussed above can be mixed and matched across cache levels.
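
A sketch of this two-level read path appears below, with hypothetical helper functions (l1_lookup, l2_fill, and so on) standing in for the per-level lookups and line fills.

```c
#include <stdint.h>
#include <stdbool.h>

extern uint8_t main_memory[];   /* stand-in for main memory */

/* Hypothetical helpers, not from the patent: each lookup returns true
 * and fills *v on a hit; the fill routines copy a line into that level. */
bool l1_lookup(uint32_t addr, uint8_t *v);
bool l2_lookup(uint32_t addr, uint8_t *v);
void l1_fill(uint32_t addr);
void l2_fill(uint32_t addr);

uint8_t read_two_level(uint32_t addr)
{
    uint8_t v;

    if (l1_lookup(addr, &v))               /* first level cache hit  */
        return v;

    if (l2_lookup(addr, &v)) {             /* second level cache hit */
        l1_fill(addr);                     /* promote into L1        */
        return v;
    }

    v = main_memory[addr];                 /* both levels missed     */
    l2_fill(addr);
    l1_fill(addr);
    return v;
}
```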
Caches are also categorized on the basis of the type of information stored in their contents. For example, a data cache stores data (i.e., operands, integers, floating point values, packed representations and other formats of raw data). On the other hand, an instruction cache stores instructions (e.g., op codes or execution syllables with or without immediate data embedded in the instruction). If a single cache is utilized to store information of diverse types (e.g., data and instructions), then it is called a unified cache. A unified cache offers greater flexibility than one or more non-unified caches, in that the unified cache can flexibly store different types of information and therefore achieve a more efficient utilization of the valuable cache memory space.
A unified cache 100 is illustrated in FIG. 1. The unified cache 100 comprises a memory array 105, each element of which can store a unit of data or an instruction. The cache 100 also comprises a plurality of address ports. Each address port accepts an address bus. The address buses are shown alternately as data address buses DATA and instruction address buses INST. The width of each address bus is M bits. A given address uniquely identifies one cache line, or a subset of that one line, in the memory array 105. A conflict resolution and address decoder bank 110 processes the addresses on the address buses DATA and INST. The conflict resolution processing is described in detail below. The address decoding processing for each address bus involves decoding the address and selectively asserting word lines that access the addressed word in the memory array 105, such that the addressed word is connected to an I/O module 115. The I/O module 115 comprises drivers and sense amplifiers to write and read the addressed memory word, respectively, as determined by cache control logic. One or more I/O buses, coupled to the I/O module 115, accept or provide the addressed word(s).
Because a unified cache has a larger number of connections than a non-unified cache, a unified cache faces a substantially greater burden for resolving address conflicts. Address conflicts arise when two or more connections access the same memory cell at the same time. Resolving address conflicts in a rational manner is important to avoid inconsistencies in the cache contents. For example, in FIG. 1, if the top address bus DATA accesses a particular cache line for writing, and the bottom instruction address bus INST attempts to read the same cache line, then it is important that the two operations proceed in the proper order. Otherwise, the wrong information would be read. To detect address conflicts, the conflict resolution and address decoder bank 110 contains logic that compares the address on each address bus to the address on every other address bus. Each comparison circuit requires M two-input exclusive OR (XOR) gates, if the address buses are M bits wide. The number of comparison circuits increases as the square of the number of address buses to the cache 100. Specifically, N address buses to the cache 100 require "N choose 2" or (N² − N)/2 comparison circuits. Thus, a small increase in the number of address buses results in a significant increase in the necessary comparison circuits. For example, if there are four address buses to the cache 100, then six comparison circuits are necessary; whereas, if there are eight address buses to the cache 100, then 28 comparison circuits are necessary.
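
The quadratic growth can be checked with a short computation; the function and parameter names below are illustrative, and the example address width is an assumption chosen for the printout.

```c
/* Pairwise-comparator cost: N address buses require N-choose-2
 * comparison circuits, each built from M two-input XOR gates
 * for M-bit addresses. */
#include <stdio.h>

unsigned comparators(unsigned n_buses)
{
    return n_buses * (n_buses - 1) / 2;     /* (N^2 - N) / 2 */
}

int main(void)
{
    unsigned m_bits = 32;                   /* example address width */
    for (unsigned n = 2; n <= 8; n++)
        printf("%u buses -> %2u circuits (%u XOR gates)\n",
               n, comparators(n), comparators(n) * m_bits);
    return 0;   /* prints 6 circuits for 4 buses, 28 for 8, as above */
}
```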
SUMMARY OF THE INVENTION
In one respect, the invention is an apparatus for using a plu
