Electrical computers and digital processing systems: memory – Address formation – Address mapping
Reexamination Certificate
1999-06-03
2002-08-20
Kim, Matthew (Department: 2188)
Electrical computers and digital processing systems: memory
Address formation
Address mapping
C711S202000, C711S216000, C711S221000, C712S217000
Reexamination Certificate
active
06438672
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to an addressable memory interface. More particularly, it relates to a method and apparatus for adaptively overlaying a group of memory addresses to provide an efficient and flexible processor/memory interface.
2. Background of Related Art
Processors nowadays are more powerful and faster than ever, so much so that even memory access time, typically in the tens of nanoseconds, is seen as an impediment to a processor running at its full speed. The total CPU time of a processor is the sum of the clock cycles spent executing instructions and the clock cycles spent waiting on memory accesses. While modern day processors have improved greatly in instruction execution time, access times of reasonably priced memory devices have not similarly improved.
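As a rough sketch of this decomposition (the symbols below are illustrative and are not defined in the patent itself), the memory stall term is what grows when a fast processor must wait on slow memory:

\[
T_{\text{CPU}} = \left( C_{\text{exec}} + C_{\text{mem stall}} \right) \times T_{\text{clock}}
\]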
Thus, rather than relying on improvements in access speed of memory devices themselves, improved memory accessing methods and processor/memory interface architectures are employed in modern computer systems to minimize the above described bottleneck effect of memory access time.
For example, some processor/memory architectures take advantage of a memory-interleaving scheme in which consecutive data segments are stored across a number of banks of memory, allowing parallel access to multiple memory locations and to a large segment of data. Another particularly common method of improving memory access time is memory caching. Caching takes advantage of the inverse relationship between the capacity and the speed of a memory device: a memory with a larger storage capacity is generally slower than a small memory. Slower memories are also less costly, and thus are more suitable for use as a portion of mass storage than are more expensive, smaller and faster memories.
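As a minimal illustration of the interleaving idea (the bank count and address arithmetic below are assumptions made for this example, not details taken from the patent), consecutive word addresses can be spread round-robin across banks so that a run of sequential accesses touches several banks in parallel:

#include <stdint.h>

#define NUM_BANKS 4   /* assumed number of memory banks for this sketch */

/* Map a linear word address onto (bank, offset within bank) so that
 * consecutive words land in consecutive banks and can be accessed in
 * parallel. */
static inline uint32_t bank_of(uint32_t word_addr)        { return word_addr % NUM_BANKS; }
static inline uint32_t offset_in_bank(uint32_t word_addr) { return word_addr / NUM_BANKS; }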
In a caching system, memory is arranged in a hierarchical order of different speeds, sizes and costs. For example, as shown in FIG. 6, a smaller and faster memory, usually referred to as a cache memory 603, is placed between a processor 604 and a larger, slower main memory 601. Typically, a hierarchical division is made even within the cache memory, so that there end up being two levels of cache memories in the system. In this layered cache system, the smaller and faster of the two levels of cache memories, typically called level one or L1, may be a small amount of memory embedded in the processor 604. The second level or L2 cache is typically a larger amount of memory external to the processor 604.
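The benefit of the two-level arrangement is often summarized by an average memory access time, in which a miss at each level adds the cost of consulting the next, slower level (a standard textbook relation, not stated in the patent itself):

\[
T_{\text{avg}} = T_{\text{L1 hit}} + m_{\text{L1}} \times \left( T_{\text{L2 hit}} + m_{\text{L2}} \times T_{\text{main}} \right)
\]

where m_L1 and m_L2 are the miss rates of the respective cache levels.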
The cache memory may hold a small subset of the data stored in the main memory. The processor needs only a small amount of the data in the main memory to execute the individual instructions of a particular application. The subset is chosen based on immediate relevance, e.g., data likely to be used in the near future. This is much like borrowing only a few books at a time from a large collection of books in a library to carry out a large research project. Just as the research may be just as effective, and even more efficient, when only a few books are borrowed at a time, processing of an application program is efficient when only a small, well-chosen portion of the data is stored in the cache memory at any one time.
A cache controller 602 monitors (i.e., “snoops”) the address lines of the bus 605 to the processor 604 and, whenever a memory access is made by the processor 604, compares the address being accessed with the addresses of the small amount of data stored in the cache memory 603. If data needed by the processor 604 is found in the cache memory 603, a “cache hit” is said to have occurred, and the processor 604 is provided the required data from the faster cache memory 603, analogous to finding the necessary information in the small number of books that were borrowed. If the information needed by the processor 604 is not stored in the cache memory 603, a “cache miss” is said to have occurred, and an access to the slower main memory 601 must be made, analogous to making another trip to the library. As can be expected, a cache miss in the L2 cache memory, which requires access to the slower main memory 601, is more detrimental than a cache miss in the L1 cache memory, which only requires a subsequent access to the slightly slower L2 cache memory.
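A minimal sketch of the hit/miss decision described above, assuming a simple direct-mapped cache with a power-of-two geometry (the structure, sizes, and function name below are illustrative, not taken from the patent):

#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES  256   /* assumed number of cache lines */
#define LINE_BYTES  32   /* assumed bytes per cache line  */

struct cache_line {
    bool     valid;
    uint32_t tag;
    uint8_t  data[LINE_BYTES];
};

static struct cache_line cache[NUM_LINES];

/* Return true on a cache hit: the addressed line is present and its
 * stored tag matches the upper address bits snooped from the bus. */
bool cache_lookup(uint32_t addr)
{
    uint32_t index = (addr / LINE_BYTES) % NUM_LINES;  /* which line  */
    uint32_t tag   = (addr / LINE_BYTES) / NUM_LINES;  /* upper bits  */

    return cache[index].valid && cache[index].tag == tag;
}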
Obviously, the goal is to increase cache hits (or to reduce cache misses). Typically, this goal is achieved by following what is called the “locality” theory. According to this theory, temporal locality is based on the general axiom that if a particular piece of information was used, the same information is likely to be used again. Thus, data that was once accessed by the processor 604 is brought into the cache 603 to provide faster access during a probable subsequent reference by the processor 604. According to a second locality theory, known as spatial locality, when information is accessed by the processor 604, information whose addresses are nearby the accessed information tends to be accessed as well. Thus, rather than storing only the once-accessed data into the cache, a block of data, e.g., a page, in the vicinity of and including the once-accessed data is brought into the cache memory.
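A small sketch of the spatial-locality idea, assuming a page-sized fetch granularity (the page size and function name are assumptions made for this example):

#include <stdint.h>

#define PAGE_BYTES 4096u   /* assumed fetch granularity for this sketch */

/* On an access to 'addr', the whole surrounding page is brought into
 * the cache, so that nearby addresses are already resident when the
 * processor touches them next. */
static inline uint32_t page_base(uint32_t addr)
{
    return addr & ~(PAGE_BYTES - 1);   /* start of the containing page */
}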
With every memory access by the processor 604, these locality theories are used to decide which new page or pages of data are to be stored in the cache memory 603. The new page replaces an existing page of data in the cache 603 using a block (or page) replacement strategy, e.g., FIFO, random, or least recently used (LRU), well known to designers and architects of computer systems.
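As one example of such a replacement strategy, an LRU policy over a small set of cached pages might be sketched as follows (the fixed-size array and timestamp scheme are assumptions made for brevity, not the patent's own mechanism):

#include <stdint.h>

#define NUM_SLOTS 8   /* assumed number of page slots in the cache */

struct slot {
    uint32_t page;       /* page number currently held in this slot */
    uint32_t last_used;  /* logical timestamp of the last access    */
};

static struct slot slots[NUM_SLOTS];

/* Pick the slot whose page was referenced least recently; its contents
 * will be displaced by the newly fetched page. */
int choose_victim_lru(void)
{
    int victim = 0;
    for (int i = 1; i < NUM_SLOTS; i++)
        if (slots[i].last_used < slots[victim].last_used)
            victim = i;
    return victim;
}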
While the use of cache memory in a processor/memory interface as described above has provided a significant improvement in avoiding memory access time bottlenecks, and in preventing the slowdown of a processor otherwise capable of running at a higher speed, the caching system described above suffers from significant drawbacks.
For example, cache thrashing occurs when a frequently used block of data is replaced by another frequently used block, causing repeated fetching and displacement of the same block of data to and from the cache memory 603. Thrashing may occur when the processor 604 is processing a set of instructions that has too many variables (and/or is simply too large) to fit into the cache memory. In this case, for example, when one particular variable is referenced by the processor 604 and is not present in the cache memory 603, a cache miss occurs. The variable must then be retrieved from the main memory 601 and stored in the cache memory 603 for access by the processor 604. However, because the cache memory 603 may already be full due to the storage of the large code segment, another variable must be removed to make room for the variable currently being referenced. When the processor 604 subsequently references the variable that was removed from the cache memory 603, the above cache miss process is repeated. Thus, in this scenario, blocks of data are likely to be constantly fetched and replaced whenever the processor 604 references a particular variable.
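To make the thrashing scenario concrete, the following sketch (reusing the same assumed direct-mapped geometry as the lookup sketch above; the two addresses are invented for illustration) shows two frequently used addresses that map to the same cache line, so each access to one evicts the other:

#include <stdint.h>
#include <stdio.h>

#define NUM_LINES  256   /* assumed cache geometry, as in the sketch above */
#define LINE_BYTES  32

/* Two frequently used addresses that fall on the same line index (but
 * carry different tags) keep evicting each other: every alternating
 * reference misses and forces a fetch from the slower main memory. */
int main(void)
{
    const uint32_t addr_a = 0x1000;   /* line index 128, tag 0 */
    const uint32_t addr_b = 0x3000;   /* line index 128, tag 1 */

    printf("index of a = %u\n", (unsigned)((addr_a / LINE_BYTES) % NUM_LINES));
    printf("index of b = %u\n", (unsigned)((addr_b / LINE_BYTES) % NUM_LINES));
    return 0;
}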
The user may be aware of a particular set of information, e.g., common global variables or a set of common program code, which is frequently referenced by the processor or is referenced by various components or applications in a particular computer system. Unfortunately, conventional processor/memory interface architectures are fixedly defined by a system designer, so a user cannot remedy the above described problem even if the user is aware of a set of information that is expected to be frequently referenced by the processor.
The size of a large set of instructions (or programs) can be reduced significantly by the use of common code segments that are shared with other sets of instructions. The program may include only a reference, e.g., a jump or call instruction, to the common code segment, which is stored separately from the program and thus reduces the program's size. The reduced-size program may then fit in the available cache memory space, thus avoiding the cache thrashing described above.
Fischer Frederick Harrison
Segan Scott A.
Sindalovsky Vladimir
Anderson Matthew D.
Bollman William H.
Kim Matthew