Electrical computers and digital processing systems: memory – Address formation – Address mapping
Reexamination Certificate
2002-01-07
2004-06-15
Sparks, Donald (Department: 2187)
C365S230010, C711S117000, C711S203000
Reexamination Certificate
active
06751720
ABSTRACT:
RELATED APPLICATIONS
This application is related to, and hereby incorporates by reference, the following U.S. patent applications:
Multiprocessor Cache Coherence System And Method in Which Processor Nodes And Input/output Nodes Are Equal Participants, Ser. No. 09/878,984, filed Jun. 11, 2001;
Scalable Multiprocessor System And Cache Coherence Method, Ser. No. 09/878,982, filed Jun. 11, 2001;
System and Method for Daisy Chaining Cache Invalidation Requests in a Shared-memory Multiprocessor System, Ser. No. 09/878,985, filed Jun. 11, 2001;
Cache Coherence Protocol Engine And Method For Processing Memory Transaction in Distinct Address Subsets During Interleaved Time Periods in a Multiprocessor System, Ser. No. 09/878,983, filed Jun. 11, 2001;
System And Method For Generating Cache Coherence Directory Entries And Error Correction Codes in a Multiprocessor System, Ser. No. 09/972,477, filed Oct. 5, 2001, which claims priority on U.S. provisional patent application 60/238,330, filed Oct. 5, 2000, which is also hereby incorporated by reference in its entirety.
FIELD OF INVENTION
The present invention relates generally to the design of cache memories in computer central processor units (CPUs), and particularly to the organization of two-level CPU caching systems in which the first-level cache is virtually indexed.
BACKGROUND OF THE INVENTION
The present invention is applicable to both single processor and multi-processor computer systems, but will be described primarily in the context of a distributed multi-processor system.
An “index position” within a cache identifies one or more cache lines within the cache. The number of cache lines stored at each index position is called the associativity of the cache. A direct mapped cache has an associativity of one. A two-way associative cache has an associativity of two, and thus has two cache lines at each index position of the cache.
A “memory line,” also often called a cache line, is the basic unit of storage that is stored in a memory cache. A memory line or cache line is also the basic unit of storage that is tracked by the cache coherence logic in multi-processor computer systems. A memory line of data will often be called a “memory line” while it is stored in main memory or is in transit between system nodes, and the same data may also be called a cache line while it is stored in a memory cache.
When a first-level (L1) cache is virtually indexed, the “cache index bits” within the virtual address supplied by a processor are used to retrieve a tag from a tag array within the cache. Virtual indexing of a first-level (L1) cache allows the lookup of the L1 cache tag to proceed concurrently with the translation of the requested virtual memory address into a physical memory address, sometimes herein called the targeted physical memory address. The virtual-to-physical address translation is performed by a translation look-aside buffer (“TLB”). The tag from the cache is then compared to the targeted physical memory address obtained from the TLB, and if there is a match and the cache state for the cache line is not “invalid” (which together indicate a cache hit), the data from the cache that corresponds to the tag is sent to the processor. If there is a miss, meaning that the retrieved tag did not match the physical address obtained from the TLB, the requested cache line of data must be obtained from a second-level cache or main memory.
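The lookup flow described above can be illustrated with a minimal Python sketch of a virtually-indexed, physically-tagged (VIPT) direct-mapped cache. All sizes, field splits, and data structures here are illustrative assumptions for exposition, not the patent's implementation; in hardware the tag read and the TLB translation happen in parallel, which this sequential model only approximates.

```python
# Illustrative VIPT L1 lookup: index bits come from the virtual address,
# the tag comparison uses the translated physical address.
PAGE_SIZE = 4096   # bytes per page (assumed)
LINE_SIZE = 64     # bytes per cache line (assumed)
NUM_SETS  = 256    # direct-mapped: 256 lines -> 16 KB cache (assumed)

def split_vaddr(vaddr):
    """Derive the cache index from the *virtual* address."""
    offset = vaddr % LINE_SIZE
    index = (vaddr // LINE_SIZE) % NUM_SETS
    return index, offset

def lookup(cache, tlb, vaddr):
    """cache: {index: (physical_tag, state, data)}; tlb: {vpage: ppage}."""
    index, _ = split_vaddr(vaddr)
    # In hardware the tag array read and the TLB translation proceed
    # concurrently; here we simply do both before comparing.
    entry = cache.get(index)
    paddr = tlb[vaddr // PAGE_SIZE] * PAGE_SIZE + vaddr % PAGE_SIZE
    ptag = paddr // (LINE_SIZE * NUM_SETS)   # physical bits above the index
    if entry is not None and entry[0] == ptag and entry[1] != "invalid":
        return "hit", entry[2]
    return "miss", None   # fetch the line from L2 or main memory
```

Because the cache (16 KB) is larger than a page (4 KB), the index bits extend above the page offset, which is exactly the condition under which synonyms can arise.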
While virtual indexing speeds up the lookup of a cache, it may also give rise to synonyms. Synonyms are cache lines at different cache indices that map to the same physical memory address, and therefore refer to the same data entry. Synonyms may arise when a physical memory address is shared between two or more different programs, or different parts of the same program, which may access it with two or more different virtual addresses. If the size of the cache divided by its associativity is greater than the size of the memory pages used in the system, a memory line at any given physical memory address can be stored at more than one index position within the cache. More specifically, the number N of cache line index positions at which any memory line may be found within the cache is equal to:
N = cache size / (associativity × page size)
Having more than one cache index position correspond to the same physical memory address can give rise to a memory coherence problem if the data entry for one virtual memory address is changed without changing the data for another virtual memory address that maps to the same physical memory address. It is therefore necessary to either prevent synonyms from occurring or else to detect and resolve synonyms before they give rise to a memory coherence problem.
In addition, in the context of a shared memory multi-processor computer system with multiple first-level caches, it is also necessary to ensure that the cache coherence logic handling a request for a particular physical memory address be able to find any and all copies of the corresponding memory line, including those in first-level caches, even though there may be multiple L1 cache index positions at which the identified memory line may be stored within any particular L1 cache.
Since synonyms are only possible if the size of the first-level cache divided by its associativity is larger than the size of the system's memory pages, synonyms may be avoided by decreasing the size of the cache, increasing its associativity, or increasing the size of the memory pages. Unfortunately, decreasing the size of the first-level cache reduces system performance, because it increases the number of cache misses. Increasing associativity greatly increases the complexity, and thus cost, of the L1 caches, and may also reduce system performance by increasing the time required to retrieve a cache line from the L1 cache. Increasing the size of the system's memory pages is often not practical, because memory pages are the basic unit of memory used for many tasks, including memory allocation to processes, disk transfers and virtual memory management.
Alternatively, synonyms may be avoided at the system or kernel software level by restricting the assignment of virtual addresses, requiring that an increased number of the least significant address bits of each virtual address match the corresponding physical address. As a result of this restricted allocation of virtual addresses, all virtual addresses that correspond to a particular physical address will always have the same L1 cache index. This last method of avoiding synonyms places a burden on system software policies and on the usage of virtual address spaces.
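This software restriction is commonly known as page coloring. A minimal sketch, under assumed sizes, of the mapping check such a kernel policy would enforce:

```python
# Page-coloring sketch: a virtual page may only be mapped to a physical
# page of the same "color" (the low-order page-number bits that reach
# the cache index). All sizes are illustrative assumptions.
PAGE_SIZE = 4096
CACHE_SIZE = 32 * 1024
ASSOC = 1
COLORS = CACHE_SIZE // (ASSOC * PAGE_SIZE)   # 8 colors here

def color(page_number):
    """The index-determining low bits of a page number."""
    return page_number % COLORS

def can_map(virtual_page, physical_page):
    # Allowed only when the pages share a color, so every virtual alias
    # of a physical line produces the same L1 cache index.
    return color(virtual_page) == color(physical_page)
```

With this rule in place, two processes sharing a physical page must use virtual addresses whose low index bits agree, so no synonym can ever be created.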
A possible method of resolving the problem of L1 cache synonyms that was considered by the inventors, but rejected for reasons described next, is to build logic into the L1 cache for detecting synonyms and resolving them. When an L1 cache miss occurs, the logic would search for a synonym within the L1 cache and abort the miss if a synonym is found. The cache line would then be copied from the location where the synonym was found to the location where the miss occurred, and the cache line at the original location would be invalidated. The main disadvantage of this method is that it would cause the first-level cache to be kept busy after every cache miss, while the first-level cache is searched for synonyms. Most of the time a synonym will not be found, however, because synonyms are rare in practice. Searching the first-level cache for synonyms after every miss reduces system performance by increasing the amount of time between cache requests by the processor coupled to the L1 cache, and potentially reduces system performance by delaying the resolution of other subsequent L1 cache accesses. In addition, in multiprocessor systems, this technique may reduce system performance by decreasing the amount of time that the cache is available for responding to cache coherence protocol requests. The impact on system performance may be especially severe for processor cores that aggressively
Luiz Andre Barroso
Kourosh Gharachorloo
Andreas Nowatzyk
Mosur Kumaraswamy Ravishankar
Robert J. Stets, Jr.
Christian P. Chace