Method and apparatus for a dedicated physically indexed copy...

Electrical computers and digital processing systems: memory – Address formation – Address mapping

Reexamination Certificate

Details

C711S003000, C711S203000

active

06253301

ABSTRACT:

BACKGROUND OF THE INVENTION
In general, main memory access is relatively slow compared to central processing unit (CPU) execution times. Therefore, most CPU architectures include one or more caches. A cache is a high-speed memory which can be associated with a small subset of referenced main memory. Because most memory reference patterns only require a small subset of the main memory contents, a relatively small, high-speed cache can service many of the memory references.
For example, instruction caches can improve efficiency because often in software programs a small section of code may be looping. By having the instructions in a high-speed, local instruction cache, they are accessed much faster. Data caches can likewise improve efficiency because data access tends to follow the principle of locality of reference. Requiring each access to go to the slower main memory would be costly. The situation can be even worse in a multi-processor environment where several CPUs may contend for a common bus.
Data cache systems in some configurations comprise both a data store and a tag array. The data store holds data copied from the main memory. Each tag array location holds a tag, or physical page address, for a block of consecutive data held in the data store in association with the tag location.
During a memory access, a virtual page address from the CPU core is translated by a page translator into a physical page address. The remainder of the address, or a portion thereof, is used to index into the tag array. The tag retrieved from the indexed tag array is compared with the translated physical page address, a match indicating that the referenced data is in the data store. A mismatch indicates that the data will have to be retrieved from main memory. Page translation occurs in parallel with the tag array lookup, minimizing delay.
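The lookup described above can be sketched as follows. This is only an illustration, not the circuit of the patent: the 8 KiB page size, the 64-byte block size, the direct-mapped organization, and all names (`PAGE_BITS`, `TagCache`, `page_table`) are assumptions chosen for the example, and a plain dictionary stands in for the page translator.

```python
PAGE_BITS = 13    # assumed 8 KiB pages: the low 13 address bits are unmapped
BLOCK_BITS = 6    # assumed 64-byte blocks
INDEX_BITS = PAGE_BITS - BLOCK_BITS   # index drawn only from unmapped bits
NUM_SETS = 1 << INDEX_BITS


class TagCache:
    """Direct-mapped tag array; each entry holds a physical page address."""

    def __init__(self):
        self.tags = [None] * NUM_SETS

    def fill(self, physical_addr):
        # Record the physical page address of the block being cached.
        index = (physical_addr >> BLOCK_BITS) & (NUM_SETS - 1)
        self.tags[index] = physical_addr >> PAGE_BITS

    def lookup(self, virtual_addr, page_table):
        # Hardware performs these two steps in parallel; here they run in sequence.
        index = (virtual_addr >> BLOCK_BITS) & (NUM_SETS - 1)
        physical_page = page_table[virtual_addr >> PAGE_BITS]
        # A match means the referenced data is in the data store.
        return self.tags[index] == physical_page


# Hypothetical mapping: virtual page 0x5 -> physical page 0x2A.
page_table = {0x5: 0x2A}
cache = TagCache()
cache.fill((0x2A << PAGE_BITS) | 0x1C0)
hit = cache.lookup((0x5 << PAGE_BITS) | 0x1C0, page_table)
miss = cache.lookup((0x5 << PAGE_BITS) | 0x200, page_table)
```

Because the index uses only unmapped bits, it is identical for the virtual and the physical address, which is why the tag read can start before translation finishes.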
A need also exists in multiprocessor systems to test the contents of the data cache system from outside the CPU. Several processors may reference the same physical address in memory. Besides looking up its own local cache, each CPU must check the caches of other CPUs in the system. Failure to do so would result in data incoherency between the individual caches as each CPU reads and writes to its own local copy of the same data from main memory.
To prevent this incoherency, a CPU sends “probes” to other CPUs during a memory reference. Each data cache system receiving a probe uses a physical address provided by the probe to look into its own tag array. If the data resides in its data store, the data cache system responds to the probing CPU accordingly, allowing ownership arbitration to take place.
SUMMARY OF THE INVENTION
A problem with the physically-addressed, physically-tagged data caching system in a virtually-addressed computer architecture as described above is that the cache is limited to the size of a memory page. This results because address bits which are not part of the page address are the only unmapped bits and thus are the only bits that can be used to index the cache. As capacity for larger caches grows, the size limitation takes on greater import.
The present invention resolves this problem by indexing the tag array and data cache using virtual page address bits with the assumption that the bits used to index the tag array are the same for the corresponding physical page address. If the assumption is correct, a cache hit is correctly detected. This enables a four-fold increase in the size of the cache in one embodiment.
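The size limit and its relaxation follow from simple bit counting, sketched below. The 13-bit page offset is an assumed figure, a direct-mapped cache is assumed, and the choice of two virtual index bits is an inference from the four-fold increase mentioned above (2^2 = 4) rather than a value stated in this text.

```python
def max_cache_bytes(page_offset_bits, virtual_index_bits=0):
    """Largest direct-mapped cache addressable with the given index bits."""
    return 1 << (page_offset_bits + virtual_index_bits)


PAGE_OFFSET_BITS = 13   # assumption: 8 KiB pages

# Indexing with unmapped bits only: the cache cannot exceed one page.
unmapped_only = max_cache_bytes(PAGE_OFFSET_BITS)

# Borrowing two virtual page address bits for the index.
with_virtual = max_cache_bytes(PAGE_OFFSET_BITS, virtual_index_bits=2)

growth = with_virtual // unmapped_only
```

Each virtual bit added to the index doubles the addressable cache size, at the cost of the index possibly differing from the one the physical address would produce.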
Another problem with the prior art is contention for the tag array. Functions external to the CPU core, such as probes from other CPUs, contend with the CPU core's own need to access the tag array. Whenever the tag array is servicing a probe, the tag array is unavailable to the CPU core to determine if data the CPU core needs is in the data store. Therefore, the CPU core has to wait for the probe to be serviced. The present invention resolves this by providing a duplicate tag array to service the probes.
Accordingly, a preferred embodiment of the present invention comprises a data store for caching data from a main memory, a primary tag array for holding tags associated with data cached in the data store, and a duplicate tag array for holding copies of the tags held in the primary tag array. The duplicate tag array is accessible by external functions such as probes so that the primary tag array remains available to the processor core.
A page address from a memory address provided by an external probe is compared with a tag read from the duplicate tag array location indexed by the index portion of the memory address. If there is a match, the data addressed by the memory address is currently cached in the data store. Otherwise the output indicates that the addressed data is not currently cached in the data store.
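A minimal sketch of the duplicate tag array idea follows. The class name `DualTagCache`, the field widths, and the read counters are hypothetical and exist only to show that probes are answered from the copy while the primary array is left alone; the real design is hardware, not software.

```python
PAGE_BITS = 13     # assumed page-offset width
BLOCK_BITS = 6     # assumed 64-byte blocks
INDEX_MASK = (1 << (PAGE_BITS - BLOCK_BITS)) - 1


class DualTagCache:
    def __init__(self):
        size = INDEX_MASK + 1
        self.primary_tags = [None] * size     # read by the processor core
        self.duplicate_tags = [None] * size   # read by external probes
        self.primary_reads = 0
        self.duplicate_reads = 0

    def fill(self, physical_addr):
        # Both arrays receive the same tag so the copy stays consistent.
        index = (physical_addr >> BLOCK_BITS) & INDEX_MASK
        tag = physical_addr >> PAGE_BITS
        self.primary_tags[index] = tag
        self.duplicate_tags[index] = tag

    def core_lookup(self, physical_addr):
        self.primary_reads += 1
        index = (physical_addr >> BLOCK_BITS) & INDEX_MASK
        return self.primary_tags[index] == (physical_addr >> PAGE_BITS)

    def probe(self, physical_addr):
        # Index with the index portion, compare the page address with the tag.
        self.duplicate_reads += 1
        index = (physical_addr >> BLOCK_BITS) & INDEX_MASK
        return self.duplicate_tags[index] == (physical_addr >> PAGE_BITS)


cache = DualTagCache()
cache.fill((0x2A << PAGE_BITS) | 0x1C0)
probe_hit = cache.probe((0x2A << PAGE_BITS) | 0x1C0)
probe_miss = cache.probe((0x2B << PAGE_BITS) | 0x1C0)
```

After the two probes, `primary_reads` is still zero: the primary tag array was never occupied by external traffic.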
The preferred embodiment of the present invention comprises an address translator which maps virtual page addresses of virtual addresses to physical page addresses, wherein a virtual address comprises a virtual page address and an unmapped index portion. A tag array holds tags associated with the data cached in the data store, and is referenced by indexes comprising portions of the virtual page addresses and unmapped index portions. A physical page address is compared with tags read from the tag array, a match indicating a hit. If there is a miss, other possible values are substituted for the virtual portion of the index in order to check other possible tag array locations for a hit.
The tags can be read and compared by sequentially substituting for the virtual portion of the index until a match is detected indicating a hit, or alternatively and preferably, multiple tags are read and compared in parallel using a plurality of comparators.
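The substitution of the virtual portion of the index can be sketched as below, in both the sequential and the parallel form. All widths and names are assumptions for illustration (13 unmapped bits, two virtual index bits, a direct-mapped tag array), and the "parallel" version merely models several comparators by examining every candidate location in one pass.

```python
PAGE_BITS = 13                      # assumed page-offset width
BLOCK_BITS = 6                      # assumed 64-byte blocks
VIRT_BITS = 2                       # assumed number of virtual index bits
UNMAPPED_INDEX_BITS = PAGE_BITS - BLOCK_BITS
UNMAPPED_MASK = (1 << UNMAPPED_INDEX_BITS) - 1


def make_index(virt_part, unmapped_part):
    return (virt_part << UNMAPPED_INDEX_BITS) | unmapped_part


def split(virtual_addr):
    unmapped = (virtual_addr >> BLOCK_BITS) & UNMAPPED_MASK
    virt_part = (virtual_addr >> PAGE_BITS) & ((1 << VIRT_BITS) - 1)
    return virt_part, unmapped


def lookup_sequential(tags, virtual_addr, physical_page):
    virt_part, unmapped = split(virtual_addr)
    # First guess: the address's own virtual bits, then each alternative.
    others = [v for v in range(1 << VIRT_BITS) if v != virt_part]
    for v in [virt_part] + others:
        index = make_index(v, unmapped)
        if tags[index] == physical_page:
            return index
    return None


def lookup_parallel(tags, virtual_addr, physical_page):
    _, unmapped = split(virtual_addr)
    # Models one comparator per candidate location.
    candidates = [make_index(v, unmapped) for v in range(1 << VIRT_BITS)]
    matches = [i for i in candidates if tags[i] == physical_page]
    return matches[0] if matches else None


tags = [None] * (1 << (VIRT_BITS + UNMAPPED_INDEX_BITS))
# Block filled through virtual page 0x6 (low bits 0b10), physical page 0x2A.
tags[make_index(0b10, 0x1C0 >> BLOCK_BITS)] = 0x2A
# Same physical block referenced through an alias, virtual page 0x9 (low bits 0b01).
alias = (0x9 << PAGE_BITS) | 0x1C0
found_seq = lookup_sequential(tags, alias, 0x2A)
found_par = lookup_parallel(tags, alias, 0x2A)
absent = lookup_parallel(tags, alias, 0x2B)
```

In this example the first guess misses because the alias has different virtual index bits, and the hit is found only after substituting the other possible values.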


