Method and system for exclusive two-level caching in a...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S118000, C711S119000, C711S130000, C711S141000, C711S147000, C711S150000, C711S120000, C711S117000, C711S145000, C711S124000

active

06725334

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This application relates to microprocessor design and, specifically, to cache memory systems in microprocessors.
2. Background Art
The performance of applications such as database and web servers (hereafter “commercial workloads”) is an increasingly important aspect in high-performance servers. Data-dependent computations, lack of instruction-level parallelism and large memory stalls contribute to the poor performance of commercial workloads in traditional high-end microprocessors.
Two promising approaches for improving the performance of commercial workloads are lower-latency memory systems and the exploitation of thread-level parallelism. Increased density and transistor counts enable microprocessor architectures with integrated caches and memory controllers, which reduce overall memory latency. Relatively independent transactions or queries initiated by individual clients give rise to thread-level parallelism that can be exploited at the chip level. Chip multiprocessing (CMP) and simultaneous multithreading (SMT) are the two most promising approaches to exploiting such thread-level parallelism. SMT enhances a traditional wide-issue out-of-order processor core with the ability to issue instructions from different threads in the same cycle. CMP integrates multiple CPU cores (and corresponding level-one caches) into a single chip.
The main advantage of the CMP approach is that it enables the use of simpler CPU cores, thereby reducing overall design complexity. A CMP approach naturally lends itself to a modular design and can benefit from an on-chip two-level caching hierarchy. In such a hierarchy, each first-level cache is associated with and private to a particular CPU, and the second-level cache is shared by the CPUs. However, conventional CMP designs with on-chip two-level caching require the contents of the first-level caches to also be present in the second-level cache, an approach known as the inclusion or subset property. With an inclusive two-level caching implementation, an increase in the number of CPUs per die increases the ratio between the aggregate first-level cache capacity and the second-level cache capacity. When this ratio approaches 1.0, nearly half of the on-chip cache capacity can be wasted on duplicate copies of data. Hence, a design that does not enforce inclusion (e.g., an exclusive design) is advantageous and often preferred over an inclusive two-level caching design.
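The capacity argument above can be checked with a little arithmetic. The following is a minimal sketch, not part of the patent: the function name and the cache sizes are hypothetical, and it assumes the worst case in which every first-level cache holds distinct lines, all of which are duplicated in the second-level cache under inclusion.

```python
def duplicated_fraction(num_cpus: int, l1_bytes: int, l2_bytes: int) -> float:
    """Upper bound on the fraction of on-chip cache capacity that holds
    duplicate data when inclusion is enforced (illustrative model only)."""
    aggregate_l1 = num_cpus * l1_bytes
    total = aggregate_l1 + l2_bytes
    # Under inclusion the L1 contents are a subset of the L2 contents, so
    # the duplicated bytes are bounded by the smaller of the two capacities.
    duplicated = min(aggregate_l1, l2_bytes)
    return duplicated / total


# Hypothetical configuration: 8 CPUs, 128 KiB of L1 each, 1 MiB shared L2.
ratio = (8 * 128 * 1024) / (1024 * 1024)                   # aggregate L1 / L2
wasted = duplicated_fraction(8, 128 * 1024, 1024 * 1024)   # half the capacity
```

With a single CPU and the same second-level cache, the same function gives a much smaller fraction, which is why the problem grows with the number of CPUs per die.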
Exclusive two-level caching has been previously proposed in the context of single processor chips. An example of exclusive two-level caching implemented in a single processor is provided in U.S. Pat. No. 5,386,547, issued to Norman P. Jouppi on Jan. 31, 1995, which is incorporated herein by reference. However, this invention is the first to address it for CMP systems. This invention also describes new mechanisms to effectively manage a two-level exclusive cache hierarchy for a CMP system.
Even with exclusive two-level caching, however, performance issues remain to be addressed in CMP design. In particular, there is a need for improved mechanisms for effectively managing exclusive two-level caching in CMP systems. The present invention addresses these and related issues.
SUMMARY OF THE INVENTION
Hence, in accordance with the purpose of the invention, as embodied and broadly described herein, the invention relates to chip multiprocessor (CMP) design. In particular, the present invention provides a system and method that maximize the use of on-chip cache memory capacity in CMP systems. The system and method are realized with a combination of features. One such feature is a relaxed subset property (inclusion) requirement. Relaxing this requirement forms an exclusive cache hierarchy that minimizes data replication and on-chip data traffic without incurring increased second-level hit latency or occupancy. Another feature involves maintaining in the second-level cache a duplicate tag-state structure of all (per-CPU) first-level caches in order to allow a substantially simultaneous lookup for data in the first-level and second-level tag-state arrays.
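The duplicate tag-state structure can be pictured with a small software model. This is only a sketch under stated assumptions: the class `L2Controller`, its fields, and the use of Python sets in place of hardware tag arrays are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class L2Controller:
    """Hypothetical model: the second-level cache keeps its own tags plus a
    duplicate copy of every CPU's first-level tags."""
    num_cpus: int
    l2_tags: set = field(default_factory=set)
    dup_l1_tags: list = field(default_factory=list)

    def __post_init__(self):
        # One duplicate tag set per CPU.
        self.dup_l1_tags = [set() for _ in range(self.num_cpus)]

    def lookup(self, line_addr: int):
        # Hardware would probe both tag arrays in parallel; in this model
        # the two checks simply happen within the same call.
        in_l2 = line_addr in self.l2_tags
        l1_holders = [cpu for cpu, tags in enumerate(self.dup_l1_tags)
                      if line_addr in tags]
        return in_l2, l1_holders


l2 = L2Controller(num_cpus=4)
l2.dup_l1_tags[2].add(0x40)   # line 0x40 lives only in CPU 2's L1
result = l2.lookup(0x40)      # (False, [2])
```

Because the hierarchy is exclusive, a line may legitimately be found in a first-level cache while being absent from the second-level cache, as in the usage lines above.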
An additional aspect involves extending the state information to include an ownership indication in addition to the data validity/existence indication and the data shared/exclusive indication. The ownership indication applies within the exclusive two-level cache hierarchy and helps orchestrate write-backs to the second-level cache (i.e., L2 fills). Another aspect involves associating a single owner with each cache line in order to eliminate redundant write-backs of evicted data to the second-level cache. Namely, at any given time in the lifetime of a cache line in the CMP chip, only one of its copies can be the owner copy.
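One possible way to picture the extended state information is as three flag bits per cached copy. The encoding below is an assumption made for illustration (the patent text here does not specify bit layouts), and `has_single_owner` is a hypothetical checker that reads the single-owner rule as "exactly one owner whenever any valid copy exists on chip".

```python
from enum import Flag


class LineState(Flag):
    """Hypothetical per-copy state bits; names and values are illustrative."""
    INVALID = 0
    VALID = 1       # data validity/existence indication
    EXCLUSIVE = 2   # set: exclusive copy; clear: shared copy
    OWNER = 4       # this copy is responsible for the eventual L2 fill


def has_single_owner(copies):
    """copies: the state of every on-chip copy of one cache line."""
    valid = [s for s in copies if LineState.VALID in s]
    owners = [s for s in valid if LineState.OWNER in s]
    # No valid copy means nothing to own; otherwise exactly one owner.
    return not valid or len(owners) == 1


shared_ok = has_single_owner([LineState.VALID | LineState.OWNER,
                              LineState.VALID])
two_owners = has_single_owner([LineState.VALID | LineState.OWNER,
                               LineState.VALID | LineState.OWNER])
```

A state with two owner copies would lead to two write-backs of the same evicted data, which is the redundancy the single-owner rule is meant to eliminate.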
Finally, the present invention provides policy guidelines for administering the ownership and write-back aspects, as the following guidelines exemplify:
1) a first-level cache miss that finds no other copy of a requested cache line becomes the owner of the cache line;
2) a first-level cache miss that does not find a copy of a cache line in the second-level cache but finds it in one or more of the first-level caches receives that cache line from the previous owner and becomes the new owner;
3) a first-level cache that replaces a cache line is informed by the second-level cache whether it is the owner, in which case it issues a second-level cache fill;
4) whenever the second-level cache has a copy of the cache line, it is the owner, and a first-level cache miss that hits in the second-level cache without invalidating it (i.e., not a write miss) does not steal ownership from the second-level cache; and
5) whenever the second-level cache needs to evict a cache line that is additionally present in one or more first-level caches, the second-level cache arbitrarily selects one of these first-level caches as the new owner.
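The five guidelines can be traced with a toy model that tracks ownership of a single cache line. This is a sketch under stated assumptions, not the patented mechanism: the class and method names are hypothetical, a write miss is assumed to invalidate all other on-chip copies, and the "arbitrary" choice in guideline 5 is made deterministic by picking the lowest CPU index.

```python
class ExclusiveHierarchy:
    """Toy ownership model for ONE cache line. The owner is a CPU index,
    the string "L2", or None when no on-chip copy exists."""

    def __init__(self):
        self.l1_holders = set()   # CPUs whose first-level cache holds the line
        self.in_l2 = False
        self.owner = None

    def l1_miss(self, cpu, is_write=False):
        if is_write:
            # Assumption: a write miss invalidates every other copy,
            # including the L2 copy, so the requester becomes the owner.
            self.l1_holders.clear()
            self.in_l2 = False
            self.owner = cpu
        elif not self.in_l2:
            # Guidelines 1 and 2: with no L2 copy the requester becomes the
            # owner, taking over from a previous L1 owner if there is one.
            self.owner = cpu
        # Guideline 4: a non-invalidating hit in L2 leaves L2 as the owner.
        self.l1_holders.add(cpu)

    def l1_evict(self, cpu):
        """Return True if this eviction issues an L2 fill (guideline 3)."""
        self.l1_holders.discard(cpu)
        if self.owner == cpu:
            self.in_l2 = True
            self.owner = "L2"
            return True
        return False

    def l2_evict(self):
        # Guideline 5: hand ownership to one remaining L1 holder, if any.
        self.in_l2 = False
        self.owner = min(self.l1_holders) if self.l1_holders else None


h = ExclusiveHierarchy()
h.l1_miss(0)                  # guideline 1: CPU 0 becomes the owner
h.l1_miss(1)                  # guideline 2: CPU 1 takes ownership
fill_a = h.l1_evict(0)        # non-owner eviction: no L2 fill
fill_b = h.l1_evict(1)        # owner eviction: L2 fill, L2 becomes owner
h.l1_miss(2)                  # guideline 4: read hit in L2, L2 stays owner
owner_after_read = h.owner
h.l2_evict()                  # guideline 5: CPU 2 is the only holder left
owner_after_l2_evict = h.owner
```

In this trace only one of the two first-level evictions produces a second-level fill, which illustrates how the single-owner rule avoids redundant write-backs.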
Advantages of the invention will be understood by those skilled in the art, in part, from the description that follows. Advantages of the invention will be realized and attained from practice of the invention disclosed herein.


REFERENCES:
patent: 5197139 (1993-03-01), Emma et al.
patent: 5210848 (1993-05-01), Liu
patent: 5386547 (1995-01-01), Jouppi
patent: 5634068 (1997-05-01), Nishtala et al.
patent: 5875462 (1999-02-01), Bauman et al.
patent: 6292705 (2001-09-01), Wang et al.
patent: 6374332 (2002-04-01), Mackenthun et al.
patent: 6625698 (2003-09-01), Vartti
patent: 6636948 (2003-10-01), Steely, Jr. et al.
Agarwal, Anant, et al., “An Evaluation of Directory Schemes for Cache Coherence”, Proceedings of 15th International Symposium on Computer Architecture (“ISCA”) (May 1998) pp. 280-289.
Barroso, Luiz Andre, et al., “Impact of Chip-Level Integration on Performance of OLTP Workloads”, High-Performance Computer Architecture (“HPCA”) (Jan. 2000).
Barroso, Luiz Andre, et al., “Memory System Characterization of Commercial Workloads”, ISCA (Jun. 1998).
Eggers, Susan J., et al., “Simultaneous Multithreading: A Platform for Next-generation Processors”, University of Washington, DEC Western Research Laboratory ({eggers,levy,jlo}@cs.washington.edu) ({emer,stamm}@vssad.enet.dec.com) pp. 1-15.
Eickemeyer, Richard J., et al., “Evaluation of Multithreaded Uniprocessors for Commercial Application Environments”, ACM (1996) (0-89791-786-3) pp. 203-212.
Gupta, Anoop, et al., “Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes”, Stanford University, Computer Systems Laboratory, pp. 1-10.
Hammond, Lance, et al., “A Single-Chip Multiprocessor”, IEEE (Sep. 1997) (0018-9162).
Hammond, Lance, et al., “Data Speculation Support for a Chip Multiprocessor”, Stanford University, Computer Systems Laboratory (http://www-hydra.stanford.edu/).
Jouppi, Norman P., et al., “Tradeoffs in Two-Level On-Chip Caching”, WRL Research Report 93/3, Western Research Laboratory (WRL-Techreports@decwrl.dec.com) (Dec. 1993) pp. 1-31.
Krishnan, Venkata, et al., “Hardware and Softwa
