Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2000-09-29
2003-11-18
Verbrugge, Kevin (Department: 2188)
C711S130000, C711S141000
active
06651145
ABSTRACT:
FIELD OF THE INVENTION
This invention relates generally to shared storage hierarchies in multiprocessing systems, and in particular to use of an exclusive dirty status in a coherence protocol to disambiguate ownership and modification status for memory references in a shared multi-level storage hierarchy.
BACKGROUND OF THE INVENTION
In a multiprocessing system with a shared multi-level storage hierarchy, typically comprising a shared cache storage, one processor may request access from a shared cache storage to data that is in a state of ownership by another processor. The requesting processor does not know if the requested data in the shared cache storage is valid or if it has been modified by another processor in a private storage at another level of the storage hierarchy. Therefore the requested data in the shared cache storage is not useful to the requesting processor until its actual status can be determined. An ambiguous status for data in shared cache storage hierarchies is referred to as a “sharing ambiguity.”
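The ambiguity described above can be illustrated with a minimal sketch. The names `SharedLine`, `PrivateLine`, `shared_copy_status`, and `shared_copy_is_current` are hypothetical and chosen only for illustration; they do not come from the patent. The sketch assumes the shared cache records only which processor owns a line, not whether that owner has modified it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedLine:
    """One line in the shared cache storage (hypothetical model)."""
    data: int
    owner: Optional[int] = None  # id of the processor that owns the line, if any


@dataclass
class PrivateLine:
    """The owner's copy of the same line in its private storage."""
    data: int
    modified: bool


def shared_copy_status(line: SharedLine, requester: int) -> str:
    """What a requester can conclude from the shared cache storage alone."""
    if line.owner is None or line.owner == requester:
        return "usable"
    # Another processor owns the line. It may or may not have written to its
    # private copy, and the shared cache storage alone cannot tell which.
    return "ambiguous"


def shared_copy_is_current(owner_private: PrivateLine) -> bool:
    """Ground truth, available only by consulting the owner's private storage."""
    return not owner_private.modified
```

In this model, a shared line owned by processor 1 looks identical to processor 0 whether processor 1 holds a clean copy or a modified one: `shared_copy_status` returns `"ambiguous"` in both cases, and only the owner's private state distinguishes them.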
One method for resolving sharing ambiguities in multiprocessing systems makes use of a common bus to snoop transactions (M. Papamarcos and J. Patel, “A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories,” Proc. 11th ISCA, 1984 pp. 348-354). Snooping transactions on a common bus to maintain coherence increases traffic to a processor's private cache storage. This increased traffic is not necessarily related to data that is actually needed by the respective processor. Therefore a disadvantage of snooping is that the average latency of cache accesses is increased since requests for data have to compete with snoops for access to the private caches. Moreover, the competition increases with the number of processors sharing a common bus. As a consequence, overall system performance suffers due to slower average access times. A second disadvantage of snooping can occur because an access to requested data in the shared cache storage must wait until results of snooping are received.
Another method for resolving sharing ambiguities involves broadcasts of inquiries (commonly referred to as disambiguating inquiries or backward inquiries) over an interconnection network shared by the processors with access to the shared cache storage. When requested data is found to be in an ambiguous state, an inquiry is broadcast to the other processors. Again, latency increases since the requesting processor must wait until responses to the broadcast inquiry are received. As the number of processors sharing a cache storage increases, so does the potential number of broadcasts and responses—contributing to increased network congestion.
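The scaling of the broadcast approach can be sketched as follows. This is an illustrative model only: the function `broadcast_inquiry`, the `PrivateCache` layout, and the cost of one inquiry message plus one response message per other processor are assumptions made for the example, not details taken from the patent.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: each private cache maps address -> (data, modified flag).
PrivateCache = Dict[int, Tuple[int, bool]]


def broadcast_inquiry(
    requester: int,
    address: int,
    private_caches: Dict[int, PrivateCache],
) -> Tuple[Optional[int], int]:
    """Ask every other processor about `address`.

    Returns (modified data if some processor holds a modified copy, else None,
    number of messages exchanged). The requester cannot proceed until every
    response has been collected.
    """
    messages = 0
    modified_data: Optional[int] = None
    for pid, cache in private_caches.items():
        if pid == requester:
            continue
        messages += 1  # assumed cost: one inquiry sent to this processor
        entry = cache.get(address)
        messages += 1  # assumed cost: one response returned to the requester
        if entry is not None and entry[1]:
            modified_data = entry[0]
    return modified_data, messages
```

Under these assumptions a single ambiguous request costs 2(P - 1) messages for P processors: 6 messages with 4 processors and 14 with 8, which is the growth in network traffic the paragraph above describes.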
In addition to the increases in latency associated with these prior methods, there are also additional costs associated with providing hardware functionality in order to implement a particular chosen method. Hardware functionality requires additional circuitry, and additional circuitry requires increased silicon area. An undesirable secondary effect of additional hardware circuitry and increased silicon area is an increase in the number and severity of critical timing paths, potentially resulting in further performance degradation for the overall system.
Another method used in distributed systems is known as SCI (Scalable Coherent Interface; IEEE Std 1596-1992, Scalable Coherent Interface, Piscataway, N.J.). SCI supports a one-writer-multiple-reader format with a distributed doubly linked list that is maintained through main memory. Addresses of private cache storage are inserted onto the list in a controlled manner and only the address at the head of the list may overwrite the data. The interface maintains a coherent storage hierarchy by forwarding data requests to the head of the list. One disadvantage of such a system is that before data may be overwritten the list must be sequentially purged. For large distributed systems, delays associated with such a method potentially contribute to performance degradation of the overall system. In addition to the potential for undesirable network congestion inherent in such a distributed system, problematic issues of link maintenance in cases of distributed system failures must also be addressed.
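The list discipline described above can be sketched in a few lines. This is a simplified single-process model, not the IEEE 1596 protocol itself: the class names `Node` and `SharingList`, the choice to insert new sharers at the head, and the purge counter are assumptions made for illustration.

```python
from typing import List, Optional


class Node:
    """One private cache's entry on the sharing list."""

    def __init__(self, cache_id: int) -> None:
        self.cache_id = cache_id
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


class SharingList:
    """Doubly linked list of the caches sharing one line (hypothetical model)."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None

    def attach(self, cache_id: int) -> None:
        """Insert a new sharer at the head (assumed insertion policy)."""
        node = Node(cache_id)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node

    def sharers(self) -> List[int]:
        """Return cache ids from head to tail."""
        out: List[int] = []
        node = self.head
        while node is not None:
            out.append(node.cache_id)
            node = node.next
        return out

    def write(self, cache_id: int) -> int:
        """Overwrite the data; only the head may do so.

        Every other entry is purged one at a time first. Returns the number
        of sequential purge steps that were needed.
        """
        if self.head is None or self.head.cache_id != cache_id:
            raise PermissionError("only the head of the list may overwrite the data")
        purged = 0
        victim = self.head.next
        while victim is not None:
            nxt = victim.next
            victim.prev = victim.next = None  # unlink the purged entry
            self.head.next = nxt
            if nxt is not None:
                nxt.prev = self.head
            victim = nxt
            purged += 1
        return purged
```

With N sharers on the list, a write by the head takes N - 1 purge steps in this model, one after another, which is the source of the delay noted above for large distributed systems.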
REFERENCES:
patent: 5241664 (1993-08-01), Ohba et al.
patent: 5524212 (1996-06-01), Somani et al.
patent: 5680576 (1997-10-01), Laudon
patent: 5737568 (1998-04-01), Hamaguchi et al.
patent: 0 397 994 (1990-11-01), None
Censier, L. M., et al., “A New Solution to Coherence Problems in Multicache Systems,” IEEE Transactions on Computers, vol. C-27, No. 12, Dec. 1978, 7 pages.
Pong, F., et al., “Correctness of a Directory-Based Cache Coherence Protocol: Early Experience,” IEEE, 1993, pp. 37-44.
PCT International Search Report, PCT/US01/30359, mailed Oct. 8, 2002, 5 pages.
Lilja, D. J. Cache Coherence in Large-Scale Shared-Memory Multiprocessors: Issues and Comparisons. ACM Computing Surveys, 25(3), Sep. 1993.
M. Papamarcos and J. Patel, “A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories,” Proc. 11th ISCA, 1984 pp. 348-354.
IEEE Std 1596-1992. Scalable Coherent Interface. Piscataway, NJ.
R.E. Johnson. Extending the Scalable Coherent Interface for Large-Scale Shared-Memory Multiprocessors. PhD thesis, University of Wisconsin-Madison, 1993.
Kourosh Gharachorloo, Anoop Gupta, and John Hennessy. Performance Evaluation of Memory Consistency Models for Shared-Memory Multiprocessors. In Proceedings of Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 245-257, 1991.
Kourosh Gharachorloo, Daniel Lenoski, James Laudon, Phillip Gibbons, Anoop Gupta, and John Hennessy. Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors. In Proceedings of the 17th International Symposium on Computer Architecture, pp. 15-26, May 1990.
Christoph Scheurich and Michel Dubois. Correct Memory Operation of Cache-Based Multiprocessors. In Proceedings 14th Annual International Symposium on Computer Architecture, pp. 234-243, Pittsburgh, PA, Jun. 1987.
A. Agarwal, D. Chaiken, G. D'Souza, et al. The MIT Alewife machine: A large-scale distributed-memory multiprocessor. Technical Report MIT/LCS Memo TM-454, Laboratory for Computer Science, Massachusetts Institute of Technology, 1991.
R. Bianchini, M. E. Crovella, L. Kontoothanassis, and T. J. LeBlanc. Memory contention in scalable cache-coherent multiprocessors. Technical Report 448, Computer Science Department, University of Rochester, 1993. Title Page missing.
K. Farkas, Z. Vranesic, and M. Stumm. Cache consistency in hierarchical-ring-based multiprocessors. Tech. Rep. CSRI-273, Computer Systems Research Institute, Univ. of Toronto, Ontario, Canada, Jan. 1993.
B. Fleisch and G. Popek. Mirage: A coherent distributed shared memory design. In Proceedings from the 14th ACM Symposium on Operating System Principles, pp. 211-223, New York, 1989.
S. Mori, H. Saito, M. Goshima, et al. A distributed shared memory multiprocessor: ASURA—memory and cache architectures-. In Supercomputing '93, pp. 740-749, Portland, Oregon, Nov. 1993.
K. Li and P. Hudak. Memory coherence in shared virtual memory systems. ACM Transactions on Computer Systems, 7(4):321-359, Nov. 1989.
Q. Li and S. Vlaovic. Redundant linked list based cache coherence protocol. In World Computer Congress, IFIP Congress, 1994.
John M. Mellor-Crummey and Michael L. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. on Computer Systems, 9(1), Feb. 1991.
H. Nilsson and P. Stenstrom. The scalable tree protocol—a cache coherence approach to large-scale multiprocessors. In Proceedings of the 4th IEEE Symposium on Parallel and Distributed Processing, May 1992.
C.K. Tang. Cache system design in the tightly coupled multiprocessor system. In AFIPS Proceedings of the National Computer Conference, 1976.
Craig Anderson and Jean-Loup Baer. Design and Evaluation of a Subblock Cache Coherence Protocol for Bus-Based Multiprocessors. University of Washington, 1994, 94-05-02.
Anderson, C. and J.-L.
Jamil Sujat
Merrell Quinn
Nguyen Hang
Blakely , Sokoloff, Taylor & Zafman LLP
Intel Corporation
Verbrugge Kevin