Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2000-08-31
2003-11-25
Kim, Matthew (Department: 2186)
active
06654858
ABSTRACT:
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not applicable.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to reducing latency and directory writes in a multi-processor system. More particularly, the invention relates to reducing latency in a directory-based, multi-processor system. Still more particularly, the invention relates to eliminating directory write operations whenever possible in a directory-based coherence protocol.
2. Background of the Invention
Computer systems typically include one or more processors, memory, and many other devices. Often, the contents of memory are made available by a memory controller to the various other devices in the system. As such, two or more devices (e.g., two processors in a multi-processor system) may attempt to access the same block of memory at substantially the same time. Although giving multiple devices access to the same block of data is highly desirable from a performance standpoint, it requires taking steps to maintain the “coherency” of each data block.
In a multi-processor computer system, or in any system in which more than one device may request concurrent access to the same piece of data, it is important to keep each block of data coherent, meaning that the system accurately tracks the status of each data block and prevents two processors from changing two different copies of the same data. If two processors are given copies of the same data block and each is permitted to change its copy, the system would then have two different versions of what was previously the same data. The coherency problem is akin to giving two people permission to edit separate copies of the same document. Once their editing is complete, two different versions of the document exist, whereas only one is desired. A coherency protocol is needed to prevent this type of situation from happening.
One approach to the coherency problem in a multi-processor computer system is to provide a “directory” that tracks each data block. The directory comprises a plurality of entries, one entry for each data block. Each directory entry generally includes information that reflects the current state of the associated data block. Such information may include, for example, the identity of the processors that have a shared copy of the block or of the processor in the system that has exclusive ownership of the block. Exclusive ownership of a data block permits the exclusive owner to change the data. Any processor having a copy of the block, but not having the block exclusive, can examine the data but cannot change it. A data block may be shared between two or more processors. As such, the directory entry for that block includes information identifying which processors have a shared copy of the block. In general, a directory-based coherency protocol solves the problems noted above.
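The directory entry described above can be pictured with a minimal Python sketch. It is illustrative only and not taken from the patent: the names BlockState, DirectoryEntry, add_sharer, grant_exclusive, and may_write are hypothetical, and a real protocol would also send invalidation messages to sharers, which this sketch omits.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class BlockState(Enum):
    """Hypothetical coherence states for one data block."""
    UNCACHED = "uncached"    # no processor holds a copy
    SHARED = "shared"        # one or more read-only copies exist
    EXCLUSIVE = "exclusive"  # exactly one processor may change the data


@dataclass
class DirectoryEntry:
    """One directory entry per data block (field names are assumptions)."""
    state: BlockState = BlockState.UNCACHED
    sharers: Set[int] = field(default_factory=set)  # ids of processors with a shared copy
    owner: Optional[int] = None                     # id of the exclusive owner, if any

    def add_sharer(self, cpu: int) -> None:
        """Record that `cpu` received a read-only copy."""
        if self.state is BlockState.EXCLUSIVE:
            raise RuntimeError("owner must give up exclusivity first")
        self.state = BlockState.SHARED
        self.sharers.add(cpu)

    def grant_exclusive(self, cpu: int) -> None:
        """Record that `cpu` is now the only processor allowed to write."""
        # Only the bookkeeping is modelled; invalidating sharers is omitted.
        self.sharers.clear()
        self.owner = cpu
        self.state = BlockState.EXCLUSIVE

    def may_write(self, cpu: int) -> bool:
        """A processor may change the data only if it owns the block exclusively."""
        return self.state is BlockState.EXCLUSIVE and self.owner == cpu
```

In this sketch, a processor that holds only a shared copy fails the may_write check, which mirrors the rule that sharers can examine the data but cannot change it.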
It is always desirable to enable computer systems to work faster and more efficiently. Anything that can be done to decrease latency in a computer generally makes the computer operate faster. Directory-based coherency computer systems are no exception; reducing the latency involved in such systems is desirable.
BRIEF SUMMARY OF THE INVENTION
The problems noted above are solved in large part by a computer system that has a plurality of processors. Each processor preferably has its own cache memory. Each processor or group of processors may have a memory controller that interfaces to a main memory, such as DRAM-type memory. The main memories include a “directory” that maintains the directory coherence state of each memory block.
One or more of the processors may be members of a “local” group of processors, such as might be the case if multiple processors are fabricated on the same chip. As such, the system might have multiple local processor groupings. Processors outside a local group are referred to as “remote” processors with respect to that local group.
Whenever a remote processor performs a memory reference (e.g., read or write) for a particular block of memory, the processor that maintains the directory for that block normally updates the directory to reflect that the remote processor now has exclusive ownership of the block. In accordance with the preferred embodiment of the invention, however, memory references between processors within a local group do not result in a directory write. Instead, the local processor that initiated the memory request places or updates a copy of the requested data in its cache memory and also sets associated tag control bits to reflect the same or similar information as would have been written to the directory. In this way, it is not necessary to write the directory for the requested block because the requesting processor's cache has the same information.
If a subsequent request is received for that same block, the local processor that previously accessed the block examines its cache for the associated tag control bits. Using those bits, that processor determines that it currently has the block exclusive and provides the requested data to the processor that is now requesting it. As such, the processor that maintains the directory for the block can ignore the directory entry.
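The behavior summarized above can be illustrated with a small Python model. This is a sketch under stated assumptions rather than the patented implementation: HomeNode, Processor, CacheLine, request_exclusive, and the directory_writes counter are hypothetical names, every request is treated as a request for exclusive ownership, and recalling a block from a remote owner is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CacheLine:
    data: int
    exclusive: bool  # hypothetical tag control bit: this cache holds the block exclusive


@dataclass
class Processor:
    cpu_id: int
    group: int                                     # id of the local group it belongs to
    cache: Dict[int, CacheLine] = field(default_factory=dict)


class HomeNode:
    """Hypothetical home of some memory blocks: memory, directory, and local group."""

    def __init__(self, group: int) -> None:
        self.group = group
        self.memory: Dict[int, int] = {}
        self.directory: Dict[int, int] = {}        # block -> id of remote exclusive owner
        self.local: List[Processor] = []
        self.directory_writes = 0                  # counts the writes the scheme tries to avoid

    def request_exclusive(self, requester: Processor, block: int) -> int:
        if block in self.directory:
            # Recalling a block from a remote owner is outside this sketch.
            raise NotImplementedError("remote-owner recall is not modelled")

        # Check the tag bits in the local caches before falling back to memory.
        holder = next(
            (p for p in self.local
             if p is not requester and block in p.cache and p.cache[block].exclusive),
            None,
        )
        if holder is not None:
            data = holder.cache.pop(block).data    # previous owner supplies the data
        else:
            data = self.memory.get(block, 0)

        # The requester's cache tag now records exclusive ownership.
        requester.cache[block] = CacheLine(data, exclusive=True)

        # Only a remote requester forces a directory write.
        if requester.group != self.group:
            self.directory[block] = requester.cpu_id
            self.directory_writes += 1
        return data
```

In this model, two processors in the same local group can pass a block back and forth while directory_writes stays at zero; the counter increases only when a processor from another group takes ownership.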
Eliminating directory writes whenever possible significantly improves latency, both because processor cache subsystems offer relatively high bandwidth and low latency and because directory writes to memory are avoided. These and other benefits will become apparent upon reviewing the following disclosure.
Asher David H.
Bertone Michael
Kessler Richard E.
Lilly Brian
Choi Woo H.
Kim Matthew