Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area
Reexamination Certificate
2000-06-02
2004-09-07
Portka, Gary (Department: 2188)
C711S141000, C370S352000
active
06789173
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to a multiprocessor system in which processors share a main memory and which uses a snoop scheme that distributes the address of a requested cache line to all the processors for coherency control. More particularly, it relates to a multiprocessor including nodes, each of which has CPUs, a main memory and a cache memory unit and has a structure that can be used as a common part in both small and large systems, and also to a node controller in each node.
A prior art system, in which nodes each having CPUs and a main memory mounted on the same board are used in common for both small and large multiprocessor systems, is described in James Laudon, et al., System Overview of the SGI Origin 200/2000 Product Line, Proceedings of the 47th IEEE COMPUTER SOCIETY INTERNATIONAL CONFERENCE, pp. 150-156, February, 1997.
The Origin 200/2000 includes one or more nodes, each node having two CPUs, a main memory and a hub chip.
The hub chip has a communication interface controller with the CPUs, a communication interface controller with the main memory and directory, an external I/O interface controller, and a crossbar for coupling these interface controllers.
The Origin 200, corresponding to a small multiprocessor system, usually includes a single node board and in some cases includes two node boards directly connected by an external I/O interface from the hub chip. The Origin 2000, corresponding to a large multiprocessor system, includes two or more node boards mounted on the crossbar and connected by a router board. In the following description, the Origin 200 and Origin 2000 will be referred to simply as the Origin, without distinguishing between them.
As in the Origin, various types of systems can be formed using a plurality of identical nodes regardless of the system size or scale. This is an effective means of reducing development costs and shortening the development period.
The Origin is a ccNUMA (cache-coherent non-uniform memory access) type multiprocessor and performs directory type cache coherency control.
This control is already detailed in James Laudon, et al., The SGI Origin: A ccNUMA Highly Scalable Server, Proceedings of the 24th Annual Symposium on Computer Architecture, pp. 241-251, June, 1997.
The memory access of the Origin is usually carried out in the following manner. A memory access request issued from a CPU is transferred to the node whose main memory contains the requested address, and the directory of that node is searched. A directory entry, which is provided for each cache line, records to the cache memory of which node, and in what state, the cache line has been transferred.
As a result of the directory search, data read out from the cache memory of the node so found, or from the main memory, is transferred to the CPU that issued the request.
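The directory lookup described above can be illustrated with a minimal sketch. It is not the Origin's actual protocol: the names `DirectoryEntry`, `HomeNode`, `read_line` and the three line states are hypothetical, chosen only to show how a per-line directory decides whether data comes from another node's cache or from main memory.

```python
from dataclasses import dataclass, field
from enum import Enum


class LineState(Enum):
    # Hypothetical states; a real protocol defines more.
    UNCACHED = "uncached"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


@dataclass
class DirectoryEntry:
    state: LineState = LineState.UNCACHED
    holders: set = field(default_factory=set)  # ids of nodes caching the line


@dataclass
class HomeNode:
    node_id: int
    memory: dict = field(default_factory=dict)     # address -> data
    directory: dict = field(default_factory=dict)  # address -> DirectoryEntry


def read_line(home, caches, requester, address):
    """Serve a read routed to the home node; return (data, source)."""
    entry = home.directory.setdefault(address, DirectoryEntry())
    if entry.state is LineState.EXCLUSIVE:
        # Another node owns the latest copy: fetch it from that cache.
        owner = next(iter(entry.holders))
        data = caches[owner][address]
        home.memory[address] = data  # write the line back to main memory
        source = f"cache of node {owner}"
    else:
        data = home.memory[address]
        source = "main memory"
    # After a read, the line is shared by the requester as well.
    entry.state = LineState.SHARED
    entry.holders.add(requester)
    return data, source
```

Every request in this sketch passes through the home node's directory before any data moves, which is the extra reference whose overhead the later sections discuss.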
A prior art crossbar having an arbiter for determining a processing sequence of memory access requests for cache coherency control is shown in U.S. Pat. No. 6,011,791. It is generally known that a crossbar performs parallel data transfer and achieves higher throughput than a bus.
However, there is a possibility that the memory access sequence may be partially reversed to disturb the cache coherency.
The cache coherency is maintained by the directory system in the Origin; whereas, the cache coherency is controlled by uniquely sequencing memory access requests issued from CPUs in an arbiter which performs logically unique operation within the crossbar in such a multiprocessor system as shown in U.S. Pat. No. 6,011,791.
The sequenced requests are transmitted to the CPUs, main memory and I/O controller through a selector within the crossbar. A system for realizing snooping cache by providing the crossbar with a function of sequencing the memory access requests to broadcast the memory access requests to all the CPUs in this way, will be referred to as the multicasting system, hereinafter.
Further, a crossbar having a function of sequencing memory access requests to realize the multicasting system will be referred to as the multicasting crossbar, hereinafter.
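The multicasting system can be sketched as follows, under stated assumptions. The class `MulticastCrossbar`, its method names, and the first-come first-served arbitration policy are hypothetical and are not taken from U.S. Pat. No. 6,011,791; the sketch only shows the property the text relies on, namely that a single arbiter assigns one global order and every CPU snoops the requests in that same order.

```python
from collections import deque


class MulticastCrossbar:
    """Toy model: one arbiter sequences requests and broadcasts them."""

    def __init__(self, cpu_ids):
        self.pending = deque()                        # requests awaiting arbitration
        self.snoop_logs = {cpu: [] for cpu in cpu_ids}  # what each CPU has snooped
        self.next_seq = 0                             # global sequence number

    def submit(self, cpu_id, op, address):
        """A CPU issues a memory access request to the crossbar."""
        self.pending.append((cpu_id, op, address))

    def arbitrate(self):
        """Assign a unique order (FIFO here, an assumption) and broadcast."""
        while self.pending:
            cpu_id, op, address = self.pending.popleft()
            sequenced = (self.next_seq, cpu_id, op, address)
            self.next_seq += 1
            # Broadcast: every CPU, including the requester, snoops the request.
            for log in self.snoop_logs.values():
                log.append(sequenced)
```

Because all snoop logs are filled from the same sequenced stream, no two CPUs can observe two requests in opposite orders, which is what keeps snooping caches coherent without a directory.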
SUMMARY OF THE INVENTION
However, the multiprocessor system disclosed in U.S. Pat. No. 6,011,791 is not arranged so that a small system can be formed by taking a small number of the nodes used in a large system and directly connecting them, as in the Origin.
In the case of a multiprocessor system performing the aforementioned directory type cache coherency control, there is in general a problem that the number of transfers between LSIs increases with the number of directory references, thus increasing the memory latency.
An increase in the amount or size of main memory also causes an increase in the amount of directory. Accordingly, a large capacity of main memory to be mounted requires a large capacity of directory memory, thus disadvantageously involving high costs.
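The proportionality between main memory size and directory size can be made concrete with a small calculation. The function name and the example figures (a 128-byte cache line and a 16-bit directory entry) are illustrative assumptions, not values stated in this document; the sketch assumes one directory entry per cache line.

```python
def directory_bytes(memory_bytes, line_bytes, entry_bits):
    """Directory capacity needed when each cache line has one entry."""
    lines = memory_bytes // line_bytes  # number of cache lines in main memory
    return lines * entry_bits // 8      # total directory size in bytes


# Illustrative figures only: 1 GiB and 2 GiB of main memory.
one_gib = directory_bytes(2**30, line_bytes=128, entry_bits=16)
two_gib = directory_bytes(2**31, line_bytes=128, entry_bits=16)
print(one_gib // 2**20, "MiB of directory for 1 GiB of main memory")
print(two_gib // 2**20, "MiB of directory for 2 GiB of main memory")
```

Under these assumptions, doubling the main memory doubles the directory memory, which is the cost growth a directory-free scheme avoids.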
It is therefore an object of the present invention to provide a node controller which can eliminate the above problems in the prior art and also to provide a multiprocessor system of a main-memory shared type using such a node controller.
Another object of the present invention is to provide a node controller which can use nodes common to both a small multiprocessor system having a small number of nodes and a large multiprocessor system having a large number of nodes using a crossbar, and also to provide a multiprocessor system of a main memory shared type which uses such a node controller.
A further object of the present invention is to provide a node controller which can reduce development costs by using nodes common to both a small system having a small number of nodes and a large system having a large number of nodes as in the aforementioned Origin, and also to provide a multiprocessor system of a main memory shared type using such a node controller.
Yet another object of the present invention is to provide a node controller which allows a plurality of nodes to be directly connected to form a small system and can omit an external crossbar, and also to provide a multiprocessor system of a main memory shared type using such a node controller.
A still further object of the present invention is to provide a node controller which can reduce an increase in memory latency caused by an overhead of directory reference by employing a cache coherency control system not using directory and can avoid increase of costs of devices other than a main memory even when the size of the main memory is increased, and also to provide a multiprocessor system of a main memory shared type using such a node controller.
In accordance with an aspect of the present invention, in order to attain the above objects, there is provided a multiprocessor system of a main memory shared type having a plurality of nodes mutually connected by signal lines, each of the plurality of nodes including:
a CPU having a cache memory;
a main memory; and
a node controller for performing communication control between the CPU, the main memory and nodes other than its own node,
the node controller having:
a communication controller for controlling communication interface between the plurality of nodes;
a crossbar for determining a processing sequence of memory access requests to the main memories in the plurality of nodes issued from at least one of the plurality of nodes; and
a crossbar controller means for validating or invalidating the crossbar.
With such an arrangement, nodes having common structures can be used in both a small multiprocessor system having a small number of nodes and a large multiprocessor system having a large number of nodes, thus eliminating the need for developing nodes differently for the small and large multiprocessor systems and reducing its development costs. In this case, the crossbar for determining the processing sequence of memory access requests is only required to be validated, while the crossbars for not determining the sequence are only
Akashi Hideya
Hamanaka Naoki
Shonai Toru
Tanaka Tsuyoshi
Tsushima Yuji
Hitachi , Ltd.
Ho Thang
Mattingly Stanger & Malur, P.C.
Portka Gary