Shared memory multiprocessor performing cache coherency

Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area

Reexamination Certificate


: 2000-02-18
: 2003-04-08
: Nguyen, T. V. (Department: 2187)
: Electrical computers and digital processing systems: memory
: Storage accessing and control
: Shared memory area

: C711S141000, C711S147000, C711S149000, C711S169000
: Reexamination Certificate
: active
: 06546471
: ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to a parallel computer system of the shared memory type which is used for information processors, especially personal computers (PCs), workstations (WSs), server machines, etc., and more particularly to a control method for a main memory.
In recent years, the architecture of a multiprocessor of the shared memory type (SMP) has spread to the host models of PCs and WSs, server machines, etc. This architecture has become an important feature for enhancing performance in multiprocessors that share main memory, for example multiprocessors having a large number of processors, such as 20 to 30.
A shared bus scheme is extensively used as a method of constructing a shared memory multiprocessor. With the bus scheme, however, the throughput of the bus becomes a bottleneck, and hence the number of connectable processors is limited to about 8 at most. Accordingly, the bus scheme is not suitable as a method of connecting a large number of processors.
Conventional methods of constructing shared memory multiprocessors each having a large number of processors connected therein are broadly classified into two schemes.
One of them is the crossbar switch architecture, and it is disclosed in, for example, “Evolved System Architecture” (Sun World, January 1996, pp. 29-32). With this scheme, boards each of which has a processor and a main memory are connected by a high speed crossbar switch so as to maintain the cache coherency among the processors. This scheme has the merit that the cache coherency can be rapidly maintained.
The scheme, however, has the demerit that, since a transaction for maintaining the cache coherency is broadcast to all of the processors, traffic on the crossbar switch is very high and causes a bottleneck in performance. Another demerit is that, since the high speed switch is required, a high cost is incurred. Further, since the transaction for maintaining the cache coherency must be broadcast, it is difficult to realize a system having a very large number of processors, and the number of processors is limited to ten to twenty.
In the ensuing description, this scheme shall be called the switch type SMP (Symmetrical MultiProcessor).
The other scheme provides a multiprocessor employing a directory based protocol, and it is disclosed in, for example, “The Stanford FLASH Multiprocessor” (The 21st Annual International Symposium on COMPUTER ARCHITECTURE, Apr. 18-21, 1994, Chicago, Ill., pp. 302-313). With this scheme, a directory, which is a bitmap indicating the processors in whose caches the data line is cached, is provided for every data line of the main memory, whereby a transaction for maintaining the cache coherency among the processors is sent only to the pertinent processors. Thus, traffic on the switch can be noticeably reduced, and the hardware cost of the switch can be curtailed.
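The difference between the two schemes can be illustrated with a minimal Python sketch. It is not taken from the cited papers: the `Directory` class, its method names, the 16-processor system and the line address `0x1000` are all assumptions made for illustration. The sketch keeps one sharer bitmap per data line and compares the processors that receive a coherency transaction under the directory based protocol with those reached by a broadcast.

```python
class Directory:
    """Hypothetical directory: one sharer bitmap per data line.

    Bit p of a bitmap is 1 when processor p holds the line in its cache.
    """

    def __init__(self, num_processors):
        self.num_processors = num_processors
        self.sharers = {}  # line address -> integer bitmap

    def record_read(self, line, proc):
        # A read caches the line at `proc`, so set its bit.
        self.sharers[line] = self.sharers.get(line, 0) | (1 << proc)

    def invalidation_targets(self, line, writer):
        # Only processors whose bit is set (other than the writer) are contacted.
        bitmap = self.sharers.get(line, 0)
        return [p for p in range(self.num_processors)
                if p != writer and (bitmap >> p) & 1]

    def record_write(self, line, writer):
        # Invalidate the other copies, then leave the writer as the only sharer.
        targets = self.invalidation_targets(line, writer)
        self.sharers[line] = 1 << writer
        return targets


def broadcast_targets(num_processors, writer):
    # Switch type SMP: the transaction goes to every other processor.
    return [p for p in range(num_processors) if p != writer]


directory = Directory(num_processors=16)
directory.record_read(0x1000, 2)
directory.record_read(0x1000, 5)
directory_msgs = directory.record_write(0x1000, 5)
broadcast_msgs = broadcast_targets(16, 5)
print(directory_msgs)       # [2]
print(len(broadcast_msgs))  # 15
```

In this assumed example the directory sends one transaction where the broadcast sends fifteen, at the price of looking up the bitmap before any transaction can be submitted.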
Since, however, the contents of the directory placed in the main memory must inevitably be checked when submitting the transaction for maintaining cache coherency, the scheme has the demerit that the access latency is lengthened. Further, the scheme has the demerit that the memory needed to hold the directory adds to the cost.
As stated above, the switch type SMP and the directory based protocol each have merits and demerits. In general, with the switch type SMP, the hardware scale becomes larger and the scalability for an increased number of processors is inferior, but a higher performance can be achieved. Accordingly, a system in which the number of PCs, server machines, etc. is not very large (up to about 30) is better realized by using the switch type SMP.
Another problem involved in constructing a shared memory multiprocessor is the problem of reliability. Each of the shared memory multiprocessors in the prior art has a single OS (Operating System) for the whole system. This method can manage all the processors in the system with the single OS, and therefore has the advantage that a flexible system operation (such as load balancing) can be achieved. In the case of connecting a large number of processors by the shared-memory multiprocessor architecture, however, this method has the disadvantage that the reliability of the system degrades.
In a cluster-system server wherein a plurality of processors are connected by a network, or in MPPs (Massively Parallel Processors), individual nodes have different OSs, so that even when a system crash occurs on one node because of, for example, an OS bug, the system goes down only at the corresponding node. In contrast, in the case of controlling the whole shared-memory multiprocessor system by the single OS, when a system crash occurs on a certain processor because of a system bug or the like, the OS itself goes down, and hence all the other processors are affected.
A method wherein a plurality of OSs are run in the shared memory multiprocessor for the purpose of avoiding the above problem is disclosed in “Hive: Fault Containment for Shared-Memory Multiprocessors” (15th ACM Symposium on Operating Systems Principles, Dec. 3-6, 1995, Copper Mountain Resort, Colo., pp. 12-25).
With this method, the shared memory multiprocessor conforming to the directory based protocol is endowed with the following two facilities:
(1) The whole system is divided into a plurality of cells (partitions), and independent OSs are run in the respective partitions. The system has a single address space, and the respective OSs take charge of different address ranges.
(2) A bitmap which expresses the write-accessible processors is provided for every page of the main memory, and write access is allowed only for the processors each having a value of “1” in the bitmap.
More specifically, in a case where data is to be written into the main memory of each processor (in a case where the data is to be cached in compliance with a “Fetch & Invalidate” request, or in a case where a “Write Back” request has arrived), the contents of the bitmap are checked, and only the access from the processor having the value of “1” in the bitmap is allowed.
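The check in facility (2) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `WriteProtection` class, the `handle_request` function, the 4096-byte page size, the plain "Fetch" read request and the "accepted"/"rejected" results are hypothetical names and values, not details given in the cited paper.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# Request kinds that write into main memory and therefore trigger the check.
WRITE_REQUESTS = {"Fetch & Invalidate", "Write Back"}


class WriteProtection:
    """Hypothetical per-page bitmap of write-accessible processors."""

    def __init__(self):
        self.write_bitmap = {}  # page number -> integer bitmap

    def allow(self, page, proc):
        # Set bit `proc` so that processor may write to `page`.
        self.write_bitmap[page] = self.write_bitmap.get(page, 0) | (1 << proc)

    def may_write(self, address, proc):
        # A processor may write only if its bit is 1 in the page's bitmap.
        page = address // PAGE_SIZE
        return bool((self.write_bitmap.get(page, 0) >> proc) & 1)


def handle_request(protection, kind, address, proc):
    # Only write-type requests are checked against the bitmap.
    if kind in WRITE_REQUESTS and not protection.may_write(address, proc):
        return "rejected"
    return "accepted"


protection = WriteProtection()
# Assumed layout: page 0 belongs to a partition made of processors 0 and 1.
protection.allow(0, 0)
protection.allow(0, 1)

print(handle_request(protection, "Write Back", 0x0100, 1))          # accepted
print(handle_request(protection, "Write Back", 0x0100, 4))          # rejected
print(handle_request(protection, "Fetch & Invalidate", 0x0100, 4))  # rejected
print(handle_request(protection, "Fetch", 0x0100, 4))               # accepted
```

Processor 4 stands for a processor of another partition: its write-type requests to page 0 are refused, so a crashed partition cannot overwrite that page.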
Owing to the above facility (1), even when the OS of any partition has crashed, it is possible to avoid the other partitions going down. Further, owing to the provision of the facility (2), the processor of the partition having crashed due to a bug can be prevented from destroying data which the other partitions use.
As thus far explained, the reliability of the system can be sharply enhanced by dividing the interior of the shared memory multiprocessor into the plurality of partitions.
SUMMARY OF THE INVENTION
In the case of constructing a switch type SMP and further dividing the interior of the SMP into partitions, as stated in the Prior Art, there are three problems to be mentioned below.
(A) Slow Access to Local Main Memory
In a case where the processor accesses the main memory included in the same board, ideally it ought to be accessible at high speed without passing through the crossbar switch.
In actuality, however, the transaction for maintaining the cache coherency must be submitted to the other processors so as to check the caches of the other processors (hereinbelow, this processing shall be called the “CCC: Cache Coherent Check”). This is because there is a possibility that the copy of the accessed data has been buffered in the cache of another processor.
In the case where the data has actually been buffered in the cache of any other processor, the CCC is required. However, in a case where the accessed data is local data that has never been accessed from any other processor, there is no possibility that the corresponding data has been buffered in the cache of any other processor, and the CCC could be omitted.
Therefore, the wasteful CCC incurs not only the drawback that the access latency is prolonged, but also the drawback that the traffic in the switch is enlarged.
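The size of this overhead can be illustrated with a deliberately simplified Python model. It only counts transactions and is not a mechanism described in this specification: the function name, the access trace, the 8-processor system and the omission of evictions and invalidations are all assumptions. A CCC is counted as wasteful when no other processor has ever accessed the line.

```python
def simulate_broadcast_ccc(accesses, num_processors):
    """Count CCC transactions in a broadcast (switch type) scheme.

    accesses: list of (processor, line) pairs in program order.
    Returns (total_msgs, wasted_msgs). Simplification: a line counts as
    possibly cached by every processor that has ever accessed it.
    """
    holders = {}  # line -> set of processors that have accessed it
    total_msgs = 0
    wasted_msgs = 0
    for proc, line in accesses:
        msgs = num_processors - 1  # the CCC goes to every other processor
        total_msgs += msgs
        others = holders.get(line, set()) - {proc}
        if not others:
            # No other processor can hold a copy, so this CCC was unnecessary.
            wasted_msgs += msgs
        holders.setdefault(line, set()).add(proc)
    return total_msgs, wasted_msgs


# Hypothetical trace: lines "A" and "B" are local to processor 0,
# while line "C" is touched by processor 1 and then by processor 0.
accesses = [(0, "A"), (0, "A"), (0, "B"), (1, "C"), (0, "C")]
total, wasted = simulate_broadcast_ccc(accesses, num_processors=8)
print(total, wasted)  # 35 28
```

In this assumed trace, 28 of the 35 CCC transactions could not have found a copy in any other cache; each of them still adds latency to the access and traffic to the switch.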
In the directory based protocol, on the other hand, the wasteful CCC does not occur because the directory makes it possible to tell which processors have a copy of the data line in the cache. A

Affiliated with

Akashi Hideya (Inventor)
Okada Yasuyuki (Inventor)
Okazawa Koichi (Inventor)
Okochi Toshio (Inventor)
Shonai Toru (Inventor)

Also associated with

Hitachi, Ltd. (Corporate Assignee)
Mattingly Stanger & Malur, P.C. (Law Firm)
Nguyen T. V. (Examiner)
