Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2000-03-03
2003-11-18
Nguyen, T. V. (Department: 2187)
C711S113000, C711S130000, C711S131000, C711S147000, C711S148000, C711S150000
active
06651139
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to a multiprocessor system that includes plural processors and a shared memory shared by the plural processors.
2. Description of the Related Art
Although processors have made rapid progress in processing speed with recent technological improvements, memories and buses have improved their operating speeds only slowly by comparison. This creates a bottleneck in the data transfer rate between the processor and the memory, which affects the performance of the whole computer system. On the other hand, along with the demand for higher speed and more functions in recent computer systems, the multiprocessor system has become an essential system configuration. In a multiprocessor system, processes are executed in parallel with data communication between the processors. A technique that employs a shared memory, referenced by all the processors, for this data communication keeps the system configuration comparatively simple and is widely used. The document titled "Parallel Computers" (written by Hideharu Amano, 1996, published by SHOKODO) describes the shared-memory multiprocessor system in detail. In such a system, plural processors, shared memories, and other I/O devices are connected to a shared bus, and the system executes parallel processing while the processors and I/O devices read and write data in the shared memories as needed. With this type of shared-bus configuration, the transfer bandwidth of the bus, the bus traffic congestion that occurs in the system, and the latency of memory accesses can all influence the throughput of the system.
As a method of solving the bus bottleneck, the multiprocessor system described in Japanese Published Unexamined Patent Application No. Hei 3-176754, shown in FIG. 21, can be cited. In that system, the shared memory is divided into memory modules assigned different, successive address areas, and plural buses, each connected to all the processors and to one of the shared memory modules, are provided to distribute access demands from the processors, thereby reducing bus contention.
Also, as a method of easing the memory-access bottleneck, a widely used approach adds a high-speed local cache to each processor and handles as many memory accesses as possible locally between the processor and its cache, thereby reducing use of the shared bus. With the system thus configured, the probability of having to access the shared memory, which usually takes a long access time, is significantly reduced, and the average memory-access latency is improved.
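The latency improvement described above can be illustrated with simple hit-rate arithmetic. The following sketch (the function name and all timing figures are assumed for illustration; they are not from the patent) computes the average memory-access time seen by a processor with a local cache:

```python
def average_access_time(hit_rate, cache_ns, memory_ns):
    """Average memory latency with a local cache: hits are served at
    cache speed, misses fall through to the slow shared memory."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns

# Assumed example figures: a 10 ns cache, a 100 ns shared memory,
# and a 95 % hit rate.
print(average_access_time(0.95, 10, 100))  # ~14.5 ns on average
```

Even a modest hit rate pulls the average latency close to the cache's speed, which is why most accesses never need to reach the shared bus.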
Furthermore, as disclosed in Japanese Published Unexamined Patent Application No. Hei 8-339353, shown in FIG. 22, a method is proposed that provides plural buses of one type to expand the transfer bandwidth and accesses the shared memory through plural buffers. When writing data to a memory with a long access time, this method temporarily writes the data into a vacant buffer, whereby the processor can move on to the next process. Normally, the processor would have to wait until the data write to the memory completes; by utilizing plural buffer areas, a system can be configured that is not restricted by the slow processing speed of the memory.
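The buffering scheme described above can be sketched as a bounded queue that decouples the fast processor from the slow shared memory. All class and variable names and the capacity figure below are assumptions for illustration, not the patent's mechanism:

```python
from collections import deque

class WriteBuffer:
    """Assumed sketch: a bounded buffer between processor and memory."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()            # queued (address, value) writes

    def post_write(self, address, value):
        """Processor side: returns immediately if a vacant slot exists."""
        if len(self.pending) >= self.capacity:
            return False                  # buffer full: processor must stall
        self.pending.append((address, value))
        return True

    def drain_one(self, shared_memory):
        """Memory side: retire one buffered write at the memory's own pace."""
        if self.pending:
            address, value = self.pending.popleft()
            shared_memory[address] = value

buf = WriteBuffer(capacity=4)
memory = {}
buf.post_write(0x10, 42)                  # processor moves on without waiting
buf.drain_one(memory)                     # memory catches up later
print(memory[0x10])                       # 42
```

The point of the sketch is the asymmetry: `post_write` returns as soon as a vacant buffer slot is found, while the slow memory drains the queue independently.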
The buffer here does not play the same role as the cache memory mentioned in the previous example. According to that published application, the buffer is a place that temporarily stores data to be read from or written to the shared memory, and is meant to bridge the processors and the memory, which operate at different speeds. Thus, the data written into a buffer is in transit from the memory to the processor when the processor demands a read, and from the processor to the memory when the processor demands a write. In other words, these buffers do not directly serve read demands from the processors the way cache memories, which store copies of data in the shared memory, do. The cache memories differ in that they hold copies of shared-memory data and are frequently read and written by the processors. Generally, a cache memory adopts a memory whose access time matches the speed of the processor.
However, each of these methods has problems, described hereafter. As in the example of Japanese Published Unexamined Patent Application No. Hei 3-176754 (refer to FIG. 21), with the method of providing plural buses, the area needed to mount the buses grows as the number of buses increases. The number of pins on the ICs connected to the buses has an enormous influence on the operating speed: the operating speed tends to slow down as the pin count increases, and the mounting and design processes become troublesome. Also, when the operating speed of the buses rises, the EMC noise caused by electromagnetic radiation and the transmission delay can no longer be ignored, creating a new problem.
Further, there is a problem caused by adding local caches, which will be explained with reference to FIG. 23. FIG. 23 shows a configuration of four processor units 101a through 101d connected to a shared bus 103, which enables the processors to access the shared memory 104. The processor units 101a through 101d each have local caches 102a through 102d. Now, suppose that the two processors 101a and 101b read data at the same address in the shared memory 104. The read data are copied to the caches 102a and 102b of the processors 101a and 101b, respectively. Next, suppose that the processor 101a rewrites the cached data as the result of some calculation; this means that the original data at that address in the shared memory 104 has been rewritten. When this happens, the data that the processor 101b has read and stored in the cache 102b is no longer correct. Therefore, in order for the processor 101b to reread the data at the same address, it must fetch the data again from the shared memory 104. In other words, when the processor 101a has rewritten the cached data, this information must be posted to the other caches, which must then delete the stale data. Various protocols for maintaining this type of cache coherency have been proposed. On this subject, the document titled "An Implementation of a Shared Memory Multiprocessor" (written by Norihisa Suzuki, Shigenori Shimizu, and Nagatugu Yamauchi, 1993, published by CORONA PUBLISHING CO., LTD.) gives a detailed description. In general, either the directory method, which holds a table recording the status of the cached data and maintains coherency by referring to that table, or the snoop cache method, which monitors all memory accesses that go through the shared bus and controls the local caches as needed, is employed. However, not only do these methods require very complicated control, but they also expand the mounting area of the hardware.
Furthermore, when plural data buses are provided to solve the bus bottleneck, the control becomes even more complicated. In both the directory method and the snoop cache method, all memory accesses must be monitored in order to know the status of the caches precisely. When there is only one path for the data, it suffices to monitor the memory-access information running through that path; but when there are several, all of them must be monitored and consistency between the memory accesses must be maintained. When there is only one bus, other processors cannot access the same address simultaneously because the bus is occupied, but when there are several
Fujimagari Hiroshi
Funada Masao
Hamada Tsutomu
Kamimura Takeshi
Kobayashi Ken-ichi
Fuji Xerox Co., Ltd.