Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2001-03-30
2003-07-01
Portka, Gary J (Department: 2188)
Electrical computers and digital processing systems: memory
Storage accessing and control
Hierarchical memories
C711S120000, C711S145000
Reexamination Certificate
active
06587922
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to a technique for configuring a multiprocessor system and for guaranteeing cache consistency in a multiprocessor system that has a plurality of processors and at least one cache memory per processor, and more particularly to a shared-memory multiprocessor system in which a plurality of nodes, each having its own processors, share a memory through a network.
Conventionally, a symmetrical multiprocessor (hereinafter referred to as an SMP), in which a plurality of processors share a memory space, is often used as a computer for simultaneously processing a plurality of requests for a shared resource, such as transaction processing or large-scale database processing. Meanwhile, the operating frequency of recent processors has become very high. To mitigate the performance loss caused by the access time of a main storage (hereinafter referred to as a memory) built from DRAM, which is a large-capacity but slow element, an increasing number of processors are equipped with a small-capacity, high-speed cache memory. In an SMP built from a plurality of processors having such cache memories, consistency between the cache memories must be guaranteed. In a bus-coupled SMP, for example, a memory reference request issued by one processor is monitored by all the other processors, and consistency between cache memories is thereby guaranteed. This method is referred to as the "snoop bus method" (cited reference 1: see "Parallel Computer Architecture", ISBN 1-55860-343-3, pp. 277-301).
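As an illustration of the snoop bus principle described above, the following is a minimal C sketch, not taken from the patent, of how a snooping cache controller might react to a bus request it observes for a line it holds; the MESI-style state names and the snoop() function are hypothetical:

#include <stdio.h>

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } line_state_t;
typedef enum { BUS_READ, BUS_READ_EXCLUSIVE } bus_req_t;

/* Each cache that is NOT the requester runs this when it sees a request
 * for an address it holds; it may have to supply data and downgrade. */
line_state_t snoop(line_state_t state, bus_req_t req, int *must_writeback)
{
    *must_writeback = 0;
    switch (state) {
    case MODIFIED:
        *must_writeback = 1;             /* the newest data lives here */
        return (req == BUS_READ) ? SHARED : INVALID;
    case EXCLUSIVE:
    case SHARED:
        return (req == BUS_READ) ? SHARED : INVALID;
    default:
        return INVALID;                  /* nothing to do */
    }
}

int main(void)
{
    int wb;
    line_state_t s = snoop(MODIFIED, BUS_READ, &wb);
    printf("new state=%d writeback=%d\n", s, wb);   /* SHARED, 1 */
    return 0;
}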
In the snoop bus method, memory reference requests from all processors travel to the memory through the snoop bus, so the snoop bus becomes the bottleneck of the system. To decrease the number of requests each processor issues to the snoop bus per memory access, a "write-back method" is generally used. However, even when the number of processors is increased to enhance the performance of a snoop-bus SMP, the electrical load on the single bus grows, so the maximum number of processors is limited. To accommodate still more processors, a "switch-coupled SMP" is often used, in which the processors are coupled by a crossbar switch or the like instead of a bus. In such a switch-coupled SMP, a "switch broadcasting method" is used, in which a memory reference request from a certain processor is broadcast through the crossbar switch to all processors, in order to retain the key feature of the snoop bus, namely that all processors monitor every memory reference request placed on the bus (cited reference 2: see "Parallel Computer Architecture", ISBN 1-55860-343-3, pp. 555-556).
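The following hypothetical C sketch illustrates why a write-back policy reduces the number of requests placed on the bus: repeated stores that hit a dirty line stay local to the cache, and the bus is used only on a miss or when a dirty victim line is written back. The cache_line_t structure and the bus_transactions counter are illustrative only:

#include <stdbool.h>
#include <stdio.h>

typedef struct {
    unsigned long tag;
    bool valid;
    bool dirty;
    unsigned long data;
} cache_line_t;

static unsigned long bus_transactions = 0;

void store(cache_line_t *line, unsigned long tag, unsigned long value)
{
    if (!(line->valid && line->tag == tag)) {
        if (line->valid && line->dirty)
            bus_transactions++;          /* write back the dirty victim */
        bus_transactions++;              /* fetch the new line */
        line->tag = tag;
        line->valid = true;
    }
    line->data = value;
    line->dirty = true;                  /* repeated hits cause no bus traffic */
}

int main(void)
{
    cache_line_t line = {0};
    for (int i = 0; i < 1000; i++)
        store(&line, 0x40, (unsigned long)i);   /* 1000 stores to one line */
    printf("bus transactions: %lu\n", bus_transactions);   /* 1, not 1000 */
    return 0;
}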
Separately, an I/O device such as a disk device or a network interface exchanges data with a processor through the shared memory. For example, when a file is to be read from the disk device, the processor designates a memory region (referred to as a buffer) for storing the data to be read and activates a DMA write for the disk device. The disk device reads the file recorded on the disk and writes the data into the designated buffer. If cache consistency is not guaranteed for this write from the disk device, the processor may refer to stale data in its cache memory even though the disk device has updated the contents of the memory. To solve this problem, a "snoop-type coherent I/O method", which applies the above-mentioned snoop bus method to memory accesses from the I/O device, or an "explicit flush method", in which the processor explicitly flushes the relevant contents of its cache before activating DMA for the I/O device, is used (cited reference 3: see U.S. Pat. No. 4,713,755, "Cache Memory Consistency Control with Explicit Software Instructions").
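As a rough illustration of the explicit flush style of coherent I/O described above, the following C sketch flushes every cache line covering the buffer before activating the device DMA; cache_flush_line() and dma_start_read() are placeholders for platform-specific primitives, not functions defined by the patent:

#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u

/* Placeholders for platform-specific primitives (assumptions, not patent APIs). */
static void cache_flush_line(volatile void *addr) { (void)addr; /* write back + invalidate one line */ }
static void dma_start_read(void *buf, size_t len) { (void)buf; (void)len; /* program the device's DMA engine */ }

void read_file_via_dma(void *buf, size_t len)
{
    uintptr_t p   = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end = (uintptr_t)buf + len;

    /* 1. Explicitly flush every cache line overlapping the buffer region. */
    for (; p < end; p += CACHE_LINE_SIZE)
        cache_flush_line((void *)p);

    /* 2. Only then activate the DMA write from the disk device. */
    dma_start_read(buf, len);
}

int main(void)
{
    static char buffer[4096];
    read_file_via_dma(buffer, sizeof buffer);
    return 0;
}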
SUMMARY OF THE INVENTION
However, applying the snoop-type coherent I/O method to an SMP that uses the switch broadcasting method raises the following problems. In the switch broadcasting method, a memory reference request from the I/O device must be broadcast by the switch to all processors in order to guarantee the cache consistency of all processors. This broadcast interferes with the processors' own memory reference requests, so processor memory references are delayed and overall system performance decreases. Moreover, the caches of all processors become busy executing the consistency checks triggered by the broadcast, so cache accesses from the processors themselves are held off and cache access latency increases.
Furthermore, when the explicit flush method is applied, the following problems are expected. The explicit flush method exploits the fact that the buffer region the I/O device will access is defined before the processor activates the DMA: to guarantee in advance that no cache holds a copy of the buffer region, a flush request restricted to that region is broadcast to all processors through the switch. A processor that receives the flush request writes the newest contents back to the memory and sets the cache line to "invalid" if the line is in the "updated" state, because the cache then holds the newest contents; if the line is not "updated", it is simply invalidated. As a result, no broadcast for cache consistency is needed for the DMA access from the I/O device. With this method, however, the explicit flush and the memory access by the I/O device must be executed sequentially, so file access time is prolonged and system performance decreases.
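The handling described in the paragraph above can be paraphrased in a short C sketch, assuming a simple per-line state: a line in the "updated" state is written back before being invalidated, while any other valid copy is simply invalidated. All names here are illustrative:

#include <stdio.h>

typedef enum { CACHE_INVALID, CACHE_CLEAN, CACHE_UPDATED } cstate_t;

typedef struct {
    cstate_t state;
    unsigned long addr;
    unsigned long data;
} line_t;

/* Placeholder for the actual write-back path to the memory. */
static void memory_write(unsigned long addr, unsigned long data)
{
    printf("write back %#lx -> memory at %#lx\n", data, addr);
}

void handle_flush_request(line_t *line, unsigned long addr)
{
    if (line->state == CACHE_INVALID || line->addr != addr)
        return;                               /* no copy of this location held */

    if (line->state == CACHE_UPDATED)
        memory_write(line->addr, line->data); /* newest contents go to memory first */

    line->state = CACHE_INVALID;              /* DMA can now proceed coherently */
}

int main(void)
{
    line_t line = { CACHE_UPDATED, 0x2000, 0xabcd };
    handle_flush_request(&line, 0x2000);      /* writes back, then invalidates */
    return 0;
}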
Therefore, an object of the present invention is to provide a multiprocessor system that reduces the broadcasts needed for cache consistency control of memory accesses from an I/O device and thereby implements high-speed I/O processing. To achieve this, a first object of the present invention is to reduce the broadcasts for cache consistency control associated with memory read requests from the I/O device, and a second object is to reduce the broadcasts for cache consistency guarantee associated with memory write requests from the I/O device.
The above and other objects and novel features of the present invention will be apparent from the description and accompanying drawings in this specification.
A summary of a representative aspect of the invention disclosed in the present application is briefly described below.
In order to attain the first object, a multiprocessor of the present invention comprises a first means for recording either an identifier of the cache memory, if that cache memory has an exclusive copy of a cacheable memory location, or an indication that no cache memory has an exclusive copy, wherein, when one of the processors or the I/O device issues a read request for the cacheable memory location, the first means carries out one of: a first step of, if the identifier is recorded, transmitting a message to determine whether the single cache memory holding the exclusive copy has an "updated" copy and, if so, supplying the data from that cache memory, or otherwise reading the data from the memory; and a second step of, if the indication is recorded,
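One possible, highly simplified reading of this first means is a per-block directory entry that records either the identifier of the single cache holding an exclusive copy or the fact that no cache does, so that a read from the I/O device can be serviced with at most one targeted message instead of a broadcast. The following C sketch is illustrative only and does not reproduce the patent's actual structure:

#include <stdbool.h>
#include <stdio.h>

#define NO_EXCLUSIVE_OWNER (-1)

typedef struct {
    int exclusive_owner;   /* cache id, or NO_EXCLUSIVE_OWNER */
} dir_entry_t;

/* Stand-ins for the interconnect and memory paths (assumptions). */
static bool cache_has_updated_copy(int cache_id, unsigned long addr)
{ (void)cache_id; (void)addr; return false; }
static void read_from_cache(int cache_id, unsigned long addr)
{ printf("data supplied by cache %d for %#lx\n", cache_id, addr); }
static void read_from_memory(unsigned long addr)
{ printf("data supplied by memory for %#lx\n", addr); }

void io_read(dir_entry_t *e, unsigned long addr)
{
    if (e->exclusive_owner != NO_EXCLUSIVE_OWNER) {
        /* Query only the recorded owner instead of broadcasting to all caches. */
        if (cache_has_updated_copy(e->exclusive_owner, addr))
            read_from_cache(e->exclusive_owner, addr);
        else
            read_from_memory(addr);
    } else {
        /* No cache can hold an exclusive copy: no coherence message is needed. */
        read_from_memory(addr);
    }
}

int main(void)
{
    dir_entry_t e = { .exclusive_owner = NO_EXCLUSIVE_OWNER };
    io_read(&e, 0x1000);
    return 0;
}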
Hamanaka Naoki
Higuchi Tatsuo
Kawamoto Shinichi