Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1999-06-14
2002-08-20
Kim, Matthew (Department: 2186)
C711S154000, C711S145000
active
06438653
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a cache memory in a computing system, and more particularly to the configuration of a cache memory built into a processor and a cache memory provided external to the processor in a computing system.
2. Description of the Related Art
The performance of microprocessors improves every year, owing to higher operating frequencies made possible by finer fabrication processes for large-scale integrated circuits and to greater processing efficiency achieved by improved architectural schemes. With this improvement, the memory system connected to the microprocessor is increasingly required to provide higher access throughput and lower access latency.
Using a cache memory has become a common way of improving the performance of the memory system.
A cache memory is a type of memory that offers high access throughput and short access latency, although its capacity is small compared with that of a main memory. A cache memory may be disposed between a processor and a main memory to temporarily hold a portion of the contents of the main memory. When the processor accesses memory, data held in the cache memory is supplied from the cache, so that the data is delivered with a higher throughput and a lower latency than data supplied from the main memory. As the capacity of the cache memory increases, the target data for a memory access issued by the processor is more likely to exist in the cache memory (a “cache hit”), permitting an improvement in average access throughput and a reduction in average access latency.
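As an illustration only (not part of the patent disclosure), the following sketch models a direct-mapped cache lookup and shows what a "cache hit" means; all names, sizes, and addresses are assumptions made for the example.

```python
# Minimal sketch of a direct-mapped cache lookup (illustrative assumptions only).

LINE_SIZE = 64          # bytes per cache line (assumed)
NUM_LINES = 1024        # more lines -> larger capacity -> more hits (assumed)

# Each entry holds (valid, tag, data); the entry is selected by index bits
# of the address, and the tag disambiguates which block occupies the entry.
cache = [(False, None, None)] * NUM_LINES

def lookup(address):
    """Return (hit, data) for a memory access to `address`."""
    line = (address // LINE_SIZE) % NUM_LINES   # index bits select an entry
    tag = address // (LINE_SIZE * NUM_LINES)    # remaining bits form the tag
    valid, stored_tag, data = cache[line]
    if valid and stored_tag == tag:
        return True, data                       # cache hit: served quickly
    return False, None                          # miss: must go to main memory

def fill(address, data):
    """Install the line containing `address` after a miss."""
    line = (address // LINE_SIZE) % NUM_LINES
    tag = address // (LINE_SIZE * NUM_LINES)
    cache[line] = (True, tag, data)

fill(0x1000, b"example")
print(lookup(0x1000))   # (True, b'example')  -> hit
print(lookup(0x2000))   # (False, None)       -> miss, served by main memory
```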
Recent processors often have a hierarchical cache configuration with both an internal cache, which exhibits high performance but has a small capacity, and an external cache, which is inferior to the internal cache in performance but has a larger capacity. The internal cache is provided in the same integrated circuit as the processor. For this reason, it can operate at a high frequency and can have a plurality of access ports, so that it offers a higher throughput and a lower latency than the external cache. However, because of the limited amount of circuitry that can be accommodated in an integrated circuit, an internal cache with a large capacity is difficult to implement. The external cache, on the other hand, is composed of dedicated or general-purpose memory devices and is connected to an integrated circuit that serves as a processor or a cache controller. It is therefore possible to implement a larger cache capacity than with the internal cache. However, the signal lines that leave the integrated circuit operate at a lower frequency than those inside it, and their number is limited, so the throughput is lower than that of the internal cache. In addition, it takes more time to transmit and receive signals to and from the integrated circuit, resulting in a longer latency. Thus, providing both an internal cache and an external cache allows each to compensate for the shortcomings of the other. In recent years, configurations with a larger number of hierarchical levels have also come into practical use.
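A rough sketch of the two-level lookup order implied above follows; it is not taken from the patent, and the latency figures and structures are made-up placeholders chosen only to show the throughput/latency trade-off between a small internal cache and a larger external cache.

```python
# Illustrative two-level (internal + external) cache lookup with assumed latencies.

INTERNAL_LATENCY = 1     # cycles, small on-chip cache (assumed)
EXTERNAL_LATENCY = 10    # cycles, larger off-chip cache (assumed)
MEMORY_LATENCY = 100     # cycles, main memory (assumed)

internal_cache = {}      # address -> data (small capacity, fast)
external_cache = {}      # address -> data (large capacity, slower)
main_memory = {addr: f"data@{addr:#x}" for addr in range(0, 0x10000, 64)}

def access(address):
    """Return (data, latency_in_cycles) for a load from `address`."""
    if address in internal_cache:
        return internal_cache[address], INTERNAL_LATENCY
    if address in external_cache:
        data = external_cache[address]
        internal_cache[address] = data           # promote into the fast level
        return data, INTERNAL_LATENCY + EXTERNAL_LATENCY
    data = main_memory[address]
    external_cache[address] = data               # fill both levels on a miss
    internal_cache[address] = data
    return data, INTERNAL_LATENCY + EXTERNAL_LATENCY + MEMORY_LATENCY

print(access(0x40))   # first access misses both levels: slow
print(access(0x40))   # second access hits the internal cache: fast
```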
To obtain computing performance that cannot be realized with a single processor, a multi-processor system may be built in which a plurality of such processors are connected through a bus or a network.
In a multi-processor system, the plurality of processors access a common memory (shared memory) to carry out their processing. In such a system, when a processor issues a memory access, it is necessary to ensure the consistency of the caches of all the processors by checking whether the most recent data exists in the cache of each of the remaining processors. This processing is called “cache coherence control (snooping).”
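The sketch below is only an illustration of the snoop traffic described above, not the patent's own mechanism: every memory access by one processor forces a cache-tag check in each of the remaining processors, so tag lookups grow with the number of processors. All names and addresses are assumptions.

```python
# Illustrative snoop (cache coherence control) check across processors.

NUM_PROCESSORS = 4

# Per-processor cache tags: the set of block addresses currently cached.
cache_tags = [set() for _ in range(NUM_PROCESSORS)]
cache_tags[2].add(0x80)     # assume processor 2 currently holds block 0x80

def snoop(requesting_cpu, block_address):
    """Check the cache tags of all *other* processors for `block_address`."""
    holders = []
    for cpu in range(NUM_PROCESSORS):
        if cpu == requesting_cpu:
            continue
        # Each of these lookups is one cache-tag access forced by coherence
        # control, in addition to tag accesses from the local processor.
        if block_address in cache_tags[cpu]:
            holders.append(cpu)
    return holders          # processors that must supply or invalidate the block

print(snoop(0, 0x80))       # [2]  -> coherence action needed at processor 2
print(snoop(0, 0x100))      # []   -> no other cache holds the block
```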
Thus, in a multi-processor system, a cache tag, which holds information on the data stored in the cache memory, is accessed both for memory accesses from its own processor and for cache coherence control from the remaining processors.
JP-A-5-257809 (prior art 1) describes a method of connecting a processor having a hierarchical cache to a system that couples a plurality of processors, in which the external cache is configured according to a direct-map scheme, and two tags are provided outside the processor: a cache tag MTAG for the external cache, and a differential tag PTAG that stores the internal-cache tag information remaining after the information already included in MTAG is excluded. In this example, MTAG and PTAG are checked simultaneously during cache coherence control from the remaining processors, so that the need for cache coherence control for the external cache and for the internal cache is determined at the same time.
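The following is one possible reading of prior art 1, sketched for illustration only (it is not the reference's actual circuitry): the external-cache tag MTAG and a differential tag PTAG, which records only internal-cache tag information not already covered by MTAG, are both kept outside the processor and checked in parallel on every snoop. The contents of the sets are assumptions.

```python
# Illustrative parallel check of MTAG and the differential tag PTAG.

mtag = {0x100, 0x140}        # blocks covered by the external-cache tag (assumed)
ptag = {0x200}               # internal-cache tag info not covered by MTAG (assumed)

def snoop_prior_art_1(block_address):
    """A single snoop consults MTAG and PTAG at the same time."""
    external_hit = block_address in mtag          # external cache may need coherence control
    internal_only_hit = block_address in ptag     # internal cache holds it even though
                                                  # MTAG does not cover it
    return external_hit, internal_only_hit

print(snoop_prior_art_1(0x100))   # (True, False): check the external cache
print(snoop_prior_art_1(0x200))   # (False, True): only the internal cache is involved
print(snoop_prior_art_1(0x300))   # (False, False): no coherence action needed
```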
The fifth embodiment of JP-A-4-230549 (prior art 2) describes a method of coupling each processor to a system bus that interconnects a plurality of processors, in which a cache tag DL2 (called a “directory”), substantially identical to the cache tag for the processor's external cache, is provided. DL2 is checked first during cache coherence control from the remaining processors, and the cache tag for the external cache is accessed only when it is determined that cache coherence control for the external cache is required. In this example, the cache tag for the external cache and the separately provided cache tag DL2 have substantially the same capacity.
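A sketch of the filtering idea attributed to prior art 2 above is given below for illustration; the names, data structures, and counter are assumptions, not the reference's implementation. The duplicate directory DL2 is consulted first on a snoop, and the external cache tag itself is touched only when DL2 indicates that coherence control may be required.

```python
# Illustrative snoop filtering through a duplicate directory DL2.

external_cache_tag = {0x100, 0x140}       # off-chip tag for the external cache (assumed)
dl2_directory = set(external_cache_tag)   # near-copy kept for snoop filtering

external_tag_accesses = 0                 # count how often the slower tag is read

def snoop_prior_art_2(block_address):
    """Check DL2 first; access the external cache tag only on a DL2 hit."""
    global external_tag_accesses
    if block_address not in dl2_directory:
        return False                       # filtered out: external tag left untouched
    external_tag_accesses += 1             # only now pay for the real tag access
    return block_address in external_cache_tag

print(snoop_prior_art_2(0x300), external_tag_accesses)  # False 0
print(snoop_prior_art_2(0x100), external_tag_accesses)  # True 1
```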
In a multi-processor system, the cache tag is accessed both for memory accesses from the processor and for cache coherence control from the remaining processors. In particular, as the number of processors in the system grows, the volume of cache coherence control requests from the remaining processors increases. For this reason, a method of improving the access performance of the cache tag is required. On the other hand, since the external cache memory has a large capacity, it is difficult to implement in the same integrated circuit as the processor or the cache controller.
While it may be possible to place only the external cache tag in the same integrated circuit as the processor or the cache controller in order to improve the access performance of the cache tag, the limited capacity of such an external cache tag would in turn limit the capacity of the external cache memory. This approach is therefore not suitable for a multi-processor system that runs large-scale programs.
With the cache memory configuration described in prior art 1, the external cache tag is accessed each time the processor issues a memory access and each time the remaining processors perform cache coherence control.
When an external cache memory with a large capacity is implemented, the external cache tag must be configured as a memory external to the cache controller, which makes it difficult to realize a higher throughput and a lower latency. With the cache memory configuration described in prior art 2, the external cache tag is accessed when the processor issues a memory access, while the cache tag DL2, which has substantially the same contents as the external cache tag, is accessed when the remaining processors perform cache coherence control. This roughly doubles the processing throughput of the cache tag. However, when an external cache memory with a large capacity is implemented, the external cache tag and the cache tag DL2 are each configured as memories external to the cache controller, so the latency is not reduced and the number of signal lines between the cache controller and the cache tags is approximately doubled.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to improve the throughput of a system by reducing the frequency of accesses to the cache tag memory, so that more cache tag memory accesses can be served than in the prior art.
To achieve the above object,
Akashi Hideya
Kashiyama Masamori
Okochi Toshio
Shonai Toru
Antonelli Terry Stout & Kraus LLP
Bataille Pierre Michel
Hitachi, Ltd.
Kim Matthew
Cache memory control circuit including summarized cache tag...