Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2001-05-25
2003-07-01
Ellis, Kevin L. (Department: 2185)
C711S120000
active
06587927
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to a data processor having a cache memory, and more particularly to a software prefetch for efficiently using two types of cache memories and set associative control for most favorably controlling the access of the set associative cache memories. Moreover, the present invention relates to a data processor having a controller for these operations.
BACKGROUND OF THE INVENTION
In general, a computer having a cache memory stores frequently used data in a small-capacity high-speed cache memory as a copy of part of the data stored in a large-capacity low-speed main memory, so that an instruction unit, such as a CPU, can access frequently used data in the cache memory at high speed and accesses the main memory only when the desired data is not present in the cache memory.
However, because the machine cycle of the CPU is significantly shorter than that of the main memory, the penalty in the case of a cache miss (the time until requested data is obtained from the main memory) increases.
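The relationship between machine cycle and miss penalty can be illustrated with a small calculation. This is only a sketch: the function name and the latency figures are hypothetical and are not taken from the specification.

```python
def miss_penalty_cycles(memory_latency_ns, cpu_cycle_ns):
    """Whole CPU cycles spent waiting for main memory on a cache miss."""
    # Round up: a partially used cycle still stalls the CPU for a full cycle.
    return -(-memory_latency_ns // cpu_cycle_ns)


# Hypothetical figures: the main memory latency stays at 100 ns
# while the CPU machine cycle shrinks from 10 ns to 2 ns.
slow_cpu = miss_penalty_cycles(100, 10)
fast_cpu = miss_penalty_cycles(100, 2)
```

With the same main memory, the shorter machine cycle turns one miss into more lost cycles, which is the growth in penalty described above.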
A method called software prefetch for solving the above problem is described in David Callahan et al., “Software Prefetching,” Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 1991, pp. 40-52. In the method described in this first publication, an address is computed by a prefetch instruction before an instruction unit requires data, the address is checked to see if the data indicated by the address is present in the cache memory, and if not, the data is transferred from the main memory to the cache memory. Therefore, it is possible to improve the hit ratio of the cache memory and minimize the penalty, because the prefetch instruction has already stored the data in the cache memory by the time the data is required.
A cache memory comprising two buffers with different purposes, which are used selectively by hardware, is disclosed in Japanese Patent Laid-Open No. 303248/1992.
In this second publication, the cache memory has an S buffer and a P buffer. The S buffer stores data to be accessed frequently over time. The P buffer stores data whose addresses, which the program will reference next, are close to the currently referenced address, i.e. the P buffer stores the array data to be accessed in the array computation. Either one of the two buffers may be used selectively depending on the addressing mode in effect and on the type of register being used for the address calculation.
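The prefetch sequence described above (compute the address early, check the cache, transfer the line if it is absent) can be sketched with a toy model. The class and function names, the 32-byte line size, and the prefetch distance are assumptions made for illustration; the model has no eviction and does not model timing.

```python
class SimpleCache:
    """Toy cache that tracks which lines are present (hypothetical model)."""

    def __init__(self, line_size=32):
        self.line_size = line_size
        self.lines = set()   # line numbers currently held in the cache
        self.misses = 0      # demand misses seen by the instruction unit

    def prefetch(self, address):
        # Check for presence; if absent, bring the line in ahead of use.
        self.lines.add(address // self.line_size)

    def load(self, address):
        # Demand access: a miss here is what stalls the instruction unit.
        line = address // self.line_size
        if line not in self.lines:
            self.misses += 1
            self.lines.add(line)


def array_scan_misses(n_elements, elem_size=8, use_prefetch=False, distance=4):
    """Count demand misses while reading an array element by element."""
    cache = SimpleCache()
    for i in range(n_elements):
        if use_prefetch:
            # Prefetch the element that will be needed `distance` iterations later.
            cache.prefetch((i + distance) * elem_size)
        cache.load(i * elem_size)
    return cache.misses
```

In this model a 64-element scan touches 16 lines, so every line misses once without prefetching, while with prefetching only the very first line is missed on demand.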
In general, a computer stores instructions or data to be frequently called and processed by a processor in a high-speed small-capacity memory, called a cache memory, as a copy of part of the instructions or data stored in a comparatively low-speed large-capacity main memory. Thus, the computer operation speed is increased. Data access systems for such a cache memory include the direct-mapped system and the set associative system.
The direct mapping system is used for accessing a cache memory by directly outputting data or an instruction stored in an address designated by a processor or the like and storing it in the designated address.
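A minimal sketch of a direct-mapped lookup follows. It assumes the usual textbook split of an address into a tag, an index, and an offset; the specification does not give this detail, and the class name and sizes are hypothetical.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each address maps to exactly one line slot."""

    def __init__(self, num_lines=8, line_size=16):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # tag stored in each slot, None if empty

    def access(self, address):
        """Return True on a hit; on a miss, fill the slot and return False."""
        line_number = address // self.line_size
        index = line_number % self.num_lines   # the only slot this line may use
        tag = line_number // self.num_lines    # distinguishes lines sharing a slot
        hit = self.tags[index] == tag
        if not hit:
            self.tags[index] = tag
        return hit
```

Because each line has only one possible slot, two addresses with the same index evict each other, which is the limitation that a set associative arrangement relaxes.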
The set associative memory is used for accessing a plurality of sets of data values or a plurality of instructions (called a data set) in a cache memory having a plurality of sets, each of which comprises a plurality of memories common in allocation of addresses. A plurality of accessed sets of data values or a plurality of accessed instructions required are selected and processed in the processor.
FIG. 17 shows a schematic view of a data processor having a two-set associative cache memory according to a third conventional arrangement. In FIG. 17, symbol 9201 represents a CPU, 9202 to 9217 represent 8-bit output universal memories, 9218 represents an address bus, 9219 represents a 64-bit data bus of a first set, and 9220 represents a 64-bit data bus of a second set. The universal memories are used as data arrays of the two-set associative cache memory. The memories 9202 to 9209 are used as the data array of the first set and the memories 9210 to 9217 are used as the data array of the second set.
When an address designated by the CPU is sent to the memories through the address bus, two sets of data values, each having a width of 64 bits, are output to the CPU through the respective data buses.
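The two-set access can be sketched as follows: both sets are read at the same index, and the entry whose tag matches is selected. The tag comparison, the replacement rule (replace the set not used most recently), the class name, and the sizes are assumptions for illustration and are not taken from FIG. 17.

```python
class TwoSetAssociativeCache:
    """Toy two-set associative cache (a "set" here is one of the two data arrays)."""

    def __init__(self, num_indexes=4, line_size=8):
        self.num_indexes = num_indexes
        self.line_size = line_size
        # sets[s][index] holds (tag, data) or None when empty.
        self.sets = [[None] * num_indexes for _ in range(2)]
        self.victim = [0] * num_indexes  # set to replace next at each index

    def read(self, address, main_memory):
        """Return (data, hit). `main_memory` is a callable: line number -> data."""
        line_number = address // self.line_size
        index = line_number % self.num_indexes
        tag = line_number // self.num_indexes
        # Both sets output their entry for this index; the matching one is selected.
        for s in range(2):
            entry = self.sets[s][index]
            if entry is not None and entry[0] == tag:
                self.victim[index] = 1 - s  # the other set becomes the victim
                return entry[1], True
        s = self.victim[index]
        data = main_memory(line_number)
        self.sets[s][index] = (tag, data)
        self.victim[index] = 1 - s
        return data, False
```

Two lines that share an index can now reside in the cache at the same time, one in each set, at the cost of reading both data arrays on every access.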
To constitute a set associative cache memory having m sets of data values with the width of n bits by using k-bit output memories, “n×m/k” memory chips are necessary in general. In the case of the above-described third conventional arrangement, 16 memories are necessary because n equals 64, m equals 2, and k equals 8.
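The chip count n×m/k can be checked with a short calculation. The function name is hypothetical, and rounding up when k does not divide n×m evenly is an assumption; the text only states the exact-division case.

```python
def memory_chips_needed(n_bits, m_sets, k_bits_per_chip):
    """Chips required for m sets of n-bit data built from k-bit output memories."""
    total_bits = n_bits * m_sets
    # Ceiling division (assumption): a partly used chip still counts as a chip.
    return -(-total_bits // k_bits_per_chip)


# Third conventional arrangement: n = 64, m = 2, k = 8.
chips = memory_chips_needed(64, 2, 8)
```

Doubling either the number of sets or the data width doubles the chip count, which is the cost growth the specification goes on to criticize.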
The method described in the first publication has the problem that an expensive two-port cache memory must be used in order to process, at the same time, a transfer of data from the main memory to the cache memory and a memory referencing instruction sent from the instruction unit. Unless simultaneous processing is carried out, it is possible to use a generally-used one-port cache memory. In this case, however, a lot of processing time is required and the benefit of software prefetch cannot be obtained effectively.
Moreover, the method described in the first publication has the additional problem that, when data, which is read from a cache memory only once and is immediately expelled from the cache memory, is held in the cache memory, the cache memory is filled with useless data and the hit ratio decreases.
These problems frequently occur in a program for handling large-scale data exceeding the capacity of a cache memory.
The arrangement described in the second publication has the problem that, because which of the two cache memories stores a given piece of data is determined by the addressing mode and the register used for address computation, the two cache memories must be used selectively with data characteristics, including data size, taken into account.
It is the first object of the present invention to provide a data processor for solving the above problems, which is capable of quickly and efficiently processing small-capacity frequently accessed data stored in a cache memory and large-scale data exceeding the capacity of the cache memory, and which is also capable of lessening the contamination of the cache memory and improving the hit ratio.
The third conventional arrangement described with reference to FIG. 17 has a problem that, when the number of sets of set associative cache memories increases, or the data bit width increases and the number of memories for constituting the cache memories increases, the cache memory cost increases.
When the number of memories increases, problems occur in that the address bus fan-out, address bus length, and data bus length increase, the cache memory access time increases, and the machine cycle of the entire data processor cannot be shortened.
When the number of sets increases, problems occur in that a number of data buses equivalent to the number of sets is required and the number of pins of the CPU increases. That is, a problem occurs in that it is impossible to meet the restriction on the number of pins of a package in the case of one chip.
It is the second object of the present invention to provide a set associative cache memory comprising a smaller number of memories.
SUMMARY OF THE INVENTION
To achieve the above first object, the present invention involves the use of a first cache memory with a large capacity and one port and a second cache memory with a small capacity and two ports disposed between a main memory and an instruction processing section, and a control section controlled by a prefetch instruction to store data to be frequently accessed in the first cache memory and data to be less frequently accessed in the second cache memory.
Because data to be frequently accessed is stored in the first cache memory, the hit ratio is improved. Moreover, because data to be less frequently accessed is not stored in the first cache memory, the storing of useless data in the first cache memory can be lessened.
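The division of work between the two cache memories can be sketched with a toy model. The names, the capacities, the form of the prefetch hint (a boolean flag), and the oldest-first eviction are all assumptions made for illustration; port counts and timing are not modeled.

```python
from collections import OrderedDict


class LineStore:
    """Toy cache of line numbers that evicts the oldest line when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def __contains__(self, line):
        return line in self.lines

    def insert(self, line):
        if line in self.lines:
            return
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # drop the oldest line
        self.lines[line] = True


class TwoCacheController:
    """Routes prefetched lines to the first or the second cache memory."""

    def __init__(self, first_capacity=64, second_capacity=4, line_size=32):
        self.first = LineStore(first_capacity)    # large: frequently accessed data
        self.second = LineStore(second_capacity)  # small: less frequently accessed data
        self.line_size = line_size

    def prefetch(self, address, frequently_accessed):
        line = address // self.line_size
        if line in self.first or line in self.second:
            return  # already cached, nothing to transfer
        target = self.first if frequently_accessed else self.second
        target.insert(line)

    def load(self, address):
        """Return True if the line is found in either cache memory."""
        line = address // self.line_size
        return line in self.first or line in self.second
```

In this model, streaming a long run of rarely reused lines through the second cache memory leaves the contents of the first cache memory untouched, which corresponds to the reduction in useless data described above.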
Because dat
Hotta Takashi
Kurihara Toshihiko
Osumi Akiyoshi
Saito Koji
Sawamoto Hideo
Antonelli Terry Stout & Kraus LLP
Ellis Kevin L.