Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2000-02-25
2002-10-29
Gossage, Glenn (Department: 2187)
C711S122000
active
06473836
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a cache memory control apparatus for controlling hierarchical cache memories disposed between a main storage unit and a processing unit, which executes various processes using data stored in the main storage unit, and a computer system equipped with the cache memory control apparatus. More particularly, the invention is directed to a cache memory control apparatus for use in a computer system having a prefetch function of fetching speculative data, which is inclined to become necessary in the processing unit, from the main storage unit into the hierarchical cache memories in advance, and also directed to a computer system equipped with the cache memory control apparatus.
2. Description of the Related Art
A conventional computer system (data processing machine), as shown in FIG. 5 of the accompanying drawings, generally comprises a main storage unit (hereinafter called MSU) 12 storing programs and various data to be processed by the programs, and a central processing unit (hereinafter called CPU) 11 for executing various processes using the data stored in the MSU 12.
Recently, with the increasing throughput of the CPU 11 and the increasing capacity of the MSU 12, the data processing speed of the CPU 11 has become much faster than the speed of access to the MSU 12. Regarding the CPU 11 and the MSU 12 as the data consumption side and the data supply side, respectively, a shortage of data supply tends to occur, so that the CPU 11 spends most of its processing time waiting for data from the MSU 12, lowering the effective throughput of the CPU 11 even though its processing speed is increased.
As a solution, it has been customary to minimize the apparent access time of the MSU 12, as viewed from the CPU 11, by placing a cache memory, which is smaller in capacity and higher in processing speed than the MSU 12, either inside of or operationally near the CPU 11, and by using the cache memory to absorb the access delay of the MSU 12 with respect to the cycle time of the CPU 11.
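The benefit just described is conventionally quantified by the average memory access time (AMAT). The short sketch below is a standard textbook illustration rather than part of the original disclosure, and the cycle counts are assumed purely for the example.

```python
# Average memory access time (AMAT) for one cache level placed in front
# of a slow main storage unit:
#     AMAT = hit_time + miss_rate * miss_penalty
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# Assumed figures: a 2-cycle cache, a 5% miss rate, a 100-cycle MSU
# access.  The apparent access time seen by the CPU drops from the
# 100-cycle MSU latency to 7 cycles.
print(amat(hit_time=2, miss_rate=0.05, miss_penalty=100))  # 7.0
```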
This cache memory usually assumes only a single level or class (of the hierarchy), in the form of a block of plural words, between the MSU 12 and a register 11b in the CPU 11. Alternatively, however, if the difference between the access time of the MSU 12 and the cycle time of the CPU 11 is considerably large, one or more additional levels or classes (blocks of plural words) are placed between the MSU 12 and the register 11b in the CPU 11. In the example shown in FIG. 5, a primary cache memory 13 and a secondary cache memory 14 are placed between the MSU 12 and the register 11b, which is coupled to an arithmetic unit 11a, in the CPU 11, to form a two-level or two-class cache memory hierarchy. Both the primary and secondary cache memories 13, 14 are disposed inside the CPU 11.
Specifically, the primary cache memory 13 is disposed hierarchically near the arithmetic unit 11a, while the secondary cache memory 14 is disposed hierarchically near the MSU 12. Generally, the secondary cache memory 14 is set to be larger in storage capacity than the primary cache memory 13; that is, in a multi-cache-memory hierarchy, the nearer a cache memory is disposed to the arithmetic unit 11a, the smaller its storage capacity is set.
In the computer system equipped with the foregoing cache memories 13, 14 with the two-level or two-class hierarchy, if the CPU 11 needs a certain kind of data D, first the CPU 11 discriminates whether the data D is stored in the primary cache memory 13. If the same data D is stored in the primary cache memory 13 (if a “cache hit” results with respect to the primary cache memory 13), the CPU 11 reads the data D from the primary cache memory 13 without having to access either the secondary cache memory 14 or the MSU 12.
On the contrary, if the data D is not stored in the primary cache memory 13 (if a “cache miss” results with respect to the primary cache memory 13), the CPU 11 discriminates whether the data D is stored in the secondary cache memory 14. If a cache hit then results with respect to the secondary cache memory 14 (if information retrieval has taken place successfully with respect to the secondary cache memory 14), the CPU 11 reads a data block containing the data D from the secondary cache memory 14 and writes the data block into the primary cache memory 13, whereupon the CPU 11 reads the data D from the primary cache memory 13.
Further, if the data D is not stored even in the secondary cache memory 14 (if a cache miss results with respect to the secondary cache memory 14), the CPU 11 reads a data block containing the data D from the MSU 12 and writes the data block into the primary and secondary cache memories 13, 14, whereupon the CPU 11 reads the data D from the primary cache memory 13.
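The hit/miss sequence of the last three paragraphs can be sketched as a small simulation. Dictionaries stand in for the cache tag and data arrays; the block size and the fill policy (copying a missed block into both levels) are illustrative assumptions, not details of the patented apparatus.

```python
BLOCK_SIZE = 4  # words per data block (illustrative choice)

def block_of(addr):
    """Address of the first word of the block containing addr."""
    return addr - addr % BLOCK_SIZE

def read(addr, l1, l2, msu):
    """Return the word at addr, searching L1, then L2, then the MSU.

    l1 and l2 map block addresses to blocks (lists of words); msu maps
    every word address to its value.  On a miss, the containing block
    is copied into the nearer level(s) before the word is read from L1.
    """
    blk = block_of(addr)
    if blk in l1:                 # hit in the primary cache
        pass
    elif blk in l2:               # primary miss, secondary hit:
        l1[blk] = l2[blk]         # copy the block into the primary cache
    else:                         # miss in both levels: go to the MSU
        data = [msu[a] for a in range(blk, blk + BLOCK_SIZE)]
        l2[blk] = data            # fill the secondary cache ...
        l1[blk] = data            # ... and the primary cache
    return l1[blk][addr - blk]    # read the word from the primary cache
```

Reading one address misses all the way to the MSU; a later read of a neighbouring address in the same block then hits in the primary cache.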
As mentioned above, if a cache miss results with respect to the primary cache memory 13 or the secondary cache memory 14, the data D must be read from the secondary cache memory 14 or the MSU 12, respectively, which takes more time. In the meantime, although recent computer systems have sharply increased the clock frequency of the CPU 11, the performance of the MSU 12, typically in the form of DRAM (dynamic random access memory), has not kept up with the increased throughput of the CPU 11. As a result, the MSU 12 is effectively located far from the CPU 11: since, as previously mentioned, the difference between the access time of the MSU 12 and the cycle time of the CPU 11 is considerably large, the throughput of the CPU 11 would increasingly be impaired by such unsuccessful accesses of the cache memories 13, 14.
In order to avoid the penalty for unsuccessful access of the cache memories 13 and 14, it has been a common practice to fetch necessary data from the MSU 12 into the cache memories 13, 14 prior to the arithmetic processing.
For this purpose, the CPU 11 issues, in addition to a loading command to fetch data from the MSU 12 into the register 11b, a dedicated-to-loading command to fetch the data from the MSU 12 into the primary and secondary cache memories 13, 14, but not into the register 11b. The CPU 11 can then execute a separate process (the arithmetic process in the illustrated example) without managing or monitoring the state of execution of the dedicated-to-loading command, thus leaving any process associated with the dedicated-to-loading command to the primary and secondary cache memories 13, 14. This dedicated-to-loading command is also called a “prefetch” command because of its function.
Now, assuming that the arithmetic unit 11a performs consecutive arithmetic processes in which data of the first to N-th items are substituted into a predetermined equation, one item per arithmetic process, the CPU 11 issues a prefetch command to fetch the (i+k)-th item data into the primary cache memory 13 prior to execution of the arithmetic process on the i-th item data, with the result that the arithmetic unit 11a executes each arithmetic process without a cache miss.
As a result, the (i+k)-th item data, which is likely to become necessary in a forthcoming arithmetic process succeeding the arithmetic process of the i-th item data (by the arithmetic unit 11a) by k steps, is fetched into the primary and secondary cache memories 13, 14 in parallel with the arithmetic process of the i-th item data. Therefore, by the time the CPU 11 fetches the (i+k)-th item data from the primary cache memory 13 into the register 11b for the arithmetic process coming k steps later, the (i+k)-th item data will already exist in the primary cache memory 13, so that a cache miss can be avoided.
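The prefetch-distance scheme of the last two paragraphs can be illustrated with a toy simulation. The model, a set of resident items plus the assumption that a prefetch issued k steps ahead always completes in time, is an illustration and not part of the disclosure.

```python
def run_with_prefetch(n, k):
    """Process n items, issuing a prefetch k steps ahead of each one.

    The set `cache` records which items are resident in the cache
    hierarchy; cache.add(i + k) stands in for the dedicated-to-loading
    (prefetch) command, which fills the caches without touching the
    register.  Returns how many arithmetic steps still missed.
    """
    cache = set()
    misses = 0
    for i in range(n):
        if i + k < n:
            cache.add(i + k)   # prefetch the (i+k)-th item in parallel
        if i not in cache:     # demand miss: the CPU stalls and fetches
            misses += 1
            cache.add(i)
    return misses

# Only the first k items miss; every later item was prefetched in time.
print(run_with_prefetch(100, 4))  # 4
```

Without prefetching (k larger than n, so no prefetch is ever issued), every item misses; with a modest distance k, only the first k items do.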
Fujitsu Limited
Staas & Halsey, LLP