Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2001-12-21
2004-07-06
Ellis, Kevin L. (Department: 2186)
active
06760810
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to a data processor, and specifically to a data processor, such as a microprocessor or an image processor, that includes an instruction cache.
2. Description of the Related Art
Conventionally, various processors fetch an instruction from an external memory (RAM) and execute it in an execution unit.
FIG. 1 is a block diagram showing a microprocessor of this kind. A microprocessor 10 has an execution unit 11. The execution unit 11 executes an instruction stored in an external RAM 12, which functions as an external memory, by the following procedure. First, the execution unit 11 outputs an instruction address to the external RAM 12 (step 1), and receives the corresponding instruction (step 2). Then, the execution unit 11 analyzes and executes the instruction (step 3). In doing so, the execution unit 11 outputs a data address to the external RAM 12 (step 4) in order to read or write data, and reads or writes the data (step 5). Steps 4 and 5 may be omitted depending on the instruction.
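The five steps above can be sketched as a toy model. This is a minimal illustration, not the patent's implementation: the `ExternalRAM` class, the `execute` function, and the three-operation instruction set (`load`, `store`, `nop`) are hypothetical names introduced only to count how often the external RAM is accessed.

```python
class ExternalRAM:
    """Toy stand-in for external RAM 12 (hypothetical); counts every access."""

    def __init__(self, instructions, data):
        self.instructions = dict(instructions)  # instruction address -> instruction
        self.data = dict(data)                  # data address -> value
        self.accesses = 0

    def fetch(self, address):
        self.accesses += 1
        return self.instructions[address]

    def load(self, address):
        self.accesses += 1
        return self.data[address]

    def store(self, address, value):
        self.accesses += 1
        self.data[address] = value


def execute(ram, instruction_address):
    """Run one instruction following steps 1 to 5 of the text."""
    # Steps 1 and 2: output the instruction address, receive the instruction.
    op, *args = ram.fetch(instruction_address)
    # Step 3: analyze and execute. Steps 4 and 5 happen only for data accesses.
    if op == "load":
        return ram.load(args[0])
    if op == "store":
        ram.store(args[0], args[1])
        return None
    if op == "nop":
        return None
    raise ValueError(f"unknown op: {op}")


ram = ExternalRAM(
    instructions={0x0: ("load", 0x100), 0x4: ("nop",), 0x8: ("store", 0x104, 7)},
    data={0x100: 42},
)
results = [execute(ram, address) for address in (0x0, 0x4, 0x8)]
```

Every instruction costs at least one external access (the fetch), and the `nop` shows steps 4 and 5 being skipped.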
With the configuration of FIG. 1, it is necessary to access the external RAM 12 every time an instruction is executed, causing a problem that the execution of the instruction takes time.
In order to solve this problem, it has been the practice to provide an instruction cache 13 in a microprocessor 10A as shown in FIG. 2. When the instruction cache 13 does not contain a required instruction, the instruction is read from the external RAM 12 according to the procedure of steps 1 and 2 and supplied to the execution unit 11, and the instruction is also stored in the instruction cache 13. When the execution unit 11 requires the same instruction afterwards, the instruction cache 13 receives the instruction address, and the corresponding instruction is read from the instruction cache 13 and supplied to the execution unit 11. Since the time to access the instruction cache 13 is generally shorter than the time to access the external RAM 12, the time until an instruction is read and executed can be shortened.
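The effect of the cache on external accesses can be sketched as follows. This is a deliberately simplified model under stated assumptions: the cache is a plain Python dictionary keyed by instruction address (not the tag RAM and cache RAM structure of FIG. 3), and `fetch`, `stats`, and the three-instruction loop are hypothetical.

```python
def fetch(address, cache, external_ram, stats):
    """Return the instruction at `address`, using the cache when possible."""
    if address in cache:
        stats["hits"] += 1
        return cache[address]
    # Miss: read from external RAM (steps 1 and 2) and keep a copy in the cache.
    stats["misses"] += 1
    instruction = external_ram[address]
    cache[address] = instruction
    return instruction


external_ram = {0x0: "add", 0x4: "subcc", 0x8: "or"}
cache = {}
stats = {"hits": 0, "misses": 0}

# Execute the same three instructions four times, as a loop would.
for _ in range(4):
    for address in (0x0, 0x4, 0x8):
        fetch(address, cache, external_ram, stats)
```

Only the first pass reaches the external RAM; the later passes are served from the cache, which is where the time saving described above comes from.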
FIG. 3 is a block diagram showing the configuration of the instruction cache 13 shown in FIG. 2. The instruction cache 13 has an instruction address register 14, two units of tag RAM 15 and 16, two units of cache RAM 17 and 18, two comparators 19 and 20, a hit/miss checking logic circuit 21, and a selector 22. The tag RAM 15 and the cache RAM 17 are interlocked (system #0), and the tag RAM 16 and the cache RAM 18 are interlocked (system #1).
The instruction cache 13 receives an instruction address from the execution unit 11 shown in FIG. 2, and outputs a corresponding instruction through the selector 22. The instruction address is sent to the external RAM 12, and a corresponding block is received from the external RAM 12. A block is a group of a plurality of instructions specified by continuous addresses.
FIG. 4 shows instructions that are executed sequentially. In FIG. 4, the instructions are specified by continuous instruction addresses except for the branch instruction (branch). The instructions are executed in the order shown by the arrow on the right-hand side of FIG. 4. For example, four instructions specified by continuous addresses are treated as one block.
The instruction address register 14 of FIG. 3 is divided into areas of a block offset, a line address, and a tag address. The two cache RAMs 17 and 18 are accessed by the line address and the block offset, and output a specified instruction. The line address is used in order to limit the area in the cache RAMs 17 and 18 wherein instructions from the external RAM 12 are to be stored. For example, an instruction stored in the addresses xxxx and yyyy of the external RAM 12 is stored in zzz of the cache RAM 17 or 18. If the instruction were allowed to be stored in an arbitrary storage area of the cache RAM 17 or 18, accessing the cache RAMs 17 and 18 would take time.
Here, the instruction read from the external RAM 12 can be stored in the two cache RAMs 17 and 18. In this case, it is said that the degree of association is 2. The cache RAMs 17 and 18 may be configured by discrete memory chips, or by splitting a storage area of one memory chip.
The block offset specifies an instruction within the block selected by a line address. For example, the block beginning with the “add” instruction in the first line of FIG. 4 is specified by the line address, and the instructions “add”, “subcc”, “or”, and “set” are specified by changing the block offset from “00” to “01”, “10”, and “11”.
The tag RAMs 15 and 16 output a tag address in accordance with the line address. Comparators 19 and 20 compare the tag addresses read from the tag RAMs 15 and 16, respectively, with the tag address read from the instruction address register 14 to determine whether they match. When an instruction specified by the line address is stored in the cache RAM 17, the comparison result of the comparator 19 is a match (cache hit). Conversely, when the instruction specified by the line address is stored in the cache RAM 18, the comparison result of the comparator 20 is a match (cache hit).
The hit/miss checking logic circuit 21 controls the selector 22 according to an output of the comparators 19 and 20. If the comparator 19 outputs a match signal, the selector 22 will select the cache RAM 17, and if the comparator 20 outputs a match signal, the selector 22 will select the cache RAM 18. The selected instruction is supplied to the execution unit 11.
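The read path through the two systems can be sketched as below. This is an illustrative model under stated assumptions, not the circuit itself: the class and method names are hypothetical, the sizes (128 lines, four instructions per block) are taken from the FIG. 7 description later in the text, and `None` is used to mark an empty tag entry because the text does not say how empty entries are represented.

```python
NUM_WAYS = 2                 # system #0 and system #1
NUM_LINES = 128              # from the FIG. 7 description
INSTRUCTIONS_PER_BLOCK = 4   # from the FIG. 7 description


class TwoWayInstructionCache:
    """Hypothetical model of tag RAMs 15/16 and cache RAMs 17/18."""

    def __init__(self):
        # tag_ram[way][line] holds the stored tag (None = empty, an assumption).
        self.tag_ram = [[None] * NUM_LINES for _ in range(NUM_WAYS)]
        # cache_ram[way][line] holds one block of instructions.
        self.cache_ram = [
            [[None] * INSTRUCTIONS_PER_BLOCK for _ in range(NUM_LINES)]
            for _ in range(NUM_WAYS)
        ]

    def lookup(self, tag, line, block_offset):
        """Return (way, instruction) on a hit, or (None, None) on a miss."""
        # Comparators 19 and 20: compare each stored tag with the requested tag.
        matches = [self.tag_ram[way][line] == tag for way in range(NUM_WAYS)]
        # Hit/miss checking logic 21 drives selector 22 toward the matching way.
        for way, matched in enumerate(matches):
            if matched:
                return way, self.cache_ram[way][line][block_offset]
        return None, None


cache = TwoWayInstructionCache()
cache.tag_ram[1][5] = 0x12
cache.cache_ram[1][5] = ["add", "subcc", "or", "set"]

hit = cache.lookup(tag=0x12, line=5, block_offset=2)
miss = cache.lookup(tag=0x13, line=5, block_offset=2)
```

In this sketch both tag entries and both cache entries for the line are read on every lookup, and the comparison result only decides which output is used.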
FIG. 5 shows the above-described process where the tag address read from the tag RAM 15 and the tag address read from the instruction address register 14 are in agreement. In the drawing, thick lines indicate flows of the address, the instruction, and a signal and the like used in the read-out operation.
FIG. 6 shows a case where comparison results of both comparators 19 and 20 were negative (cache miss). In the drawing, thick lines indicate flows of the address, the instruction, and the signal used in write-in operation. In this case, the instruction is read from the external RAM 12 and is written into the cache RAM 17 or the cache RAM 18.
FIG. 6 shows an example in which the instruction read is written into the cache RAM 17. Further, the tag address of the instruction address that was missed is written in the tag RAM 15 that corresponds to the cache RAM 17. Further, the instruction stored in the cache RAM 17 is read, and supplied to the execution unit 11 through the selector 22.
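The miss handling just described can be sketched as follows. The sketch rests on several assumptions that are labeled here because the text does not settle them: the whole block is written on a miss, the way to fill is passed in as `fill_way` (the text does not state how cache RAM 17 or 18 is chosen; FIG. 6 uses cache RAM 17, so the default is way 0), and `external_blocks` is a hypothetical mapping from (tag, line) to a block.

```python
NUM_WAYS = 2
NUM_LINES = 128


def make_cache():
    """Hypothetical container for the tag RAMs and cache RAMs."""
    return {
        "tag_ram": [[None] * NUM_LINES for _ in range(NUM_WAYS)],
        "cache_ram": [[None] * NUM_LINES for _ in range(NUM_WAYS)],
    }


def read(cache, external_blocks, tag, line, block_offset, fill_way=0):
    """Return (hit, instruction), filling the cache on a miss."""
    for way in range(NUM_WAYS):
        if cache["tag_ram"][way][line] == tag:
            return True, cache["cache_ram"][way][line][block_offset]
    # Cache miss: both comparisons were negative.
    block = external_blocks[(tag, line)]             # read from external RAM 12
    cache["cache_ram"][fill_way][line] = list(block)  # write into the cache RAM
    cache["tag_ram"][fill_way][line] = tag            # write the missed tag
    # The instruction is then read back from the cache RAM and supplied.
    return False, cache["cache_ram"][fill_way][line][block_offset]


external_blocks = {(0x12, 0): ("add", "subcc", "or", "set")}
cache = make_cache()
first = read(cache, external_blocks, tag=0x12, line=0, block_offset=0)
second = read(cache, external_blocks, tag=0x12, line=0, block_offset=1)
```

The first read misses and fills system #0; the second read of the same block hits without touching `external_blocks`.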
However, there is a problem in the conventional instruction cache described above.
FIG. 7 shows a sequence of instruction reading from the instruction cache 13 configured as shown in FIG. 3. In order to clearly illustrate flows of an address and the like, some of the reference numbers given to the components shown in FIG. 3 are omitted. In FIG. 7, one instruction is made of 4 bytes and 1 block is made of four instructions (that is, 1 block includes 16 bytes). Moreover, the number of lines is 128. The read-out sequence starts at a step (a) and ends with a step (e).
Suppose that an instruction address of “0x00000000” is supplied from the execution unit 11, and stored into the instruction address register 14. In this case, the line address is “0000000” and the block offset is “00.” At the step (a), it is assumed that the tag address of the instruction address is the same as the tag address read from the tag RAM 15. Therefore, the hit/miss checking logic circuit 21 selects the cache RAM 17 by controlling the selector 22. For example, the addition instruction “add” of FIG. 4 is read from the cache RAM 17.
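The field values quoted for step (a) follow from the FIG. 7 parameters and can be checked with a short sketch. The bit layout is an assumption inferred from those parameters rather than stated in the text: the lowest 2 bits address a byte within a 4-byte instruction, the next 2 bits are the block offset (four instructions per block), the next 7 bits are the line address (128 lines), and the remaining upper bits form the tag address. The function names are hypothetical.

```python
BYTE_BITS = 2    # 4 bytes per instruction
OFFSET_BITS = 2  # 4 instructions per block
LINE_BITS = 7    # 128 lines


def split_address(address):
    """Split an instruction address into (tag, line, block_offset)."""
    block_offset = (address >> BYTE_BITS) & ((1 << OFFSET_BITS) - 1)
    line = (address >> (BYTE_BITS + OFFSET_BITS)) & ((1 << LINE_BITS) - 1)
    tag = address >> (BYTE_BITS + OFFSET_BITS + LINE_BITS)
    return tag, line, block_offset


def as_bits(value, width):
    """Format a field as a fixed-width binary string, as written in the text."""
    return format(value, f"0{width}b")


tag_a, line_a, offset_a = split_address(0x00000000)
tag_b, line_b, offset_b = split_address(0x00000004)

fields_a = (as_bits(line_a, LINE_BITS), as_bits(offset_a, OFFSET_BITS))
fields_b = (as_bits(line_b, LINE_BITS), as_bits(offset_b, OFFSET_BITS))
```

Under this layout, address 0x00000000 gives line address “0000000” and block offset “00”, and the next instruction address 0x00000004 keeps the same line address while the block offset becomes “01”.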
Next, the instruction address “0x00000004” is stored in the instruction address register 14 in the step (b). In this case, the block offset is incremented by one from “00”, and it is set to “01”. Since the line address
Satoh Taizoh
Utsumi Hiroyuki
Yamazaki Yasuhiro
Yoda Hitoshi
Ellis Kevin L.
Fujitsu Limited
Staas & Halsey , LLP