Electrical computers and digital processing systems: processing – Instruction fetching – Prefetching
Reexamination Certificate
2001-01-25
2004-03-16
Kim, Kenneth S. (Department: 2181)
C712S205000, C711S109000, C711S213000, C711S220000
active
06708266
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to a central processing unit that uses an address first-out method and has an advance fetching function. This invention also relates to a central processing system which adopts the above-mentioned central processing unit. The central processing unit (hereafter, CPU) conducts control, arithmetic operations and the like based on instructions and data read from a main storage device.
BACKGROUND OF THE INVENTION
Generally, the processing speed of the CPU is higher than the speed at which instructions and data can be read from the main storage device. Accordingly, a high-speed cache is provided between the CPU and the main storage device to store instructions which were previously referred to. Also, in a system which does not comprise a cache, an instruction queue storage section is provided, which reads an instruction at an address prior to that of the information read by the CPU.
FIG. 1 is a block diagram showing the important constituent elements of an ordinary central processing system. This central processing system comprises a CPU 11, a cache 12, a bus controller 13 and a memory 14. The CPU 11, the cache 12 and the bus controller 13 are mutually connected through instruction buses 15 (for signals IA, IRW, IWAIT and ID). The memory 14 is connected to the bus controller 13 through a bus 16.
The CPU 11 outputs an instruction address IA and a read request signal IRW to the cache 12 and the bus controller 13. Furthermore, the CPU 11 receives a wait signal IWAIT or instruction data ID from the cache 12 and the bus controller 13.
FIG. 2 is a block diagram showing the important constituent elements of a conventional central processing system that relate to the instruction address of the CPU and the cache. Conventionally, the CPU 11 latches the instruction address IA output from an address adder (not shown) using a latch 17 in the CPU 11 at the timing at which a clock CLK rises, and outputs the instruction address IA to the cache 12. A comparator 18 in the cache 12 compares the instruction address IA output from the CPU 11 with the address of the instruction data ID stored in the cache 12. The comparison result COMP generated by the comparator 18 is used to generate the wait signal IWAIT.
FIG. 3 is a block diagram showing the important constituent elements of the central processing system that relate to the instruction data of the CPU. Conventionally, the instruction data ID output to the CPU 11 from the cache 12, or from the memory 14 by way of the bus controller 13, is latched by the instruction queue storage section 10 in the CPU 11 and then fed to an instruction execution section 19. This instruction queue storage section 10 is constituted to store only one instruction data ID.
The function of the central processing system shown in FIG. 2 and FIG. 3 will now be described. When receiving the read request signal IRW from the CPU 11, the cache 12 compares the instruction address IA output from the CPU 11 with the address of the instruction data ID stored in the cache 12. If the comparison shows that the addresses do not coincide, the cache 12 returns the wait signal IWAIT to the CPU 11. When receiving the wait signal IWAIT, the CPU 11 waits until the instruction data ID of the requested instruction address IA is read from the memory 14 by way of the bus controller 13.
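The hit-or-wait behaviour described above can be sketched as a toy software model. This is a minimal illustration and not the circuit of the specification: the `Cache` class, the `fetch` helper, the dictionary-based storage and the example addresses and instruction names are all assumptions introduced here. Only the signal names IA, IRW, IWAIT and ID come from the text.

```python
class Cache:
    """Toy stand-in for cache 12 (hypothetical; storage is a plain dict)."""

    def __init__(self, lines):
        self.lines = dict(lines)  # instruction address -> instruction data

    def read(self, ia, irw=True):
        """Return (IWAIT, ID) for instruction address IA."""
        if not irw:
            return False, None          # no read request, nothing to do
        if ia in self.lines:            # comparator 18: addresses coincide
            return False, self.lines[ia]
        return True, None               # addresses differ: return IWAIT


def fetch(cache, memory, ia):
    """CPU-side view: use the cache, or wait for memory on IWAIT."""
    iwait, instruction = cache.read(ia)
    if iwait:
        # Stand-in for the read from memory 14 via bus controller 13.
        instruction = memory[ia]
    return instruction, iwait


memory = {0x100: "ADD", 0x104: "SUB"}   # illustrative contents only
cache = Cache({0x100: "ADD"})
print(fetch(cache, memory, 0x100))      # address held in the cache, no wait
print(fetch(cache, memory, 0x104))      # address not held, CPU waits
```

The model ignores timing entirely; it only shows which path supplies the instruction data and when IWAIT is asserted.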
Generally, the instruction address IA supplied to the comparator 18 of the cache 12 is delayed due to the physical distance between the latch 17 of the CPU 11 and the comparator 18. Moreover, in the conventional central processing system stated above, since the instruction address IA is first latched in the CPU 11 and then output to the cache 12, the cache 12 is further delayed in receiving the instruction address IA. As a result, the cache 12 is delayed in returning the wait signal IWAIT to the CPU 11, and the operation speed is disadvantageously reduced.
To prevent this, it has been proposed to increase cache speed by feeding the output of the address adder in the CPU directly to the bus without temporarily latching the output. This is referred to as an address first-out method. FIG. 4 is a block diagram showing the important constituent elements, relating to the instruction address of the CPU and the cache, of a central processing system adopting this address first-out method.
In the address first-out method, an address signal output from the address adder (not shown) in the CPU 21 is output to the cache 22 as an instruction address IA before being input into the latch 27. The instruction address IA output from the CPU 21 is latched by the latch 29 at the timing at which a clock CLK rises and is then supplied to the comparator 28 in the cache 22.
As can be seen, in the address first-out method, the instruction address IA is output to the cache 22 at a timing earlier than that of the conventional system. FIG. 5 shows the operation timing of an instruction read request in the address first-out method central processing system, and the operation timing of an instruction read request in the conventional central processing system which does not adopt the address first-out method. In FIG. 5, legends I1, I2 and I3 denote instruction addresses, legend RD denotes a read request signal and legend WAIT denotes a wait signal. These legends apply to all other figures as well.
Further, the instruction address IA supplied to the comparator 28 is delayed only by as much as the physical distance between the latch 29 in the cache 22 and the comparator 28. Namely, the instruction address IA is not affected by delay due to the physical distance between the CPU 21 and the cache 22. Therefore, the operation speed of the central processing system is prevented from decreasing.
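The difference between the two delay paths can be illustrated with a small worked example. This is only a sketch: the function names and the nanosecond values are hypothetical, and the model assumes that in the address first-out method the address has already reached the latch 29 before the clock edge, so that only the path inside the cache remains after the edge.

```python
def arrival_conventional(edge_ns, cpu_latch_to_comparator_ns):
    """Time at which IA reaches comparator 18 after latch 17 clocks it out."""
    return edge_ns + cpu_latch_to_comparator_ns


def arrival_first_out(edge_ns, cache_latch_to_comparator_ns):
    """Time at which IA reaches comparator 28 after latch 29 clocks it in."""
    return edge_ns + cache_latch_to_comparator_ns


# Hypothetical delays in nanoseconds, chosen only for illustration.
EDGE = 0.0
CPU_TO_CACHE = 3.0      # long path: latch 17 in the CPU to comparator 18
INSIDE_CACHE = 0.5      # short path: latch 29 to comparator 28

conventional = arrival_conventional(EDGE, CPU_TO_CACHE)
first_out = arrival_first_out(EDGE, INSIDE_CACHE)
print(conventional - first_out)   # time saved per access under these numbers
```

With any choice of values, the saving equals the difference between the CPU-to-cache path and the path inside the cache, which is the point the paragraph above makes.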
Nevertheless, in the address first-out method, if a fetch wait signal for suppressing instruction fetch is output inside the CPU and the fetch operation starts only after the wait is released, the reading of the instruction data ID is disadvantageously delayed by one clock compared with the conventional system. FIG. 6 is a specific timing chart. FIG. 6 shows the operation timing of an instruction read request when a fetch wait signal is output, both in the address first-out method and in the conventional method which does not adopt the address first-out method. In FIG. 6, legend IF-WAIT denotes a fetch wait signal for suppressing instruction fetch.
As shown in FIG. 6, in the address first-out method, after the fetch wait is released, the instruction address IA is latched at a clock CLK and then the wait signal IWAIT is returned. Consequently, the wait signal IWAIT is returned one clock later than in the conventional method. In the address first-out method, therefore, as indicated by the arrows shown in FIG. 6, the instruction data ID is read from the memory one clock later than in the conventional method.
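The one-clock penalty can be expressed as a very small model. This is a hedged sketch: the function name `iwait_return_clock` and the reference clock number are hypothetical and are not read from FIG. 6. The only fact taken from the text is that the address first-out method returns IWAIT one clock later than the conventional method once the fetch wait is released.

```python
def iwait_return_clock(conventional_clock, first_out):
    """Clock at which IWAIT is returned once the fetch wait is released.

    `conventional_clock` is a hypothetical reference value; the only fact
    taken from the text is the one-clock offset for the first-out method.
    """
    extra_clocks = 1 if first_out else 0  # IA is latched at a clock first
    return conventional_clock + extra_clocks


reference = 4  # illustrative clock number, not read from FIG. 6
print(iwait_return_clock(reference, first_out=False))
print(iwait_return_clock(reference, first_out=True))
```

Because the memory read follows the wait signal, the same one-clock offset carries over to the point at which the instruction data ID is read.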
To solve the above-stated disadvantage with the address first-out method, advance fetching might be conducted. Because of the address first-out method, however, requests corresponding to a maximum of two accesses, i.e., the access during fetching and the next first-out address, are output to the bus. As a result, a branch instruction disadvantageously cannot be executed without two bus accesses.
FIG. 7 shows the operation timing of an instruction read request when a branch instruction is executed, both in the address first-out method which adopts advance fetching and in the conventional method which does not adopt the address first-out method. In FIG. 7, legend JUMP denotes a branch signal which indicates that a branch exists in the CPU. In other words, it means that the addresses are discontinuous in the portion where the branch exists.
As shown in FIG. 7, in the conventional method, the fetch wait signal IF-WAIT is not output and the branch signal JUMP is output at a clock “5” in FIG. 7. The instruction a
Fujitsu Limited
Kim Kenneth S.