Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2000-12-22
2003-06-03
Yoo, Do Hyun (Department: 2187)
C711S123000, C711S126000, C711S128000, C711S130000, C305S049000, C305S051000, C305S052000, C305S060000
active
06574711
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to a semiconductor integrated circuit and in particular to a one-chip large-scale integrated circuit having cache capability.
BACKGROUND OF THE INVENTION
A one-chip large-scale integrated circuit (LSI) containing a 32-bit microcomputer for controlling devices has been developed for built-in applications in the fields of digital and network appliances.
In the following description, the microprocessor part of the LSI will be called “microcomputer core.”
In the field of network appliances, memory protection is becoming more important as the programs that implement computer processing services grow larger and as the programming environment changes with the installation of closed program modules or of program modules installed by downloading.
Therefore, a microcomputer core includes a memory management unit (MMU) using a Translation Look-aside Buffer (TLB), as described below, to support the implementation of memory protection. In the MMU implementation, a cache access and a TLB search operation are executed in parallel within one machine cycle by optimizing the circuitry.
The basic operation of a cache will be described below.
FIG. 5 shows busses of the cache of a microcomputer core. While the cache is divided into an instruction cache 1 and a data cache 2 for processing an instruction access and a data access in parallel, the operations of these caches are the same. The operation of the data cache 2 will be described herein as an example.
The flow of an address signal for a memory access is as follows.
A central processing unit (CPU) core 3 accesses the data cache 2 through a bus interface (hereinafter called “BCIF”) 4.
During a read operation, a virtual address output from the CPU core 3 is input into a data TLB 5 through the BCIF 4.
If a physical address corresponding to the virtual address is in the data TLB 5, the data TLB 5 outputs the physical address 6 and a hit signal as a hit/miss signal 7. Otherwise, it outputs a miss signal.
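The behaviour just described can be summarised in a short functional sketch. The sketch below is in Python and is purely illustrative: the 4 KB page size, the mapping contents, and the tlb_lookup name are assumptions, not details taken from the patent.

# Minimal functional sketch of the TLB lookup described above
# (assumptions: 4 KB pages and one hypothetical translation entry).
PAGE_BITS = 12
PAGE_MASK = (1 << PAGE_BITS) - 1

data_tlb = {0x10: 0x3A2C0}    # virtual page number -> physical page number (hypothetical)

def tlb_lookup(virtual_addr):
    """Return (physical address 6, hit/miss signal 7) as described above."""
    vpn = virtual_addr >> PAGE_BITS                    # high-order part to translate
    if vpn in data_tlb:                                # entry present: hit
        return (data_tlb[vpn] << PAGE_BITS) | (virtual_addr & PAGE_MASK), True
    return None, False                                 # entry absent: miss signal

phys, hit = tlb_lookup(0x000102A8)
print(hex(phys), hit)                                  # 0x3a2c02a8 True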
If the hit signal is output from the TLB 5, the physical address output from the TLB 5 is compared with the tags (cache memory indices) in the data cache 2. If there is a match, the data corresponding to the physical address is output onto a data bus, and the data and the hit signal are input into the CPU core 3 through the BCIF 4. The output from the data cache 2 is 64 bits wide if it is data, or 32 bits wide if it is an instruction.
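For illustration, the cache-side comparison can be modelled as follows. The direct-mapped organisation, the choice of index bits, and the sample contents are assumptions; the patent only states that the physical address from the TLB is compared with the cache tags and that a hit returns 64-bit data (or a 32-bit instruction).

# Illustrative model of the tag comparison in the data cache 2
# (assumed geometry: direct-mapped, index taken from low-order address bits 3-8).
PAGE_BITS  = 12
INDEX_MASK = 0x1F8

# hypothetical contents: index -> (stored high-order physical address, 64-bit data)
data_cache = {0x0A8: (0x3A2C0, 0x1122334455667788)}

def cache_read(physical_addr):
    index = physical_addr & INDEX_MASK
    tag   = physical_addr >> PAGE_BITS                 # high-order physical address
    stored_tag, data = data_cache.get(index, (None, None))
    if stored_tag == tag:                              # tag match
        return data, True                              # 64-bit data plus hit signal
    return None, False                                 # miss signal

data, hit = cache_read(0x3A2C02A8)
print(hit, hex(data))                                  # True 0x1122334455667788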
The steps of a write operation are the same up to the output of a hit signal from the data cache 2. After that, instead of data being output onto the bus, the data that the CPU core 3 has previously output onto the bus is written into the data cache 2.
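A corresponding sketch of the write path, under the same assumed geometry as the read sketch above and again purely illustrative: the lookup is identical, but on a hit the data already driven onto the bus by the CPU core is stored into the cache line instead of being read out.

# Illustrative write path: same steps up to the hit signal, then a store.
PAGE_BITS, INDEX_MASK = 12, 0x1F8                      # same assumed geometry as above

def cache_write(data_cache, physical_addr, bus_data):
    index = physical_addr & INDEX_MASK
    tag   = physical_addr >> PAGE_BITS
    stored_tag, _ = data_cache.get(index, (None, None))
    if stored_tag == tag:                              # hit
        data_cache[index] = (tag, bus_data)            # write bus data into the line
        return True
    return False                                       # miss

d = {0x0A8: (0x3A2C0, 0x0)}                            # hypothetical existing line
print(cache_write(d, 0x3A2C02A8, 0xDEADBEEF), hex(d[0x0A8][1]))   # True 0xdeadbeef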
The cache operation will be detailed below.
FIG. 6 shows a configuration of the data TLB 5 and the data cache 2.
A virtual address output from an address generator 8 in the CPU core 3 is input into the data TLB 5 through the BCIF 4.
The virtual address is compared with the tags at TAG 5a. If there is a physical address corresponding to the virtual address, the high-order address of the physical address and a hit signal are output. Otherwise, a miss signal is output. If the physical address corresponds to protected memory, an exception signal is output and no data is output from the data cache 2.
On the other hand, because the low-order address of the virtual address is the same as that of the physical address, the low-order address is also input into the data cache 2 at the same time.
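This identity of the low-order bits is easy to verify: they are the offset within a page, so translation leaves them unchanged, and they can therefore index the data cache 2 before the data TLB 5 has finished translating the high-order bits. The page size and addresses below are assumptions used only for illustration.

# The page offset is the same before and after translation (assumed 4 KB pages).
PAGE_BITS = 12
PAGE_MASK = (1 << PAGE_BITS) - 1

virtual_addr  = 0x000102A8
physical_page = 0x3A2C0                                # hypothetical TLB output
physical_addr = (physical_page << PAGE_BITS) | (virtual_addr & PAGE_MASK)

assert (virtual_addr & PAGE_MASK) == (physical_addr & PAGE_MASK)
print(hex(virtual_addr & PAGE_MASK))                   # 0x2a8, available before translation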
The data cache 2 has a TAG memory module 9 and a cache data memory module 10. If there is an entry corresponding to the low-order address in the TAG memory module 9 of the data cache 2, the high-order address of the physical address corresponding to the low-order address is output.
If a hit signal is output from the data TLB 5, the high-order address of the physical address output from the data TLB 5 is compared at 2a with the high-order address output from the TAG memory module 9 of the data cache 2. If there is a match, the data corresponding to the address is output from the cache data memory module 10 onto the data bus and a hit signal is provided to the CPU core 3.
If no hit signal is output from the data TLB 5, or no hit signal is output from the data cache 2, a miss signal is output to the CPU core 3.
If an exception signal is output from the data TLB 5, no data is output from the data cache 2; instead, exception management is performed by the CPU core 3.
The steps for a write operation are the same as those described above up to the output of the hit signal from the data cache 2. After the hit signal is output, instead of data being output onto the bus, the data that the CPU core 3 has previously output onto the bus is written into the data cache 2. If an exception signal is output from the data TLB 5, data is not written into the data cache 2. Instead, exception management is performed by the CPU core 3.
In this way, part of the virtual-to-physical address translation at the data TLB 5 and part of the match finding in the cache control are performed concurrently in order to increase the speed of cache operations.
Thus the cache operations can be performed within one cycle. Access latency can be reduced by using the cache memory to eliminate accesses to main memory, especially when an arithmetic operation that requires memory read/write operations is performed over a number of cycles.
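To tie the pieces together, here is a self-contained functional model of this concurrent flow, written in Python purely as an illustration: the page size, the indexing granularity, the protection flag, and all table contents are assumptions, and the model reflects only the signal outcomes (hit, miss, exception), not the patent's circuitry or timing.

# Functional model of the concurrent TLB / cache access described above.
PAGE_BITS = 12
PAGE_MASK = (1 << PAGE_BITS) - 1

data_tlb   = {0x10: (0x3A2C0, False),            # virtual page -> (physical page, protected?)
              0x20: (0x3A2D0, True)}
cache_tag  = {0x2A8 >> 3: 0x3A2C0}               # low-order index -> stored high-order address
cache_data = {0x2A8 >> 3: 0x1122334455667788}    # same index -> 64-bit data

def access(virtual_addr):
    low, vpn = virtual_addr & PAGE_MASK, virtual_addr >> PAGE_BITS
    idx = low >> 3                               # assumed indexing granularity
    stored_tag  = cache_tag.get(idx)             # TAG memory module 9 (started concurrently)
    stored_data = cache_data.get(idx)            # cache data memory module 10 (concurrently)
    entry = data_tlb.get(vpn)                    # data TLB 5 (concurrently)
    if entry is None:
        return None, "TLB miss"
    phys_page, protected = entry
    if protected:
        return None, "exception"                 # exception management by the CPU core 3
    if stored_tag != phys_page:
        return None, "cache miss"
    return stored_data, "hit"                    # data and hit signal to the CPU core 3

for addr in (0x000102A8, 0x00020000, 0x00030000):
    print(hex(addr), access(addr)[1])            # hit, exception, TLB miss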
FIG. 7 shows the access timing during a cache read operation. If a miss signal is output, operation in that cycle halts at that point.
FIG. 8 shows the access timing during a cache write operation (when exception management is OK).
FIG. 9 shows the access timing during a cache write operation (when exception management is NG).
The operation time is the sum of the times required for “TLB TAG comparison”, “TLB data read”, “cache TAG comparison”, “cache hit signal output”, and “cache data output.”
In order to achieve faster operation, that is, to shorten the machine cycle (one clock cycle), the amount of time required for each of these steps must be reduced.
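As a purely numerical illustration of this point (the per-step delays below are assumed values, not figures from the patent), the minimum machine cycle is bounded by the sum of the five step times, so shortening any step shortens the cycle.

# Assumed per-step delays in nanoseconds; the cycle time is their sum.
step_delay_ns = {
    "TLB TAG comparison":      0.9,
    "TLB data read":           0.8,
    "cache TAG comparison":    0.9,
    "cache hit signal output": 0.4,
    "cache data output":       1.0,
}
cycle_ns = sum(step_delay_ns.values())
print(f"minimum cycle: {cycle_ns:.1f} ns (about {1000 / cycle_ns:.0f} MHz)")
# minimum cycle: 4.0 ns (about 250 MHz)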
DISCLOSURE OF THE INVENTION
FIG. 10 shows a chip layout of the prior art. While only a data cache will be illustrated and described below as an example, the same applies to an instruction cache, as mentioned earlier.
A TLB bus input 11 connects a TLB TAG module 12, comprising a TLB TAG 12a and its I/O 12b, in the data TLB 5 with the BCIF 4 mentioned earlier. The TLB TAG 12a is memory containing address translation data.
The TLB data memory module 14 of the data TLB 5 comprises a TLB buffer 14a and its I/O 14b. The TAG memory module 9 of the data cache 2 comprises a cache TAG 9a and its I/O 9b. The cache TAG 9a is memory containing cache indices.
The I/O 14b of the TLB data memory module 14 and the I/O 9b of the TAG memory module 9 are connected by a TLB bus output line 13.
A cache data memory module 10, which is memory containing cache data, comprises cache data memory 10a and an I/O 10b. A hit signal of the TAG memory module 9 is input into the I/O 10b of the cache data memory module 10 from the I/O 9b of the TAG memory module 9.
A cache bus 15 connects to the CPU core 3 through the BCIF 4 and connects to an external bus 17 through a bus control unit (BCU) 16 shown in FIG. 5.
In the prior-art chip layout, the modules 12 and 14 of the data TLB 5 and the modules 9 and 10 of the data cache 2 are designed as separate modules, and the wiring between the modules is provided afterwards, entailing a long line length.
Generally, a wiring delay is expressed by 0.4 * R * C (where R is the wire resistance and C is the wire capacitance), and a longer line length yields larger R and C.
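Because both R and C scale with the line length, the 0.4 * R * C delay grows roughly with the square of the length, which is why long inter-module wiring matters. The per-millimetre resistance and capacitance below are assumed values used only to illustrate the scaling.

# Illustration of how the 0.4 * R * C wiring delay scales with line length.
R_PER_MM = 80.0          # ohms per millimetre of wire (assumed)
C_PER_MM = 0.2e-12       # farads per millimetre of wire (assumed)

def wire_delay_s(length_mm):
    r = R_PER_MM * length_mm                     # resistance grows with length
    c = C_PER_MM * length_mm                     # capacitance grows with length
    return 0.4 * r * c                           # so delay grows roughly quadratically

for length_mm in (1.0, 2.0, 4.0):
    print(f"{length_mm:.0f} mm: {wire_delay_s(length_mm) * 1e12:.1f} ps")
# 1 mm: 6.4 ps, 2 mm: 25.6 ps, 4 mm: 102.4 ps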
The propagation delay time between the TLB and the cache TAG, or the propagation delay time through the “data read” line length to the cache TAG to the cache data memory module, that is, the “cache hit signal output” line