System and method for cache process

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories


Details

U.S. Classification: 711/163; 711/202
Type: Reexamination Certificate
Status: Active
Patent Number: 06385697


BACKGROUND OF THE INVENTION
The present invention relates to a cache system and a cache processing method to be applied to a data processing unit such as an MPU (MicroProcessor Unit), a CPU (Central Processing Unit), etc. for reducing access time to external memory such as main memory, and in particular, to a cache system and a cache processing method in which full-set associative cache memory capable of preserving cache lines of high usage frequencies is coupled with non-full-set associative cache memory (direct mapping cache memory, 2-way set associative cache memory, 4-way set associative cache memory, 8-way set associative cache memory, etc.) and thereby the cache hit rate is increased.
DESCRIPTION OF THE PRIOR ART
Cache or cache memory is widely used for reducing the access time of an MPU, a CPU, etc. of high data processing speed to memory of low data processing speed such as main memory. Especially, hierarchical cache memory composed of a primary cache memory and a secondary cache memory is widely used in order to increase the program execution speed of the MPU, CPU, etc. and thereby improve the throughput of systems including them. Generally, cache memory is provided with a cache table for storing a plurality of tags and data corresponding to the tags. A tag extracted from input address data is compared with the tags which have been stored in the cache table, and if the extracted tag matches one of the stored tags, the data corresponding to the matched tag is selected and outputted from the cache table to the MPU, CPU, etc. Thereby, the number of accesses to external memory (such as main memory of low data processing speed) is reduced, and high speed data processing by the MPU, CPU, etc. is realized.
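As a rough illustration of the cache-table organization just described (a minimal sketch, not taken from the patent itself), a cache line can be modeled as a valid bit, a tag and a data block; the type names and sizes below are illustrative assumptions chosen to match the example developed later in this section:

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_WORDS 4    /* 4 words (16 bytes) per line, matching the assumption used later */
#define NUM_LINES  256  /* a 2-hex-digit index addresses 256 lines (illustrative) */

/* One cache line: a valid bit, the stored tag, and the cached data block. */
typedef struct {
    bool     valid;             /* 1 = line holds usable data, 0 = line is invalid */
    uint32_t tag;               /* upper address bits identifying the memory block */
    uint32_t data[LINE_WORDS];  /* the cached words themselves                     */
} cache_line;

/* The cache table of a direct mapping cache memory: one line per index. */
static cache_line cache_table[NUM_LINES];
```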
As the cache memory, full-set associative cache memory and direct mapping cache memory are well known. The direct mapping cache memory can implement high speed access with a small circuit scale; however, its cache hit rate is easily deteriorated. On the other hand, the full-set associative cache memory requires a more complicated circuit composition and larger power consumption; however, it can preserve high-hit-rate cache lines. Incidentally, 2-way set associative cache memory, 4-way set associative cache memory and 8-way set associative cache memory are also well known as non-full-set associative cache memories having similar functions to the direct mapping cache memory.
FIG. 1 is a schematic block diagram showing typical conventional direct mapping cache memory. The direct mapping cache memory shown in FIG. 1 includes a cache table 801 for storing a plurality of tags and data corresponding to the tags, a comparator 802, an AND gate 803 and a data selector 804. Each cache line of the cache table 801 is also provided with a “valid bit” for indicating whether the cache line is valid or invalid. A valid bit “1” indicates that the cache line is valid, and a valid bit “0” indicates that the cache line is invalid. At the top of FIG. 1, an example of input address data “00000040” (hex) is shown. The input address data includes a tag, an index and an offset. In the case of the input address data “00000040”, the tag is “00000” (the front 5 hexadecimal digits of the input address data), the index is “04” (the next 2 hexadecimal digits), and the offset is “0” (the last hexadecimal digit).
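The tag/index/offset split in this example can be made concrete with a small sketch. The function name decompose_addr is hypothetical; a 32-bit address with a 20-bit tag, 8-bit index and 4-bit offset is assumed, matching the hexadecimal digit counts above:

```c
#include <stdint.h>
#include <stdio.h>

/* Split a 32-bit address into tag (front 5 hex digits), index (next 2 hex
   digits) and offset (last hex digit), as in the "00000040" example. */
static void decompose_addr(uint32_t addr,
                           uint32_t *tag, uint32_t *index, uint32_t *offset)
{
    *tag    = addr >> 12;         /* bits 31..12: 5 hex digits */
    *index  = (addr >> 4) & 0xFF; /* bits 11..4 : 2 hex digits */
    *offset = addr & 0xF;         /* bits  3..0 : 1 hex digit  */
}

int main(void)
{
    uint32_t tag, index, offset;
    decompose_addr(0x00000040u, &tag, &index, &offset);
    printf("tag=%05X index=%02X offset=%X\n",
           (unsigned)tag, (unsigned)index, (unsigned)offset);
    /* prints: tag=00000 index=04 offset=0 */
    return 0;
}
```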
FIG. 2 is a schematic diagram showing an example of a program which is executed by a CPU. Referring to FIG. 2, the program includes instructions (1), (2), . . . to be executed. The main memory preliminarily stores a plurality of instructions in its corresponding addresses. When the program of FIG. 2 is executed by the CPU, the CPU first refers to input address data (which is supplied from a program counter) with regard to the first instruction (1) of the program. The input address data “00000040” with regard to the first instruction (1) indicates that the first instruction (1) has preliminarily been stored at the address “00000040” of the main memory. With regard to the first instruction (1), the CPU (concretely, the comparator 802 of the direct mapping cache memory shown in FIG. 1) judges whether the tag “00000” extracted from the input address data “00000040” matches the tag which has been stored in the cache line (of the cache table 801 shown in FIG. 1) corresponding to the index “04”. If matched, the data corresponding to the matched tag is read out from the cache table 801 of the direct mapping cache memory and sent to the CPU (instruction decoder). If not matched, the CPU makes access to the main memory and fetches the first instruction (1) from the address “00000040” of the main memory. In this case (not matched), the cache line corresponding to the index “04” is rewritten; that is, the data of the cache line corresponding to the index “04” is changed into the data fetched from the main memory, and the tag of the cache line is changed into the tag “00000” corresponding to the input address data “00000040”. Thereafter, the same processes are executed for the subsequent instructions (2), (3), . . . . By use of the direct mapping cache memory, the number of accesses by the CPU to the main memory (needing long access time) is reduced, and thereby high speed program execution by the CPU (the cache process) is realized.
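The hit/miss sequence just described might look as follows in outline, reusing the illustrative cache_line table and decompose_addr() from the sketches above. Here fetch_from_main_memory() is a hypothetical stand-in for the slow external access, not a function named by the patent:

```c
/* Hypothetical helper: fetch nwords words starting at block_addr from main memory. */
extern void fetch_from_main_memory(uint32_t block_addr, uint32_t *buf, int nwords);

/* Look up one address in the direct mapping cache. On a hit, the cached word
   is returned; on a miss, the 4-word block is fetched from main memory and
   the line at that index is rewritten (its tag and data are replaced). */
uint32_t cache_read(uint32_t addr)
{
    uint32_t tag, index, offset;
    decompose_addr(addr, &tag, &index, &offset);

    cache_line *line = &cache_table[index];

    if (line->valid && line->tag == tag) {
        /* Cache hit: serve the word directly, with no main-memory access. */
        return line->data[offset / 4];  /* offset is in bytes, data[] in words */
    }

    /* Cache miss: fetch the aligned 4-word block containing addr, then
       overwrite this cache line with the new tag and data. */
    fetch_from_main_memory(addr & ~0xFu, line->data, LINE_WORDS);
    line->tag   = tag;
    line->valid = true;
    return line->data[offset / 4];
}
```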
FIGS. 3A and 3B are schematic diagrams showing examples of changes in the status of the cache table 801 of the direct mapping cache memory of FIG. 1 when the program of FIG. 2 is executed by the CPU, in which FIG. 3A shows the status of the cache table 801 just after the execution of the instruction (1) of FIG. 2 by the CPU and FIG. 3B shows the status of the cache table 801 just after the execution of the instruction (5) of FIG. 2 by the CPU. Referring to FIG. 3A, at the point when the instruction (1) has just been executed, the cache line of the cache table 801 corresponding to the index “04” stores the tag “00000” (corresponding to the input address data “00000040”) and data corresponding to the tag “00000”. Referring to FIG. 3B, at the point when the instruction (5) has just been executed, the same cache line of the cache table 801 corresponding to the index “04” stores the tag “00001” (corresponding to the input address data “00001040”) and data corresponding to the tag “00001”.
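This rewrite is a conflict on the index “04”: using the hypothetical decompose_addr() sketch above, the two input addresses differ only in their tags, so in a direct mapping cache the second access evicts the first (a usage fragment, e.g. inside main()):

```c
uint32_t tag, index, offset;

decompose_addr(0x00000040u, &tag, &index, &offset);
/* -> tag = 00000, index = 04: instruction (1) occupies cache line 04 */

decompose_addr(0x00001040u, &tag, &index, &offset);
/* -> tag = 00001, index = 04: same line, different tag, so the contents
      stored for "00000040" are evicted (the FIG. 3A to FIG. 3B change) */
```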
FIGS. 4A and 4B are schematic diagrams showing access time of the CPU employing the direct mapping cache memory when the program of FIG. 2 is executed twice, in which FIG. 4A shows the case where the program of FIG. 2 is executed for the first time and FIG. 4B shows the case where the program of FIG. 2 is executed for the second time. Incidentally, the following explanation concerning program execution time will be given on the assumption that the length of the data storage area of each cache line of the direct mapping cache memory is 4 words (16 bytes) and the length of each instruction of the program of FIG. 2 is 1 word (4 bytes) (that is, on the assumption that 4 instructions are stored in the data storage area of each cache line of the direct mapping cache memory). The access time necessary for fetching data (an instruction) from the main memory is assumed to be 100 ns for the first word, and 30 ns for each of the following 3 words. Therefore, the access time necessary for fetching data of 4 words (4 instructions) from the main memory and storing the data in a cache line of the direct mapping cache memory becomes 100+30+30+30=190 ns. If we assume the CPU takes 10 ns for reading and executing an instruction which has just been stored in the cache line of the direct mapping cache memory, the access time necessary for fetching the first 4 words of data (the first 4 instructions) from the main memory, storing the data in the direct mapping cache memory and executing the stored first instruction (1) becomes 190+10=200 ns.
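The miss-penalty arithmetic above can be restated compactly; all values are the nanosecond figures assumed in the text:

```c
/* Timing assumptions from the text (nanoseconds). */
enum {
    FIRST_WORD_NS = 100, /* main-memory access time for the first word of a block */
    NEXT_WORD_NS  = 30,  /* access time for each of the following 3 words         */
    EXECUTE_NS    = 10   /* reading and executing one instruction from the cache  */
};

/* Refilling one 4-word cache line: 100 + 30 + 30 + 30 = 190 ns. */
enum { LINE_FILL_NS = FIRST_WORD_NS + 3 * NEXT_WORD_NS };

/* Miss on instruction (1): fill the line, then execute it: 190 + 10 = 200 ns. */
enum { MISS_AND_EXEC_NS = LINE_FILL_NS + EXECUTE_NS };
```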
In the initialization, all the cache lines of the direct mapping cache memory are set invalid.
