Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Type: Reexamination Certificate
Filed: 2002-05-21
Issued: 2004-06-01
Examiner: Nguyen, Hiep T. (Department: 2186)
Other classes: C711S128000, C711S132000, C711S154000, C711S170000
Status: active
Patent number: 06745288
FIELD OF THE INVENTION
The present invention is concerned with computer programs, and is more particularly concerned with computer programs that support multiple concurrent control threads.
BACKGROUND OF THE INVENTION
Many conventional operating systems support multiple concurrent control threads to allow multitasking of application programs. In some cases, duplicate control threads are operated concurrently, as occurs, for example, when plural input/output devices of the same type are independently controlled by respective control threads.
It is also conventional that each control thread be provided with a call stack. As is familiar to those skilled in the art, a call stack is a collection of data structures called stack frames, in which a previous stack frame pointer, a return instruction pointer and relevant data are stored when a call routine is entered. When the call routine ends, the return instruction pointer and the relevant data are retrieved from the previous stack frame on the call stack.
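For illustration, one such stack frame can be sketched as a C structure. This is a minimal sketch of the conventional layout described above; the field names are hypothetical, and neither a particular processor ABI nor the patent's own layout is implied.

```c
/* Minimal sketch of a conventional stack frame; field names are
 * illustrative only. */
struct stack_frame {
    struct stack_frame *prev_frame; /* previous stack frame pointer */
    void               *ret_addr;   /* return instruction pointer   */
    unsigned long       data[];     /* relevant data: saved regs,   */
                                    /* locals, arguments, etc.      */
};
```

When a call routine is entered, a frame holding these fields is pushed onto the thread's call stack; when the routine ends, prev_frame and ret_addr are read back to resume the caller.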
The present invention addresses a problem that may be encountered in connection with caching of call stacks for multiple duplicate control threads.
FIG. 1 is a simplified block diagram of a conventional computer system in which the present invention may be applied. Reference numeral 100 generally indicates the computer system, from which many customary components have been omitted to simplify the drawing. The computer system 100 includes a processor (CPU) 110, which may, for example, be one of the PowerPC™ family of devices available from International Business Machines. A cache memory 120 is associated with the processor 110. For example, the cache memory 120 may be on board the processor 110. Also accessible by the processor 110 are a non-volatile memory (e.g., ROM) 130 and a volatile memory (e.g., RAM) 140. The non-volatile memory 130 and the volatile memory 140 are connected to the processor 110 via a memory controller 150, memory busses 160 and a processor bus 170.
When the processor 110 performs a load or store instruction, the cache memory 120 is interrogated to determine if the needed data resides within the cache memory 120. If so, then the processor 110 either loads the data from or modifies the data within the cache memory 120. If the data does not reside within the cache memory 120, then the processor 110 issues a cache line fill operation to the memory controller 150. The memory controller 150 retrieves the data from the appropriate source, which may be either the non-volatile memory 130 or the volatile memory 140. The memory controller 150 then returns the data to the processor 110. If there is a usable empty line in the cache memory 120, then the data received via the memory controller 150 is placed in the empty line. If not, then a line of data is evicted from the cache memory 120, and the new data is put in the place of the evicted data. The processor 110 then loads the requested data from, or modifies it within, the cache memory 120.
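The lookup, fill and eviction sequence just described can be summarized in code. The following is a toy direct-mapped model written for illustration only; the sizes, the names and the overwrite-on-miss policy are all assumptions, not the patent's design.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u                 /* bytes per cache line (assumed) */
#define NUM_LINES  8u                 /* lines in this toy cache        */

typedef struct {
    bool     valid;
    uint32_t tag;                     /* memory block number held here  */
    uint8_t  data[LINE_SIZE];
} cache_line_t;

static cache_line_t cache[NUM_LINES];
static uint8_t main_memory[4096];     /* stands in for the ROM/RAM      */

/* addr is assumed to be less than sizeof(main_memory) */
static uint8_t load_byte(uint32_t addr)
{
    uint32_t block = addr / LINE_SIZE;          /* memory block number  */
    cache_line_t *line = &cache[block % NUM_LINES];

    if (!line->valid || line->tag != block) {   /* cache miss           */
        /* Line fill: a direct-mapped cache has exactly one candidate   */
        /* line, so any valid occupant is evicted by being overwritten. */
        memcpy(line->data, &main_memory[block * LINE_SIZE], LINE_SIZE);
        line->tag   = block;
        line->valid = true;
    }
    return line->data[addr % LINE_SIZE];        /* hit path             */
}
```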
Because the cache memory 120 is on board or otherwise closely associated with the processor 110, accessing data in the cache memory 120 is much more rapid than accessing data in the non-volatile memory 130 or the volatile memory 140. Thus the use of the cache memory 120 may improve the performance of the computer system 100.
A cache memory or “cache” is customarily defined by its number of ways (described below), line size and total size. The number of blocks that a cache memory is divided into is equal to the total size of the cache memory divided by the line size.
When the data required by the processor 110 is not present in the cache memory 120 (which is sometimes referred to as a “cache miss”), the necessary data, in a length equal to the cache line size, is brought into the cache memory 120 from external memory such as the non-volatile memory 130 or the volatile memory 140 referred to above. The line of data is placed in one of the blocks of the cache memory 120. The “associativity” of the cache determines which blocks the data can be placed in. In a “fully associative” cache, the data can be placed in any of the blocks. In a “direct mapped” cache, the data can be placed in only one block, which is indicated by the least significant bits of the memory address from which the data was obtained.
In an “n-way set associative” cache, the memory address from which the data was obtained maps to one “set” of the cache. A set contains a number of blocks that is equal to n. The number of sets in a cache is determined by dividing the number of blocks in the cache by n. (A direct mapped cache can be thought of as a one-way set associative cache. A fully associative cache with m blocks can be thought of as an m-way set associative cache. Alternatively, a direct mapped cache can be thought of as having m sets, assuming m blocks in the cache, and a fully associative cache can be thought of as having one set.)
It is common to provide cache memories that are 2-way or 4-way set associative.
The particular set in which data will be placed in an n-way set associative cache is determined based on the memory block number (data address divided by cache line size) modulo the number of sets in the cache. The particular block within the set that is used to store the data may be determined, for example, using a Least Recently Used (LRU) algorithm.
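In code, the mapping comes down to two divisions and a modulo. The following helpers are a sketch of the formulas above (blocks = total size / line size; sets = blocks / n; set = memory block number modulo number of sets); the type and parameter names are illustrative.

```c
#include <stdint.h>

typedef struct {
    uint32_t total_size; /* total cache size in bytes */
    uint32_t line_size;  /* bytes per cache line      */
    uint32_t ways;       /* n: blocks per set         */
} cache_geometry_t;

/* sets = (total size / line size) / n */
static uint32_t num_sets(const cache_geometry_t *g)
{
    return (g->total_size / g->line_size) / g->ways;
}

/* set = memory block number modulo number of sets */
static uint32_t set_index(const cache_geometry_t *g, uint32_t addr)
{
    uint32_t block = addr / g->line_size;   /* memory block number */
    return block % num_sets(g);
}
```

Which of the n blocks within the chosen set receives the data would then be decided by the replacement policy, such as LRU; that bookkeeping is not sketched here.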
The “span” of a cache is determined by dividing the total size of the cache by the number of ways. The span determines the range of data addresses that can be used before data addresses start to be placed within the same set. For example, assume that a cache has a span of 1K bytes and a cache line size of 32 bytes, so that the cache has 32 sets. Data from a main memory data address of 0x400 is placed within set 0. Data from a main memory data address of 0x500 is placed in set 8, and data from a main memory data address of 0x800 is placed again within set 0. The number of different memory addresses that can be serviced in one set before a conflict occurs is determined by the number of ways of the cache memory.
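The arithmetic of this example can be checked directly (a minimal sketch assuming the stated 1K-byte span and 32-byte lines, hence 32 sets):

```c
#include <assert.h>
#include <stdint.h>

/* span = 1K bytes, line size = 32 bytes => 32 sets;
 * set = (addr / line_size) % num_sets */
static uint32_t set_of(uint32_t addr) { return (addr / 32u) % 32u; }

int main(void)
{
    assert(set_of(0x400) == 0); /* block 32 -> set 0 */
    assert(set_of(0x500) == 8); /* block 40 -> set 8 */
    assert(set_of(0x800) == 0); /* block 64 -> set 0 */
    return 0;
}
```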
FIGS. 2A-D are schematic illustrations of mapping of main memory addresses to different types of cache memories, in accordance with conventional practices. In FIGS. 2A-D, reference numeral 210 (FIG. 2A) indicates a simplified representation of main memory (such as the non-volatile and/or volatile memory 130, 140), reference numeral 230 (FIG. 2B) indicates a direct mapped cache, reference numeral 240 (FIG. 2C) indicates a fully associative cache, and reference numeral 250 (FIG. 2D) indicates a 2-way set associative cache. It is assumed that block 220 in memory 210 is to be cached. Block 220 has a block address of 13. The direct mapped cache 230 (FIG. 2B) has eight blocks or sets (indicated by reference numeral 235). Since 13 modulo 8 equals 5, the data from block 220 (block address 13) of the main memory 210 would be placed in block number 5 of the direct mapped cache 230, as indicated by reference numeral 280.
In the case of the fully associative cache 240 (FIG. 2C), there are eight blocks (reference numeral 245) corresponding to eight ways or one set. The data from block 220 of the main memory 210 can be placed in any of the eight blocks of the fully associative cache 240, as indicated by reference numeral 283.
In the case of the 2-way set associative cache 250 (FIG. 2D), there are four sets (reference numeral 255) of two blocks each. Since 13 modulo 4 equals 1, the data from block 220 (block address 13) of the main memory 210 (FIG. 2A) may be stored in either of the two blocks (blocks 2 and 3) of set 1 (reference numeral 260), as indicated by reference numeral 286 (FIG. 2D).
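Restated as arithmetic, the three placements of block address 13 in FIGS. 2B-2D all follow from the modulo rule already described; a minimal check using the eight-block geometry of the figures:

```c
#include <assert.h>

int main(void)
{
    assert(13 % 8 == 5); /* direct mapped (8 sets of 1): block 5      */
    assert(13 % 1 == 0); /* fully associative (1 set of 8): any block */
    assert(13 % 4 == 1); /* 2-way (4 sets of 2): set 1                */
    return 0;
}
```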
FIG. 3 is a schematic illustration of a problem, identified by the present inventor, that may be encountered in connection with caching of call stacks for duplicate control threads. In the example shown in FIG. 3
Law firm: Dugan & Dugan
Assignee: International Business Machines Corporation