Hierarchical memory for efficient data exchange control

Type: Reexamination Certificate (status: active)
Filed: 2001-04-24
Issued: 2003-03-25
Examiner: Nguyen, T. V. (Department: 2187)
Patent Number: 06539458
Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Cross-Reference Classifications: C711S117000, C711S118000, C711S119000, C711S125000, C711S128000, C711S133000, C711S140000
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to a data processing system and method involving a data requesting element and a memory element from which said data requesting element requests data. An example of such a system is a processor and a first level cache memory, or two memories arranged in a hierarchy.
BACKGROUND OF THE INVENTION AND PRIOR ART
The concept of a hierarchical memory structure is known in the art. The term “hierarchical” implies that instead of having a single memory from which data is requested, a hierarchy of levels is used, where data is first requested by e.g. a processor from a first level memory, and if the requested data is present in the first level memory (which is also referred to as a “hit”), it is provided to the processor. If not (which is also referred to as a “miss”), a request is given to a second level memory provided below the first level memory in the hierarchy. If the data is present in the second level memory, then it is provided to the processor from there, and possibly also stored in the first level memory. A third level may be provided below the second level, and further levels below that. An example of such a structure is a processor using a memory structure having a first and second level cache, below that a main memory, and below that a disk memory.
The memories are organized in such a way that higher level memories tend to be smaller and faster (in terms of access) than lower level memories. The advantages of such a structure will be explained further on.
In more detail, as shown schematically in FIG. 12, a conventional data processing arrangement with a hierarchical memory typically comprises a processor or CPU (central processing unit) 10 that contains a program counter 11 holding the instruction addresses to be executed, said program counter being controlled by a control unit 12. A computational element 13 or ALU (arithmetic logic unit) performs operations on data held in registers 14 under the control of the control unit 12, in accordance with the instructions indicated by the addresses from the program counter. A main memory 30 is provided for storing program data under the corresponding instruction addresses. The main memory 30 is a RAM type memory that will typically be connected to a slow, large-volume memory such as a hard disk drive 40. A cache memory 20 is arranged as an intermediate memory between the main memory 30 and the CPU 10 for storing part of the program data under the corresponding instruction addresses.
The instruction execution performed by the processor is typically pipelined, which means that the multiple steps of successive instructions are performed in overlap. In other words, each instruction is broken down into a predetermined number of basic steps (e.g. fetch, decode, operate and write), and a separate hardware unit is provided for performing each of these steps. These steps can then be performed in overlap for consecutive instructions, so that within one cycle, e.g., while the write step is being performed for a first instruction, the operate step is simultaneously performed for a second instruction, the decode step for a third instruction and the fetch step for a fourth instruction. This is well known in the art and need not be explained further here.
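As a rough illustration (a minimal sketch, not taken from the patent; the stage names come from the example above), the following C program prints which step each of four consecutive instructions occupies in every cycle of an ideal, stall-free four-stage pipeline:

    #include <stdio.h>

    int main(void) {
        /* The four basic steps named in the text. */
        const char *stage[] = { "fetch", "decode", "operate", "write" };
        enum { STAGES = 4, INSTRUCTIONS = 4 };

        /* In cycle c, instruction i occupies stage (c - i), as long as
         * that stage index lies within [0, STAGES). */
        for (int c = 0; c < INSTRUCTIONS + STAGES - 1; c++) {
            printf("cycle %d:", c + 1);
            for (int i = 0; i < INSTRUCTIONS; i++) {
                int s = c - i;
                if (s >= 0 && s < STAGES)
                    printf("  I%d:%s", i + 1, stage[s]);
            }
            printf("\n");
        }
        return 0;
    }

In cycle 4, for example, the output shows I1 in the write step, I2 in operate, I3 in decode and I4 in fetch, exactly the overlap described above.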
A memory hierarchy using a cache in addition to the main memory takes advantage of locality and cost/performance of memory technologies. The principle of locality says that most programs do not access all code or data uniformly. This principle, plus the guideline that smaller hardware is faster, leads to the hierarchy based on memories of different speeds and sizes. Since fast memory is expensive, a memory hierarchy is organized into several levels, each smaller, faster, and more expensive per byte than the next level. The goal is to provide a memory system with cost almost as low as the cheapest level of memory and speed almost as fast as the fastest level. The levels of the hierarchy usually subset one another; all data in one level is also found in the level below, and all data in that lower level is found in the one below it, and so on until the bottom of the hierarchy is reached. Normally, each level maps addresses from a larger memory to a smaller but faster memory higher in the hierarchy. Present terminology calls high-level memories cache memories. It is known to provide a plurality of cache levels.
For example, as can be seen in FIG. 12, the cache memory 20 stands higher in the hierarchy than the main memory 30, and the main memory 30 stands higher in the hierarchy than the disk drive 40. When the CPU 10 requests data, it first requests the data from the cache 20. In the event of a miss, the data must be fetched from the main memory 30, and if again a miss occurs, it must be fetched from the disk drive 40. Typically, the CPU will output virtual addresses, i.e. addresses that define a virtual address space, whereas the data will be stored at physical addresses. The actual reading out of data from one of the memories therefore usually requires an address translation from virtual to physical.
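The following C sketch (hypothetical, with flat presence tables standing in for real memories; not the patented mechanism) traces this lookup path: a request is tried level by level, and a hit at a lower level copies the data unit into the levels above it, which also preserves the subsetting property mentioned earlier:

    #include <stdbool.h>
    #include <stdio.h>

    #define ADDR_SPACE 16

    /* Presence tables for the two upper levels; the disk at the bottom
     * of the hierarchy is assumed to hold everything. */
    static bool in_cache[ADDR_SPACE];
    static bool in_main[ADDR_SPACE];

    static const char *lookup(unsigned addr) {
        if (in_cache[addr])
            return "cache hit";
        if (in_main[addr]) {
            in_cache[addr] = true;   /* copy the line up into the cache */
            return "main memory hit, line copied to cache";
        }
        /* Miss in both upper levels: fetch from disk and copy upward,
         * so each level remains a subset of the level below it. */
        in_main[addr] = true;
        in_cache[addr] = true;
        return "fetched from disk into main memory and cache";
    }

    int main(void) {
        unsigned addr = 5;   /* a physical address, i.e. after the
                              * virtual-to-physical translation */
        printf("first access:  %s\n", lookup(addr));
        printf("second access: %s\n", lookup(addr));
        return 0;
    }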
Data is read into each of the memories in specific data units. In the case of the main memory 30 such a data unit is called a page; in the case of the cache memory 20 it is called a line or block. Each page or line consists of a number of data words. The CPU 10 can read data out of the cache 20 in any desired way, be it in units of lines or in units of words.
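As a small illustration (the line size of 8 words is an assumption chosen to match the tag example in the next paragraph, not a figure from the patent), a word address then splits into a line address and a word offset:

    #include <stdio.h>

    int main(void) {
        /* With 8 words per line, the last octal digit of an address
         * selects the word and the remaining digits identify the line. */
        unsigned addr = 01243;      /* octal literal */
        unsigned line = addr / 8;   /* line address: 0124 (octal)   */
        unsigned word = addr % 8;   /* word offset within line: 3   */
        printf("address %04o -> line %04o, word %o\n", addr, line, word);
        return 0;
    }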
Data in a cache memory are organized by directories which are called address tags. Usually, a group of data is associated with one tag. For example, data associated with tag 0123X might have addresses 01230 through 01237. This group of data e.g. forms the above mentioned cache line. Usually, a cache directory behaves associatively, that is, the cache directory retrieves information by key rather than by address. To determine if a candidate address is in the cache, the directory compares the candidate address with all addresses now in the cache. To maintain high speed, this operation must be done as quickly as possible, which should be within one machine cycle. Furthermore, a cache memory is called set associative if the cache is partitioned into distinct sets of lines, each set containing a small fixed number of lines. In this scheme, each address reference is mapped to a particular set by means of a simple operation on the address. If the address is in the cache, then it is stored as one of the lines in the set. Therefore, the cache need not be searched in its entirety. Only the set to which the address is mapped needs to be searched. If a match is found, then the corresponding data line of the cache is gated to the cache output-data buffer, and from there it is transmitted to the computational unit. In summary, there are three parameters for characterizing a cache, namely the number of bytes per line, the number of lines per set and the number of sets. A cache in which the directory search covers all lines in the cache is said to be fully associative, which corresponds to the case when the number of sets is 1.
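The sketch below (all sizes and types are assumptions for illustration, not parameters from the patent) shows such a lookup in C: a simple operation on the address, here taking its index bits, selects one set, and only the tags of the lines in that set are compared:

    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_SETS       4   /* number of sets             */
    #define LINES_PER_SET  2   /* i.e. 2-way set associative */
    #define WORDS_PER_LINE 8   /* words per line             */

    struct line {
        bool     valid;
        unsigned tag;
    };

    static struct line cache[NUM_SETS][LINES_PER_SET];

    /* Address split: low bits select the word within the line, the next
     * bits select the set, and the remaining high bits form the tag. */
    static bool cache_lookup(unsigned addr) {
        unsigned set = (addr / WORDS_PER_LINE) % NUM_SETS;
        unsigned tag =  addr / WORDS_PER_LINE  / NUM_SETS;
        for (int way = 0; way < LINES_PER_SET; way++)
            if (cache[set][way].valid && cache[set][way].tag == tag)
                return true;   /* hit: only this one set was searched */
        return false;          /* miss in the mapped set              */
    }

    int main(void) {
        /* Install the line that holds addresses 01230..01237 (octal). */
        unsigned addr = 01230;
        unsigned set  = (addr / WORDS_PER_LINE) % NUM_SETS;
        cache[set][0].valid = true;
        cache[set][0].tag   = addr / WORDS_PER_LINE / NUM_SETS;

        printf("lookup 01234: %s\n", cache_lookup(01234) ? "hit" : "miss");
        printf("lookup 01243: %s\n", cache_lookup(01243) ? "hit" : "miss");
        return 0;
    }

With the number of sets set to 1, the same search would cover every line in the cache, i.e. the fully associative case.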
In the cache memory some active portion of the low-speed main memory is stored in duplicate. When a memory request is generated, the request is first presented to the cache memory, and if the cache cannot respond, the request is then presented to main memory. If an item is not resident in the cache but in the main memory, this constitutes the above mentioned cache miss. Assuming e.g. that a tag 0124X is not present, a reference to address 01243 produces a miss for the cache since no tag matches this address. The item is then retrieved from main memory and copied into the cache. During the short period available before the main-memory operation is complete, some other item in the cache is removed from the cache to make room for the new item. Special replacement algorithms deal with the cache-replacement decision; a well known strategy is LRU (least recently used). According to the LRU replacement algorithm, the cache line that has gone unused for the longest time is the one selected for replacement.
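A minimal C sketch of LRU replacement within one set (the timestamp scheme and sizes are assumptions for illustration; real caches often approximate LRU in hardware):

    #include <stdio.h>

    #define LINES_PER_SET 4

    struct line {
        unsigned tag;
        unsigned last_used;   /* time of the most recent access  */
    };                        /* (valid bits omitted for brevity) */

    static unsigned now;      /* global access counter */

    /* Pick the way whose line was least recently used. */
    static int lru_victim(const struct line set[LINES_PER_SET]) {
        int victim = 0;
        for (int way = 1; way < LINES_PER_SET; way++)
            if (set[way].last_used < set[victim].last_used)
                victim = way;
        return victim;
    }

    static void access_line(struct line set[LINES_PER_SET], unsigned tag) {
        now++;
        for (int way = 0; way < LINES_PER_SET; way++)
            if (set[way].tag == tag) {   /* hit: refresh the line's age */
                set[way].last_used = now;
                return;
            }
        int way = lru_victim(set);       /* miss: evict the oldest line */
        printf("miss on tag %u: replacing tag %u\n", tag, set[way].tag);
        set[way].tag = tag;
        set[way].last_used = now;
    }

    int main(void) {
        struct line set[LINES_PER_SET] = {{0}};
        /* Five distinct tags into four lines: the fifth access evicts
         * tag 1, the least recently used of the resident lines. */
        for (unsigned tag = 1; tag <= 5; tag++)
            access_line(set, tag);
        return 0;
    }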
Attorney/Agent: Nixon & Vanderhye PC
Assignee: Telefonaktiebolaget LM Ericsson (publ)