Electrical computers and digital processing systems: processing – Instruction fetching – Prefetching
Reexamination Certificate
1999-12-10
2003-05-06
Chaki, Kakali (Department: 2124)
C712S205000, C712S237000, C712S239000, C712S240000, C717S127000, C717S131000
active
06560693
ABSTRACT:
STATEMENT REGARDING FEDERALLY-SPONSORED RESEARCH OR DEVELOPMENT
Not Applicable.
INCORPORATION BY REFERENCE OF MATERIAL SUBMITTED ON A COMPACT DISC
Not Applicable.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention describes a way to conditionally prefetch instructions or data from memory. In particular, a method and apparatus are disclosed for improving cache performance by using branch prediction information to selectively issue prefetches.
2. Description of Related Art
The current state of computer system technologies is such that processor speeds are increasing at a more rapid rate than main memory speeds. This mismatch between processor speed and main memory speed is being masked by including larger and larger random access “buffers” or “caches” between the processor and main memory.
Data is typically moved within the memory hierarchy of a computer system. At the top of this hierarchy is the processor and at the bottom are the I/O storage devices. The processor is connected to one or more caches (random access buffers). One type of cache is an instruction cache for supplying instructions to the processor with minimal delay. Another type of cache is a high-speed buffer for holding data that is likely to be used in the near future.
Either type of cache can be connected either to other caches or to the main memory of the memory hierarchy. When a program is executed, the processor fetches and executes instructions of the program from the main memory (or an instruction cache). This can cause the processor to request a cache entry or modify or overwrite a cache entry and portions of the main memory.
An illustrative data processing system 100 in accordance with the prior art is shown in FIG. 1. The data processing system 100 has a cache which may consist of only a single cache unit or multiple cache units. The cache may be separated into a data cache 145 and an instruction cache 110 so that both instructions and data may be simultaneously provided to the data processing system 100 with minimal delay. The data processing system 100 further includes a main memory 150 in which data and instructions are stored, a memory system interface 105 which allows the instruction cache 110 and data cache 145 to communicate with main memory 150, and an instruction fetch unit 115 for retrieving instructions of an executing program. Further included in the data processing system is a decode and dispatch unit 120 for interpreting instructions retrieved by the instruction fetch unit 115 and communicating the interpreted information to one or more execution units, and a plurality of execution units including a branch unit 125, functional unit 130 and memory unit 140, for using the interpreted information to carry out the instruction. The branch unit 125 is responsible for executing program branches, that is, computing modifications to the program counter as a program is executed. The generic functional unit 130 represents one or more execution units that can perform operations such as addition, subtraction, multiplication, division, shifting and floating point operations with various types of data as required. Typically, a processor will have several execution units to improve performance. In this description all branches are sent to the branch unit 125. All other instructions go to the general functional unit 130. This configuration is chosen for simplicity and to present an explicit design. Clearly, many other execution unit configurations are used with general or special purpose computing devices. Associated with each execution unit is an execution queue (not shown). The execution queue holds decoded instructions that await execution. The memory unit 140 is responsible for computing memory addresses specified by a decoded instruction. A register file 135 is also included in the data processing system 100 for temporarily holding data. Of course, other storage structures may be used instead of or in addition to the register file 135, such as those used for dealing with speculative execution and implementation of precise interrupts. A sample register file 135 is described as being illustrative of the storage structures which may be used.
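The routing of decoded instructions to execution queues described above can be illustrated with a minimal sketch. The class name, the `dispatch` function, and the instruction kinds are hypothetical and do not appear in the patent. Sending loads and stores to the memory unit is an assumption based on that unit's role in computing memory addresses.

```python
from collections import deque

# Hypothetical instruction kinds, used only for illustration.
BRANCH_OPS = {"branch"}
MEMORY_OPS = {"load", "store"}  # assumption: handled by the memory unit


class ExecutionUnit:
    """An execution unit with its own queue of decoded instructions."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # decoded instructions awaiting execution


def dispatch(op, units):
    """Place a decoded instruction in the queue of the unit that handles it."""
    if op in BRANCH_OPS:
        target = units["branch"]        # stands in for branch unit 125
    elif op in MEMORY_OPS:
        target = units["memory"]        # stands in for memory unit 140
    else:
        target = units["functional"]    # stands in for functional unit 130
    target.queue.append(op)
    return target.name


units = {name: ExecutionUnit(name) for name in ("branch", "functional", "memory")}
routed = [dispatch(op, units) for op in ("add", "branch", "load", "mul")]
```

In this sketch each queue is unbounded. A real decode and dispatch unit would first check whether the target queue has room before placing the instruction.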
When a program is executed, a program counter or sequence prediction mechanism communicates an instruction address to the instruction fetch 115. The instruction fetch 115, in turn, communicates the instruction address to the instruction cache 110. If the instruction corresponding to the instruction address is already in the instruction cache 110, the instruction cache returns the instruction to the instruction fetch 115. If not, the instruction cache 110 transmits the instruction address to the memory system interface 105. The memory system interface 105 locates the instruction address in main memory 150, and retrieves the instruction stored at that address. The instruction is then delivered to the instruction cache 110, from which it is finally returned to the instruction fetch 115. When the instruction arrives at the instruction fetch 115, it is delivered to the decode and dispatch unit 120 if there is available buffer space within the decode and dispatch unit 120 for holding the instruction. The decode and dispatch unit 120 then decodes information from the delivered instruction, and proceeds to determine if each instruction and associated decoded information can be placed in the execution queue of one of the execution units. The appropriate execution unit receives the instruction and any decoded information from the decode and dispatch unit 120, and then uses the decoded information to access data values in the register file 135 to execute the instruction. After the instruction is executed, the results are written to the register file 135.
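The instruction fetch path described above (a cache lookup, then a request to main memory on a miss, then a fill of the cache) can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented apparatus: the class names are hypothetical, the cache is unbounded with one instruction per entry, and the memory system interface is folded into `MainMemory.read`.

```python
class MainMemory:
    """Stands in for main memory 150 reached through the memory system interface 105."""

    def __init__(self, contents):
        self.contents = dict(contents)  # address -> instruction text
        self.reads = 0                  # counts accesses that reached memory

    def read(self, address):
        self.reads += 1
        return self.contents[address]


class InstructionCache:
    """Stands in for instruction cache 110 (assumption: no eviction, no blocks)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # address -> cached instruction

    def fetch(self, address):
        """Return (instruction, hit) for the requested instruction address."""
        if address in self.lines:
            return self.lines[address], True      # hit: served from the cache
        instruction = self.memory.read(address)   # miss: go to main memory
        self.lines[address] = instruction         # fill the cache
        return instruction, False


memory = MainMemory({0x100: "add r1, r2, r3", 0x104: "beq r1, r0, 0x100"})
icache = InstructionCache(memory)
first = icache.fetch(0x100)   # miss, filled from main memory
second = icache.fetch(0x100)  # hit, no memory access
```

The second fetch of the same address is served from the cache, so `memory.reads` stays at one. This is the delay the cache is meant to hide.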
In addition to its general function of computing memory addresses, the memory unit 140 is responsible for executing two particular kinds of instructions: load and store.
A load instruction is a request that particular data be retrieved and stored in the register file 135. The memory unit 140 executes a load instruction by sending a request to the data cache 145 for particular data. If the data is in the data cache 145 and is valid, the data cache returns the data to the memory unit. If the data is not in the data cache 145 or is invalid, the data cache 145 accesses a particular data memory address in main memory 150, as indicated by the load instruction, through the memory system interface 105. The data is returned from main memory 150 to the data cache 145, from which it is eventually returned to the memory unit 140. The memory unit 140 stores the data in the register file 135 and possibly passes it to other functional units 130 or to the branch unit 125. A store instruction is a request that data be written to a particular memory address in main memory 150. For stores, a request is sent by the memory unit 140 to the data cache 145 specifying a data memory address and particular data to write to that data memory address. If the data corresponding to the specified data memory address is located in the data cache 145 and has the appropriate access permissions, that data will be overwritten with the particular data specified by the memory unit 140. The data cache 145 then accesses the specified memory address in main memory 150, through the memory system interface 105, and writes the data to that address.
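The load and store behavior described above can be sketched as a small cache model. A plain dictionary stands in for main memory 150. The `DataCache` class and its `invalidate` helper are hypothetical names introduced for illustration. Access permissions are omitted, and a store to an address that is not cached is assumed to update main memory only, since the text does not say whether the cache allocates an entry in that case.

```python
class DataCache:
    """Stands in for data cache 145; every store is also written to main memory."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> value, stands in for main memory 150
        self.lines = {}       # address -> (value, valid flag)
        self.misses = 0

    def load(self, address):
        """Return the data at address, filling the cache on a miss."""
        entry = self.lines.get(address)
        if entry is not None and entry[1]:
            return entry[0]                  # present and valid: hit
        self.misses += 1                     # absent or invalid: miss
        value = self.memory[address]
        self.lines[address] = (value, True)  # fill from main memory
        return value

    def store(self, address, value):
        """Overwrite a cached copy if one exists, then write main memory."""
        if address in self.lines:
            self.lines[address] = (value, True)
        self.memory[address] = value

    def invalidate(self, address):
        """Hypothetical helper: mark a cached entry invalid."""
        if address in self.lines:
            self.lines[address] = (self.lines[address][0], False)


memory = {0x200: 7}
dcache = DataCache(memory)
a = dcache.load(0x200)   # miss, filled from main memory
b = dcache.load(0x200)   # hit
dcache.store(0x200, 9)   # updates the cached copy and main memory
c = dcache.load(0x200)   # hit, sees the stored value
dcache.store(0x300, 5)   # not cached: written to main memory only
```

Because every store reaches main memory in this sketch, the cached copy and main memory never disagree. That matches the store sequence described in the text.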
Focusing on the cache, which may be a data cache 145, an instruction cache 110, or a combined cache, the cache is repeatedly queried for the presence or absence of data during the execution of a program. Specifically, the data cache 145 is queried by the memory unit 140 regardless of whether the memory unit 140 executes a load or store instruction. Similarly, the instruction cache 110 is repeatedly queried by the instruction fetch 115 for a particular instruction.
A cache has many “blocks” which individually store the various instructions and da
Charney Mark
Hartstein Allan M.
Oden Peter H.
Prener Daniel A.
Puzak Thomas R.
August Casey P.
Buchenhorner Michael J.
Chaki Kakali
International Business Machines Corporation
Wood William H