Microprocessor with an instruction level reconfigurable...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S120000, C711S128000, C711S170000, C711S173000

active

06223255

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of Invention
This invention relates to a microprocessor having a reconfigurable n-way cache to provide increased bandwidth for signal processing as well as general purpose applications.
2. Description of Related Art
There is a fundamental difference in the way microprocessors and digital signal processors (DSPs) are designed and used in system realization. Whereas microprocessors are designed to execute general purpose applications as efficiently as possible, DSPs are designed to execute only specific applications (such as speech processing) as efficiently as possible. Systems based on microprocessors are designed to run any general application. Some of these applications may not be run on the system until years after the system was shipped. On the other hand, systems based on a DSP are designed to run, in general, only a small set of specific applications, e.g., a telephone answering machine runs only a specific application throughout its lifetime. Once a system based on a DSP is shipped, typically, no new applications are run on it.
Due to this difference in the way microprocessors and DSPs are used, the design styles for these two types of processors have evolved quite differently. However, both processors are designed to provide high performance cost effectively.
Many conventional processors have multi-ported register files, and are therefore capable of providing two or more operands contained in registers to the execution unit (EU) every cycle. The register files are contained on the same integrated circuit as the arithmetic logic unit (ALU), and are very fast devices for providing the desired data. For example, referring to FIG. 1, a typical prior-art microprocessor 100 includes an instruction register 101 that supplies a first address (ADDR0) to a first register file 102, and a second address (ADDR1) to a second register file 103. The register files 102 and 103 illustratively have 32 entries of 32 bits each. The first register file 102 supplies a first operand to a first operand register 104. The second register file 103 supplies a second operand to a second operand register 105. The registers 104 and 105 supply the first and second operands to the arithmetic logic unit (ALU) 106, which may perform various arithmetic operations, illustratively including a multiply-accumulate (MAC) operation. The result is stored in the result register 107, and may be written back into the register files via a signal line 108. In an alternate embodiment, a single dual-ported register file (not shown) is used in lieu of the two register files 102 and 103. In that case, two read ports allow simultaneous access to any two entries in the register file.
Although a register file provides efficient temporary storage, memory organization plays a critical role in determining the performance of microprocessors and DSPs, because performance is determined by how efficiently instructions and data are accessed from the memory. Because the speed of discrete memories has not kept pace with processor speeds, on-chip storage is typically provided for both instructions and data. Microprocessors and DSPs differ in the way in which this on-chip memory is organized.
There are many instances where it is necessary to supply two operands, contained in memory, that are not already in the on-chip registers. An example is the multiply-accumulate instruction, which is one of the basic primitives of signal processing. A typical instruction is
MAC x, y, a0
where MAC is the mnemonic for the instruction “multiply accumulate” and the operation specified is:
a0 = a0 + (x * y)
Typically, x and y belong to specific arrays in the memory. For example, x may be located in a coefficient array and y may be located in a data array.
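As a rough illustration of the operation above, the following Python sketch applies a0 = a0 + (x * y) once per element of a coefficient array and a data array. The function names `mac` and `dot_product` and the sample values are hypothetical and are not taken from the patent; the sketch models only the arithmetic, not the timing of any hardware.

```python
def mac(a0, x, y):
    """Model one multiply-accumulate step: a0 = a0 + (x * y)."""
    return a0 + x * y


def dot_product(coefficients, data):
    """Accumulate coefficient/data products, one MAC step per element pair."""
    a0 = 0  # accumulator, cleared before the loop
    for x, y in zip(coefficients, data):
        # Each step needs two memory operands: one from each array.
        a0 = mac(a0, x, y)
    return a0


# Hypothetical 3-tap coefficient array and input samples.
coefficients = [1, 2, 3]
data = [4, 5, 6]
result = dot_product(coefficients, data)
```

Each loop iteration consumes one operand from each array, which is why sustaining one MAC per cycle requires two memory operands per cycle.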
The two memory operands x and y are typically contained in an on-chip data memory, if available, or in a memory external to the microprocessor chip. In either case, supplying two operands to the ALU every cycle implies dual-porting the data memory.
FIG. 2 shows an example of a DSP 200 having two banks of on-chip memory. An instruction register 201 supplies first and second addresses (ADDR0, ADDR1) to a first bank 202 and a second bank 203 of the RAM, where each bank 202 and 203 is illustratively 1 kilobyte in size. The data is written to the RAM via a write line 213. The first operand is read from the bank 202 and output to a multiplexer 204. Similarly, the second operand is read from the second bank 203 and output to a multiplexer 206. Assuming the multiplexers 204 and 206 select the outputs of the RAM banks 202 and 203, the first operand is then latched into a first operand register 205, while the second operand is then latched into a second operand register 207. Alternatively, the operands may be selected by the multiplexers 204 and 206 from an external memory bus 212.
The operands are then provided from the operand registers 205 and 207 to the ALU/MAC unit 208, where they are multiplied together and added to the previous result accessed from an accumulator file 210 via a second line 214. The result is provided to the result register 209 and stored in the accumulator file 210.
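The FIG. 2 data path can be sketched, very loosely, as a two-bank memory that delivers one operand from each bank per step to a multiply-accumulate unit. In the Python model below, the class name `DualBankDsp`, the method `mac_cycle`, and the choice to model each bank as 1024 integer entries are assumptions made for illustration; the external memory bus and the multiplexers are omitted.

```python
BANK_ENTRIES = 1024  # assumption: one integer entry per byte of a 1-kilobyte bank


class DualBankDsp:
    """Simplified behavioral model of the two-bank DSP described for FIG. 2."""

    def __init__(self):
        self.bank0 = [0] * BANK_ENTRIES  # stands in for RAM bank 202
        self.bank1 = [0] * BANK_ENTRIES  # stands in for RAM bank 203
        self.accumulator = 0             # stands in for one accumulator file entry

    def mac_cycle(self, addr0, addr1):
        """Read one operand from each bank, multiply, and accumulate."""
        operand0 = self.bank0[addr0]  # value latched into operand register 205
        operand1 = self.bank1[addr1]  # value latched into operand register 207
        self.accumulator += operand0 * operand1
        return self.accumulator


# Hypothetical usage: coefficients in one bank, data in the other.
dsp = DualBankDsp()
dsp.bank0[0:3] = [1, 2, 3]
dsp.bank1[0:3] = [4, 5, 6]
for i in range(3):
    dsp.mac_cycle(i, i)
```

Because the two operands live in separate banks, both reads can be modeled as happening in the same step without dual-porting either bank. The cost is that the program must decide in advance which array goes in which bank.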
Although this technique provides for the multiply/accumulate function within a conventional DSP architecture, the approach has disadvantages. For example, since the on-chip memory is configured as RAM rather than as a cache memory, only selected applications can utilize it. All the data addresses in the memory have to be determined when the application program is developed. Thus, conventional microprocessor applications cannot make flexible use of this memory. Furthermore, it is difficult to run applications from different vendors that are installed in the field.
Since any application may be run on a microprocessor-based system, its characteristics are not known in advance. On-chip caches are conventionally used in microprocessors to improve performance. The cache works based on temporal locality and spatial locality. Temporal locality means that once a given memory location is used, it is likely to be used again in the near future. Spatial locality means that once a memory location is used, locations in its vicinity are likely to be used in the near future.
FIG. 3 shows a schematic diagram of a 2-way set-associative cache and how it is addressed, as described in Computer Architecture: A Quantitative Approach, J. L. Hennessy and D. A. Patterson, Morgan Kaufmann Publishers, Inc., pp. 408-414, 1990 (Computer Architecture). The cache includes data portions 305 and 306 and tag portions 307 and 308. The cache has n blocks or lines. A block typically includes more than one byte of storage. A byte within a block is addressed by the block offset field 304 of the address 301. For example, if the block size is 8 bytes, the block offset field is 3 bits. The index field 303 of the address 301 is used to select the set in the cache. Each set in a 2-way associative cache has two blocks. The block frame address 302 is stored in the tag portion associated with the data portion where the block is stored. When a cache block is first written, a set is specified by the index 303 portion of the address. The block within the set is determined by a selection algorithm, such as random replacement or least recently used (LRU). Once a block is selected, the block frame address 302 is written in the tag portion 307 or 308 and the block from memory is written in the data portion 305 or 306 corresponding to the selected block. A special bit is provided in the tag portions 307 and 308 to indicate that a given entry in the cache contains valid data. In general, there are other control bits in the tag portions 307 and 308 to store other information, such as privilege level, etc.
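The addressing scheme described for FIG. 3 can be sketched as follows. This is a minimal Python model under stated assumptions: the 8-byte block size comes from the text, while the number of sets (4), the names `split_address` and `TwoWayCache`, and the choice of LRU over random replacement are hypothetical. The upper address bits stored in the tag portion correspond to what the text calls the block frame address 302.

```python
BLOCK_SIZE = 8   # bytes per block, so the block offset field is 3 bits
OFFSET_BITS = 3
NUM_SETS = 4     # assumption for illustration; gives a 2-bit index field
INDEX_BITS = 2


def split_address(address):
    """Split an address into (tag, index, block offset) fields."""
    offset = address & (BLOCK_SIZE - 1)
    index = (address >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class TwoWayCache:
    """Behavioral model of a 2-way set-associative cache with LRU replacement."""

    def __init__(self):
        # Each set has two ways; each way has a valid bit, a tag, and a data block.
        self.sets = [
            [{"valid": False, "tag": None, "data": None} for _ in range(2)]
            for _ in range(NUM_SETS)
        ]
        self.lru = [0] * NUM_SETS  # least recently used way in each set

    def lookup(self, address):
        """Return the addressed byte on a hit, or None on a miss."""
        tag, index, offset = split_address(address)
        for way, entry in enumerate(self.sets[index]):
            if entry["valid"] and entry["tag"] == tag:
                self.lru[index] = 1 - way  # the other way becomes LRU
                return entry["data"][offset]
        return None

    def fill(self, address, block):
        """Write a block into the selected set and return the way used."""
        tag, index, _ = split_address(address)
        ways = self.sets[index]
        # Prefer an invalid way; otherwise replace the least recently used way.
        way = next((w for w, e in enumerate(ways) if not e["valid"]), self.lru[index])
        ways[way] = {"valid": True, "tag": tag, "data": list(block)}
        self.lru[index] = 1 - way
        return way
```

A lookup compares the stored tag in both ways of the indexed set and checks the valid bit; a fill writes the tag and the block into the way chosen by the replacement policy. Other control bits mentioned in the text, such as privilege level, are not modeled.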
At a later time, the processor may request data at a specified memory address 301. In order to check whether a specific dat
