Partial match partial output cache for computer arithmetic...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C711S123000, C711S128000, C708S605000

Reexamination Certificate

active

06542963

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a device and method for performing arithmetic operations with cache memory; more particularly a device and method having a cache memory capable of partial output.
2. Discussion of Related Art
Computers perform arithmetic operations such as multiple, divide by manipulating data through repeated shifts, additions, and subtractions. Numerous accesses of data and operational cycles are needed to arrive at the result. For more complex operations such as ‘square root’ and performing trigonometry functions, the traditional computing process is very time consuming and in some applications, it becomes not feasible to implement.
Some of the more complex arithmetic functions can be implemented by look-up tables, but the table must be large enough to accommodate many different operands. If there are differences in the operand to be operated upon, even a difference of a single bit, the look-up table would not work. The data from the look-up table would be erroneous. To obtain a correct result, the entire calculation must be performed for every digit of the operand.
Other implementations include use of dedicated circuitry. See, for example, U.S. Pat. No. 4,734,878 to Sutcliffe, “CIRCUIT FOR PERFORMING SQUARE ROOT FUNCTIONS”. Although less time consuming, dedicated circuitry require extra hardware and can be used only for the specified functions. That is, the Sutcliffe device can only perform square roots, a multiplier can only be used for multiplication, etc.
A cache memory is a high-speed random access memory (RAM) that a (CPU) can access more quickly than it can access regular memory. The cache memory is typically used to store data, which are frequently accessed by the CPU. The more operations a CPU can perform with data from cache memory, the speedier the operations can be performed as compared to operations with data accessed from regular memory.
The cache memory is directly accessed by the CPU (rather than via, for example, an external memory management unit) and is situated physically close to the CPU, some caches being on the CPU chip, to reduce the signal propagation time between the CPU and the cache memory. Descriptions of the operations of data cache such as in an Alpha AXP 21064 microprocessor can be found in “Computer Architecture A Quantitative Approach”, by David A. Patterson and John L. Hennessy, 1996, Morgan Kaufmann Publishers, Inc., pp. 380-383. When the CPU performs a read or a fetch operation, it will first check to see if the data or instructions it requests is in the cache. The data or instruction is accessed quickly if it is in the cache. Otherwise, the data or instructions must be fetched from regular memory (DRAM or ROM) and more time is consumed.
A conventional cache is illustrated in FIG.
1
. Cache memory
10
includes a cache tag
11
, which includes a plurality of buffers for storing data, addresses, or files frequently or recently accessed by the processor. Cache memory
10
includes circuitry for comparing newly input data against the data stored in the tag. As shown in FIG.
1
, an exclusive OR gate is associated with each bit of each buffer in cache tag
11
. When it is determined that the newly input data matches the data stored in one of the buffers in cache tag
11
, a true comparison is output at gates
15
,
16
, . . .
19
. Cache data
12
is a file which stores data corresponding to each buffer in cache tag
11
. Upon receiving a true comparison signal from a corresponding gate
15
,
16
, . . .
19
, the data in the corresponding location of cache data
12
is output as q4, q3, q2 . . . q0. The cache shown in
FIG. 1
is 5 bits wide.
It can be seen that the characteristics of the cache memory may be useful as a data-storing device for an arithmetic device to perform high-speed calculation. For example, the cache memory such as one shown in
FIG. 1
, stores operands in cache tag
11
and corresponding calculating results in cache data
12
. The arithmetic operation is bypassed if an operand on which the same arithmetic operation is to be performed is found in cache tag
11
. In such case, the cache is used as a look-up table, with access time reduced as compared to a look-up table ROM. However, as in the case of a look-up ROM, usage of the cache in a look-up table mode will cause reduction in overall throughput if the approach only works when the operands match exactly with stored cache tag data.
Accordingly, a need exists for a device having a cache memory used in place of an arithmetic unit, which is capable of partial output of calculation results for partially matched operands.
SUMMARY OF THE INVENTION
A method of performing arithmetic operation using a cache structure having a TAG register and a cache data memory, the method comprising the steps of storing in the TAG register a plurality of known operands, each having a corresponding resultant derived from an arithmetic operation, the resultant being stored in a corresponding location in the cache data memory; inputting a new operand on which the arithmetic operation is to be performed to the TAG register; comparing the new operand to the plurality of the known operands; and determining if the new operand matches any of the known operands and if there is a match, outputting from the cache data memory the known resultant corresponding to the known operand that matched the new operand.
The method preferably further including the steps of: if there is no match in the determining step, determining if there are partial matches between the known operands and the new operand; selecting the known operand having the MSBs matching the MSBs of the new operand; accessing from the cache data memory the resultant corresponding to the known operand having the most matched MSBs; determining a first portion of the accessed resultant derived from the matched MSBs of the known operand; masking a second portion of the accessed resultant which is other than the first portion; and providing as a first partial resultant the masked resultant.
The method still further preferably including the steps of performing the arithmetic operation on bits of the new operand other than the matched MSBs to produce a second partial resultant; and outputting a complete resultant by combining the first partial resultant with the second partial resultant.
Advantageously, the illustrative methods according to the present invention are applicable to arithmetic operations such as square root, division, or multiplication.
A cache device is also provided for performing arithmetic operation comprising: a tag register for storing a plurality of known operands, each having a corresponding resultant derived from an arithmetic operation; A cache data memory for storing resultants corresponding to the known operands; Comparison logic in the tag register for comparing a new operand on which the arithmetic operation is to be performed against the plurality of the known operands, wherein if the new operand matches any of the known operands, the cache data memory outputs the known resultant corresponding to the known operand that matched the new operand.
The cache device preferably further including: a logic group detector for detecting a group of data having a common logic level, the logic group detector for detecting the known operand having the highest number of MSBs matching the MSBs of the new operand; masking logic for partially masking the resultant corresponding to the known operand having the most matched MSBs, the masking logic for passing a first portion of the corresponding resultant derived from the matched MSBs of the known operand and for masking a second portion of the corresponding resultant which is other than the first portion; and a multiplexer for multiplexing the first portion of the corresponding resultant into an arithmetic unit.
The cache device according to the present invention, further preferably including a second multiplexer, wherein the arithmetic unit performs the arithmetic operation on bits of the new operand other than the match

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Partial match partial output cache for computer arithmetic... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Partial match partial output cache for computer arithmetic..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Partial match partial output cache for computer arithmetic... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3042008

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.