Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1998-05-18
2001-09-11
Kim, Matthew (Department: 2186)
C711S118000
active
06289417
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the field of data processing. More particularly, this invention relates to data processing systems that include a register bank having a plurality of registers that store operand values to be used by an execution unit in performing a data manipulation under program instruction control.
2. Description of the Prior Art
It is known to provide data processing systems having a register bank made up of a plurality of registers each storing data values to be manipulated or to control manipulation. The operand values within these registers are supplied to an execution unit via read ports of the register bank with the result being written back via one or more write ports to the register bank. Considerations of circuit layout and area vs cost and/or speed mean that it is desirable to provide a limited number of read ports from the register bank that can provide simultaneous read access to different registers within the register bank. Certain program instructions may require more input operands to be read from the register bank than there are available read ports to read those operands within a single read cycle. In this case, instruction execution is slowed by the need to provide multiple read cycles of the register bank.
An example of a data processing system of the above type is the ARM7 microprocessor produced by Advanced RISC Machines Limited of Cambridge, England. This microprocessor has a register bank having two read ports. However, some program instructions require three input operands to be read from the registers of the register bank. An example of such a program instruction is one that multiplies the contents of two registers together, with one of those input operands first being shifted by an amount that is specified as a data value stored within another register of the register bank. In this case, two read cycles are required to recover the three necessary input operands from the register bank via the two read ports.
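The arithmetic behind this example can be sketched in a few lines. This is only an illustrative model, not part of the patent text: the function name `read_cycles` is hypothetical, and it assumes every read port can deliver one operand per read cycle.

```python
import math


def read_cycles(num_operands: int, num_read_ports: int) -> int:
    """Read cycles needed to fetch all operands from the register bank.

    Assumption: each read port supplies one operand per read cycle.
    """
    if num_read_ports <= 0:
        raise ValueError("the register bank needs at least one read port")
    return math.ceil(num_operands / num_read_ports)


# Three operands through two read ports, as in the ARM7 example above.
print(read_cycles(3, 2))  # 2
```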
SUMMARY OF THE INVENTION
Viewed from one aspect the present invention provides an apparatus for processing data, said apparatus comprising: a register bank having a plurality of registers for storing data values to be manipulated, said register bank having X read ports for allowing simultaneous reading of data values from X registers during a read cycle;
an execution unit for performing at least one data processing operation, specified by a data processing instruction, that requires Y input operands from Y registers, where said data processing instruction specifies which Y registers are to supply said Y input operands;
one or more cache registers for storing as cached data values copies of data values stored within one or more corresponding ones of said plurality of registers of said register bank; and
a cache register controller coupled to said one or more cache registers for detecting if said data processing instruction specifies within said Y input operands any data values stored in a register of said register bank of which a copy is stored in said one or more cache registers as a cached data value, and, if so, supplying said cached data value to said execution unit as said one of said Y input operands rather than reading said data value from said register bank, whereby the number of data values that require reading from said register bank via said X read ports is reduced.
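The arrangement recited above can be illustrated with a small behavioural model. This is a sketch under stated assumptions rather than the claimed hardware: the class `OperandSupply` and its methods are hypothetical names, and refreshing the cached copy on a register write is an assumed coherence policy that this passage does not specify.

```python
import math
from dataclasses import dataclass, field


@dataclass
class OperandSupply:
    """Toy model: a register bank with X read ports plus cache registers."""
    registers: list                            # data values in the register bank
    read_ports: int                            # X
    cache: dict = field(default_factory=dict)  # register index -> cached copy

    def cache_register(self, index):
        """Store a copy of register `index` in a cache register."""
        self.cache[index] = self.registers[index]

    def write(self, index, value):
        """Write a result back; refresh any cached copy (assumed policy)."""
        self.registers[index] = value
        if index in self.cache:
            self.cache[index] = value

    def fetch(self, operand_registers):
        """Return (operand values, read cycles spent on the register bank)."""
        values = []
        bank_reads = 0
        for index in operand_registers:
            if index in self.cache:   # controller detects a cached copy
                values.append(self.cache[index])
            else:                     # otherwise the operand uses a read port
                values.append(self.registers[index])
                bank_reads += 1
        return values, math.ceil(bank_reads / self.read_ports)


supply = OperandSupply(registers=[10, 20, 3, 0], read_ports=2)
print(supply.fetch([0, 1, 2]))  # ([10, 20, 3], 2)
supply.cache_register(2)        # cache the third operand, e.g. a shift amount
print(supply.fetch([0, 1, 2]))  # ([10, 20, 3], 1)
```

With X=2 and Y=3, caching a single operand leaves only two values to be read through the read ports, so the model reports one read cycle instead of two.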
The invention recognises that in many cases one of the data values being read from a register as an input operand is re-used over many subsequent program instructions (not necessarily immediately following) and accordingly a significant increase in performance can be achieved by caching one of the input operands outside of the register bank so as to reduce the number of data values that need to be read through the read ports of the register bank. An example of the type of data processing in which this may be highly useful is image data manipulation upon large blocks of pixel data, for which it is common to apply the same shift value specified within a register to many hundreds or thousands of pixel values that are being manipulated. If this shift value can be cached outside of the register bank for supply to a shifter, then the number of read cycles for each instruction can be reduced, resulting in a large increase in performance. This increase in performance is sufficient to justify the additional overhead of the hardware required to provide and control this register caching capability. Avoiding a repeated register read may also save power. It will be appreciated that the way in which a data processing instruction specifies a register to be used may be explicit (e.g. a field within the instruction) or implicit (e.g. a stack push instruction that always uses a fixed register to provide the stack pointer).
The invention reduces the demands made upon the read ports of the register bank. The read ports may not all be quick or have appropriate timing requirements, and so it can be advantageous to avoid having to use slow or difficult-to-time read ports through use of the invention. In the case where Y>X, the invention is especially useful, as in this case there are insufficient read ports available to allow all the operands to be read from the register bank together, and so having some of these values cached outside the register bank can reduce the bottleneck.
Whilst the reduction of the number of reads that need to be made through the read ports of the register bank is desirable in general, it is particularly advantageous in embodiments in which Y=X+1 such that, when a cached data value from said one or more cache registers is used rather than reading a corresponding data value from said register bank, the number of data values that require reading from said register bank is reduced to X data values that may all be read in a single read cycle.
In the above circumstance, the provision of the cache registers of the present invention serves to reduce the number of read cycles required from two to one by taking the number of operands that need to be recovered from the register bank from just above the number of read ports available to a number equal to the number of read ports that are available to read those data values in a single cycle.
The cached operand value may be used as an input to many different functional units that form part of the execution unit, e.g. a multiplier or an ALU. However, the invention is particularly advantageous in embodiments in which said execution unit includes a shifter for shifting one of said Y input operands by a shift amount specified by another of said Y input operands, said one or more cache registers operating to cache said shift amounts.
As mentioned above, processing operations in which the same shift value (or a small set of shift values) is applied to a very large number of successive data processing operations are relatively common and performance critical and so the present invention is particularly useful in these circumstances.
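To give a feel for how the saving accumulates over a long run of instructions that share one shift value, the following sketch totals the register bank read cycles with and without a cached shift amount. It is an illustration only: the function name `block_read_cycles` is hypothetical, and it assumes that the first instruction reads every operand from the register bank (filling the cache register as it does so) while each later instruction takes one operand from the cache.

```python
import math


def block_read_cycles(num_instructions, operands, read_ports, reuse_cached_operand):
    """Total register bank read cycles over a run of similar instructions."""
    full = math.ceil(operands / read_ports)
    if not reuse_cached_operand or num_instructions == 0:
        return num_instructions * full
    # Assumption: the first instruction pays the full cost and fills the cache;
    # every later instruction reads one operand fewer from the register bank.
    reduced = math.ceil((operands - 1) / read_ports)
    return full + (num_instructions - 1) * reduced


# 1000 shift-and-operate instructions, three operands each, two read ports.
print(block_read_cycles(1000, 3, 2, reuse_cached_operand=False))  # 2000
print(block_read_cycles(1000, 3, 2, reuse_cached_operand=True))   # 1001
```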
The potential complications of introducing the cache register controller into what might be a critical path within the operation of the system can be reduced in embodiments in which there is provided an instruction pipeline along stages of which data processing instructions are passed before reaching an execution stage at which said data processing instructions are executed by said execution unit, said cache register controller being responsive to said data processing instructions within a stage of said instruction pipeline before said execution stage such that said detection by said cache register controller for a given data processing instruction can be commenced prior to said given data processing instruction reaching said execution stage.
Within a pipelined system, the cache register controller can read the data indicating which registers are specified as supplying the input operands before the relevant data processing instruction reaches the execution stage of the pipeline so providing sufficie
ARM Limited
Chace Christian P.
Kim Matthew
Nixon & Vanderhye P.C.