Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Reexamination Certificate
2000-05-04
2004-02-03
Ngo, Chuong Dinh (Department: 2124)
active
06687724
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an information processor used to process data, such as a general-purpose processor, central processing unit (CPU), media processor, digital signal processor (DSP) or the like.
2. Description of the Related Art
With the spread of multimedia, processors that handle digital data, such as CPUs and DSPs, are required to perform digital-filter operations more and more frequently. Since a digital-filter operation is an inner-product operation, it is computed using the following arithmetic expression:
$$\sum_{i=0}^{n} C_i \times X_i \qquad (1)$$
For efficient inner-product operation, recent CPUs, DSPs, etc. incorporate a multiply and accumulate (MAC) unit. The construction of a CPU incorporating a MAC unit is shown in FIG. 1.
As shown in FIG. 1, the CPU is generally indicated with reference numeral 100. The CPU 100 includes a register file 101 to store a plurality of data, a MAC unit 102 to perform an inner-product operation on the data, a shift (SHIFT) unit 103 to shift the data to the right and left, and an arithmetic logic unit (ALU) 104 to perform arithmetic and logical operations on the data. For an inner-product operation by the CPU 100, the data stored in the register file 101 are multiplied and accumulated by the MAC unit 102, and the result of the multiplication and accumulation is stored back into the register file 101. The data stored in the register file 101 are then repeatedly multiplied and accumulated by the MAC unit 102 to provide the result of the inner-product operation.
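The repeated multiply-and-accumulate described above can be sketched in software as follows. This is only a minimal illustration of expression (1); the function name `mac_inner_product` and the use of a Python integer as the accumulator are assumptions made for the example and do not describe the actual MAC unit 102.

```python
def mac_inner_product(coeffs, samples):
    """Accumulate coeffs[i] * samples[i] one term at a time (hypothetical helper)."""
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")
    acc = 0  # accumulator; stands in for the value kept in the register file
    for c, x in zip(coeffs, samples):
        acc += c * x  # one multiply-and-accumulate step
    return acc


# Example: C = (1, 2, 3, 4), X = (5, 6, 7, 8)
result = mac_inner_product([1, 2, 3, 4], [5, 6, 7, 8])
```

Each loop iteration corresponds to one pass through the MAC unit, which is why a long filter costs one step per coefficient in this scalar form.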
Recent processors used in workstations, personal computers and the like are designed to perform single-instruction, multiple-data (SIMD) operations in units of a sub-word in order to speed up image processing and sound processing. In a SIMD operation, word-long data (one word is 32 or 64 bits long) stored in the register file is divided into a plurality of data items, each of a predetermined number of bits, for arithmetic operation. Each data item resulting from the division of the word-long data is called a “sub-word”.
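The division of a word into sub-words can be illustrated with plain integer arithmetic. The sketch below assumes a 64-bit word and 16-bit sub-words, with sub-word 0 in the least significant bits; the helper names `unpack_subwords` and `pack_subwords` are hypothetical and chosen only for this example.

```python
SUBWORD_BITS = 16   # assumed sub-word width
WORD_BITS = 64      # assumed word width
SUBWORD_MASK = (1 << SUBWORD_BITS) - 1


def unpack_subwords(word):
    """Split a 64-bit word into four 16-bit sub-words, least significant first."""
    count = WORD_BITS // SUBWORD_BITS
    return [(word >> (i * SUBWORD_BITS)) & SUBWORD_MASK for i in range(count)]


def pack_subwords(subwords):
    """Combine 16-bit sub-words (least significant first) into one word."""
    word = 0
    for i, sub in enumerate(subwords):
        word |= (sub & SUBWORD_MASK) << (i * SUBWORD_BITS)
    return word


word = pack_subwords([1, 2, 3, 4])
lanes = unpack_subwords(word)
```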
The digital-filter operation, that is, the inner-product operation, can be done faster by combining the division of data into sub-words with an inner-product operational unit that performs SIMD operations. The digital-filter operation is used for image processing and sound processing, among others, and in many cases it is performed continuously on a series of data. Thus, to perform a digital-filter operation as a SIMD operation, the source data to be processed and the coefficient data by which the source data is multiplied are stored in units of a sub-word into input registers of the inner-product operational unit.
A typical inner-product operation of the SIMD type will be explained below with reference to FIG. 2. The input registers of the inner-product operational unit are supplied with, for example, 64-bit source data and 64-bit coefficient data, each in units of a 16-bit sub-word. The source data, consisting of four 16-bit sub-words X0, X1, X2 and X3 counted from the least significant bit (LSB), is stored into a first input register 111. The coefficient data, consisting of four 16-bit sub-words C0, C1, C2 and C3 counted from the most significant bit (MSB), is stored into a second input register 112. On a multiply and accumulate (MAC) instruction (pmaddwd), the inner-product operational unit multiplies and accumulates the source data consisting of the four 16-bit sub-words and the coefficient data correspondingly consisting of four 16-bit sub-words, and stores the result of the multiplication and accumulation (product-sum) into a first intermediate register 113. As shown in FIG. 2, X2×C2+X3×C3 is stored at the higher 32 bits (two sub-words) of the first intermediate register 113, while X0×C0+X1×C1 is stored at the lower 32 bits (two sub-words) of the first intermediate register 113. Next, on a data-transfer instruction (movq), the inner-product operational unit copies the content of the first intermediate register 113 to a second intermediate register 114. Then, on a shift instruction (psrlq), the inner-product operational unit shifts the data in the first intermediate register 113 to the right by one sub-word, that is, by 32 bits (namely, shifts the data from the higher place to the lower place). Further, on an add instruction (paddd), the inner-product operational unit adds the higher 32 bits and lower 32 bits in the first and second intermediate registers 113 and 114, and stores the result of the addition at the higher 32 bits and lower 32 bits, respectively, in an output register 115.
As the result of the arithmetic operation, X0×C0+X1×C1+X2×C2+X3×C3, the result of the inner-product operation by the SIMD type operation, is stored at the lower 32 bits of the output register 115. Note that the data stored at the higher 32 bits of the output register 115 are not part of the inner-product result.
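The four-instruction sequence described with reference to FIG. 2 can be modeled on 64-bit integers as a rough sketch. The mnemonics pmaddwd, movq, psrlq and paddd are taken from the text; the Python functions below are simplified software models written for this example, not the hardware itself. The sketch assumes that sub-word i of the source data and sub-word i of the coefficient data occupy the same lane, with lane 0 at the least significant bits, and that sub-words are signed 16-bit values.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= MASK16
    return value - 0x10000 if value & 0x8000 else value


def pack16(lanes):
    """Pack four 16-bit lanes into a 64-bit word, lane 0 at the LSB (assumed layout)."""
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & MASK16) << (16 * i)
    return word


def pmaddwd(a, b):
    """Model: multiply matching 16-bit lanes, add adjacent products into 32-bit halves."""
    out = 0
    for half in range(2):
        total = 0
        for k in range(2):
            shift = 16 * (2 * half + k)
            total += to_signed16(a >> shift) * to_signed16(b >> shift)
        out |= (total & MASK32) << (32 * half)
    return out


def psrlq(a, count):
    """Model: logical right shift of the whole 64-bit word."""
    return (a & MASK64) >> count


def paddd(a, b):
    """Model: add the two 32-bit halves independently, wrapping on overflow."""
    out = 0
    for half in range(2):
        lane_sum = ((a >> (32 * half)) & MASK32) + ((b >> (32 * half)) & MASK32)
        out |= (lane_sum & MASK32) << (32 * half)
    return out


def inner_product_4(x_lanes, c_lanes):
    """Follow the register flow 111/112 -> 113 -> 114 -> 115 from the description."""
    reg111 = pack16(x_lanes)            # source data
    reg112 = pack16(c_lanes)            # coefficient data
    reg113 = pmaddwd(reg111, reg112)    # two partial product-sums
    reg114 = reg113                     # movq: copy 113 to 114
    reg113 = psrlq(reg113, 32)          # move the upper partial sum down
    reg115 = paddd(reg113, reg114)      # lower half now holds the full sum
    low = reg115 & MASK32
    return low - (1 << 32) if low & 0x80000000 else low
```

Four instructions and three working registers are needed here to obtain a single 32-bit result from four products.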
A processor used in a workstation, personal computer, etc. frequently has to perform continuous digital-filter operations on source data such as a series of images or sounds. In one arrangement for such a continuous digital-filter operation, there are provided a source-data input register and a plurality of coefficient-data input registers in which coefficient data shifted by one sub-word from each other are stored. Each time an inner-product operation instruction is issued, coefficient data whose bit positions have been shifted is read from one of the coefficient-data input registers, and the source data, whose bit positions are fixed, is multiplied by the shifted coefficient data, thereby permitting the digital-filter operation to be done at a high speed. In another arrangement, there are provided a coefficient-data input register and a source-data input register constructed as a shift register capable of storing two words of data. Each time an inner-product operation instruction is issued, source data whose bit positions have been shifted by one sub-word is read, and the coefficient data, whose bit positions are fixed, is multiplied by the shifted source data, likewise permitting the digital-filter operation to be done at a high speed.
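The shift-register arrangement described above, in which the coefficient data stays fixed while the source data moves by one sub-word per instruction, can be sketched behaviorally as follows. The function name `filter_two_words`, the list representation of the register, the four-sub-word word size and the zero fill on shifting are all assumptions made for illustration; the text does not specify how the shift register is refilled.

```python
SUBWORDS_PER_WORD = 4  # assumed: four 16-bit sub-words per 64-bit word


def filter_two_words(low_word, high_word, coeffs):
    """Slide fixed coefficients over a two-word source shift register.

    low_word and high_word are lists of sub-words, least significant first.
    Returns one inner product per one-sub-word shift position.
    """
    shift_reg = list(low_word) + list(high_word)  # two words of source data
    outputs = []
    for _ in range(SUBWORDS_PER_WORD + 1):
        window = shift_reg[:SUBWORDS_PER_WORD]    # lowest word is the operand
        outputs.append(sum(c * x for c, x in zip(coeffs, window)))
        shift_reg = shift_reg[1:] + [0]           # shift by one sub-word, zero fill (assumed)
    return outputs


outputs = filter_two_words([1, 2, 3, 4], [5, 6, 7, 8], [1, 1, 1, 1])
```

With two words of four sub-words each, five window positions contain only valid source data, so the sketch produces five filter outputs before the register would need new data.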
The inner-product operation has been described in the foregoing. The SIMD type operation can likewise be applied to the arithmetic and logical operations, such as addition and subtraction, performed by the common ALU.
However, the above-mentioned arithmetic operations have the disadvantages described below.
For example, when a series of arithmetic operations is done, the results of the operations are stored in a plurality of intermediate registers and an output register. That is, many registers are required for this data storage.
Also, even when an arithmetic operation is done as a SIMD type operation, the result of the operation is given in units of a word, not in units of the sub-word in which the data was stored into the input register. Thus, when SIMD type operations are done in succession, word-long source data has to be re-formed into sub-words by shifting the bit positions of the output data and packing the data, which increases the number of operation cycles. In addition, the amount of program code increases, and so does the required size of the program memory.
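The re-forming step mentioned above can be sketched as an extra shift-and-pack pass over the 32-bit results. The function name `repack_results`, the scaling shift amount and the use of signed saturation to 16 bits are assumptions for this example; the text states only that the output data is shifted and packed.

```python
def saturate16(value):
    """Clamp value to the signed 16-bit range (assumed narrowing rule)."""
    return max(-32768, min(32767, value))


def repack_results(results32, shift):
    """Shift four 32-bit results right and pack them as 16-bit sub-words.

    results32 holds signed integers, least significant lane first.
    Returns a 64-bit word with sub-word 0 at the LSB.
    """
    word = 0
    for i, result in enumerate(results32):
        sub = saturate16(result >> shift)   # arithmetic shift, then narrow
        word |= (sub & 0xFFFF) << (16 * i)  # place into lane i
    return word


packed = repack_results([256, 512, 768, 1024], 8)
```

Every such pass costs additional instructions between consecutive SIMD operations, which is the overhead in cycles and code size noted above.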
OBJECT AND SUMMARY OF THE INVENTION
It is therefore an object of the present invention to overcome the above-mentioned drawbacks of the prior art by providing an information processor in which the result of an arithmetic operation can be provided as sub-words each having an arbitrary data length and thus the operation can be completed with a reduced number of execution cycles.
According to t
Mogi Yukihiko
Nishibori Kazuhiko
Ngo Chuong Dinh
Sonnenschein Nath & Rosenthal LLP
Sony Corporation