Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Reexamination Certificate
2001-03-19
2001-12-04
Mai, Tan V. (Department: 2121)
C708S505000, C708S507000
active
06327605
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a data processor specialized for inner product operations or matrix operations, and further to a data processing system suited to three-dimensional graphics control. More particularly, it relates to a technology effective when applied to a data processor that executes applications making frequent use of, for example, floating point vectors or matrices of length 4 or less.
In three-dimensional graphics and the like, matrix operations employing a 4×4 transform matrix are often used for rotation, expansion, contraction, perspective projection, parallel translation and so forth of a graphic pattern, and inner product operations are also used for determining the brightness of a light receiving surface and so forth. Such matrix operations and inner product operations require repeated multiply-add operations. As for the data handled in three-dimensional graphics, floating point numbers have conventionally been used in high-end systems. Even in fields with severe cost constraints, such as game machines and handheld PCs, the data handled is shifting from integers to floating point numbers. This is because floating point numbers facilitate programming and are suited to higher level processing.
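The repetition of multiply-add operations described above can be illustrated with a minimal sketch. The function names and the example translation matrix are assumptions chosen only for illustration and do not appear in the specification. Each output element of a 4×4 transform is one inner product of length 4, so the whole transform takes sixteen two-input multiply-add steps when they are executed one at a time.

```python
def inner_product(a, b):
    """Inner product built from repeated two-input multiply-add steps."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = x * y + acc  # one multiply-add, (A x B) + C, per element
    return acc


def transform(matrix, vector):
    """Apply a 4x4 transform matrix to a length-4 vector.

    Each output element is the inner product of one matrix row with the vector.
    """
    return [inner_product(row, vector) for row in matrix]


# Hypothetical example: parallel translation by (2, 3, 4) in homogeneous coordinates.
translate = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 3.0],
    [0.0, 0.0, 1.0, 4.0],
    [0.0, 0.0, 0.0, 1.0],
]
point = [1.0, 1.0, 1.0, 1.0]
moved = transform(translate, point)
```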
2. Description of the Related Art
A multiply-add unit is designed to perform the operation (A×B)+C as a single function. For example, “PA-8000 Combines Complexity and Speed”, Microprocessor Report, Vol. 8, No. 15, Nov. 14, 1994, pages 6 to 9, discloses a processor employing multiply-add units, in which the parallelism of the multiply-add units is two.
“Nikkei Electronics” (Nikkei PB K.K.) No. 653, Jan. 15, 1996, pages 16 to 17, discloses a semiconductor integrated circuit in which a three-dimensional drawing function is integrated on one chip. The disclosed semiconductor integrated circuit incorporates a multiply-add unit that operates on eight fixed point data items in one cycle. The publication also discloses that a coordinate transformation using a 4×4 matrix can be processed in two cycles.
On the other hand, JP-A-64-3734 discloses a multiplier circuit constituted of four multipliers and an adder that sums the outputs of the four multipliers with their digits matched. Since the multiplier circuit is adapted to multiplication of basic word length and double word length, its digit matching function is specialized for that process alone, and thus an inner product operation on floating point numbers cannot be performed.
JP-A-5-150944 discloses a digital signal processor having a plurality of multiply-add units and means for connecting them to one another. The digital signal processor is adapted for integer data.
Also, JP-A-5-216657 discloses a high speed processor for digital signal processing. It discloses geometry processing by the high speed processor employing a multiply-add unit for floating point numbers.
On the other hand, JP-A-5-233228 discloses a floating point arithmetic unit and an operation method thereof, including means for reducing the size of a floating point unit. However, since the disclosed system halves the multiplying array and uses it twice, the performance is halved. Since the components other than the multiplying array are not reduced in size, the area-to-performance ratio of the floating point unit is lowered.
None of the publications set forth above gives any consideration to speeding up 4×4 matrix operations or inner product operations.
SUMMARY OF THE INVENTION
The inventor has studied how to speed up matrix operations and inner product operations employing floating point numbers. As a result, it has been found that, since a floating point multiply-add unit has a large circuit scale, simply arranging such units in parallel increases the circuit scale significantly. Thus, as disclosed in the first publication, “PA-8000 Combines Complexity and Speed”, the attainable parallelism is about two, which limits the speed-up. On the other hand, according to the disclosure of the second publication, “Nikkei Electronics”, a coordinate transformation using a 4×4 matrix can be processed in two cycles, which achieves a speed-up to a certain extent. However, because integer multiply-add units having a small number of bits are used, a sacrifice in the precision of operation is inherent.
An object of the present invention is to provide a data processor which can speed up matrix operations and inner product operations employing floating point numbers.
Another object of the present invention is to provide a data processor which can perform matrix operations or inner product operations employing floating point numbers at high precision and high speed.
A typical one of the inventions disclosed in the present application will be briefly explained as follows.
Namely, a data processor comprises an arithmetic portion incorporated in a floating point unit, the arithmetic portion including: a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a respectively different data input signal line group and multiplying the supplied mantissa parts together; an aligner receiving the outputs of the respective multipliers and performing an alignment shift; an exponent processing portion for generating the alignment shift amounts for the aligner and an exponent before normalization on the basis of the exponent parts of the floating point numbers; a multi-input adder for adding the outputs of the aligner; and a normalizer for normalizing the output of the multi-input adder and the exponent before normalization.
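The data flow through the multipliers, exponent processing portion, aligner, multi-input adder and normalizer may be pictured with the following software sketch. It is an illustration under stated assumptions, not the disclosed circuit: the 24-bit mantissa width, the function names, the use of signed Python integers in place of a sign processing portion, and the truncation of bits shifted out by the aligner are all assumptions made here, and the inputs are assumed to be exactly representable with a 24-bit mantissa.

```python
import math

MANT_BITS = 24  # assumed mantissa width; not taken from the specification


def decompose(x):
    """Return (sign, mant, exp) such that x == (-1)**sign * mant * 2**exp.

    Assumes x is exactly representable with a MANT_BITS-bit mantissa.
    """
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    frac, exp = math.frexp(abs(x))        # abs(x) == frac * 2**exp, 0.5 <= frac < 1
    mant = int(frac * (1 << MANT_BITS))   # integer mantissa
    return sign, mant, exp - MANT_BITS


def fused_inner_product(a, b):
    """Inner product with one alignment, one multi-input add, one normalization."""
    # Multipliers: one mantissa product per element pair.
    # Exponent processing: the exponent of each product is the sum of exponents.
    products = []
    for x, y in zip(a, b):
        sx, mx, ex = decompose(x)
        sy, my, ey = decompose(y)
        if mx and my:                      # zero products do not take part in alignment
            products.append((sx ^ sy, mx * my, ex + ey))
    if not products:
        return 0.0

    # Exponent processing: the largest product exponent fixes every shift amount
    # and serves as the exponent before normalization.
    max_exp = max(exp for _, _, exp in products)

    # Aligner and multi-input adder.
    total = 0
    for sign, mant, exp in products:
        aligned = mant >> (max_exp - exp)  # bits shifted out are truncated (assumption)
        total += -aligned if sign else aligned

    # Normalizer: a single conversion back to a floating point result.
    return math.ldexp(float(total), max_exp)
```

For example, under these assumptions `fused_inner_product([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])` evaluates to `70.0` with a single normalization step rather than one per multiply-add.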
By performing the multiplications in a plurality of multipliers and the addition of the multiplication results in parallel, the data processor can speed up inner product operations and vector transforming operations. An inner product can be obtained by a single parallel multiplication and addition. Since the inner product is derived by one multiplication and addition, processes such as rounding, which would otherwise be executed at every two-input multiply-add operation, become unnecessary, and the latency of the inner product operation is shortened. Furthermore, the accuracy of the arithmetic operation becomes high. Also, the result of the arithmetic operation does not vary, as it can when rounding is repeated at every multiply-add operation. Furthermore, since the data processor requires only one circuit for normalization and the like, the increase in circuit scale can be restricted as much as possible while floating point inner product operations and vector transforming operations are performed at high speed and with high accuracy.
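The effect of rounding at every multiply-add operation, as opposed to rounding once, can be seen in a small numerical sketch. The helper names and the input values are assumptions chosen for illustration. Single precision is emulated by packing through the `struct` module, and the single-rounding result is approximated by forming the exact sum with `fractions.Fraction` before converting to single precision.

```python
import struct
from fractions import Fraction


def to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_rounded_each_step(a, b):
    """Inner product with single-precision rounding after every multiply-add."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f32(to_f32(x * y) + acc)
    return acc


def dot_rounded_once(a, b):
    """Inner product formed exactly, then rounded to single precision at the end."""
    exact = sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))
    return to_f32(float(exact))


# Hypothetical inputs: two large terms cancel, leaving two small ones.
a = [16777216.0, 1.0, -16777216.0, 1.0]   # 16777216 == 2**24
b = [1.0, 1.0, 1.0, 1.0]

stepwise = dot_rounded_each_step(a, b)    # the first 1.0 is lost when added to 2**24
single = dot_rounded_once(a, b)           # both 1.0 terms survive
```

With these inputs the stepwise result is 1.0 while the exact inner product is 2.0. Reordering the same terms as 16777216, -16777216, 1, 1 makes the stepwise result 2.0, which illustrates how repeated rounding can make the result vary.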
In order to process negative numbers efficiently in the parallel multiplication and addition of floating point numbers, it is preferred that the arithmetic portion further includes a sign processing portion that generates a sign for the multiplication result of each multiplier in response to the signs of the floating point numbers multiplied by that multiplier, that the aligner includes a selector that selectively outputs the result of the alignment shift in inverted or non-inverted form and selects the inverted output when the sign of the multiplication result is negative, and that the multi-input adder generates a carry for adding +1 to each aligner output corresponding to a negative multiplication result, thereby performing two's complement processing for the negative multiplication results.
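The inversion by the selector and the +1 carry supplied by the multi-input adder together form a two's complement negation, as the following sketch shows. The 16-bit adder width and the function name are assumptions made only for illustration.

```python
WIDTH = 16                 # assumed adder width in bits, for illustration only
MASK = (1 << WIDTH) - 1


def add_with_invert_and_carry(terms):
    """Sum (sign, magnitude) pairs using bit inversion plus one carry per negative term.

    sign is 1 for a negative product and 0 otherwise; magnitudes are assumed
    to fit comfortably within WIDTH bits.
    """
    total = 0
    carries = 0
    for sign, magnitude in terms:
        if sign:
            total += (~magnitude) & MASK   # selector outputs the inverted value
            carries += 1                   # adder will supply +1 for this term
        else:
            total += magnitude & MASK      # selector outputs the value unchanged
    total = (total + carries) & MASK       # carries complete the two's complement
    if total & (1 << (WIDTH - 1)):         # interpret the top bit as the sign
        total -= 1 << WIDTH
    return total
```

For example, the terms +100, -30, -50 and +7 sum to 27, and the terms +5 and -9 sum to -4, without any separate negation step ahead of the adder.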
The operand data for the arithmetic portion and the resultant data of the arithmetic operation are temporarily stored in a register file. To enable this without increasing the number of ports of the register file or the number of bits of the register designation field, the register file is given a register bank structure so that a plurality of register banks, or the registers of a single bank, are connected to the respective input terminals of the multipliers.
Paying attention for inner p
Arakawa Fumio
Nakagawa Norio
Totsuka Yonetaro
Yamada Tetsuya
Antonelli Terry Stout & Kraus LLP
Hitachi, Ltd.
Mai Tan V.