Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Reexamination Certificate
2001-04-23
2004-08-31
Ngo, Chuong Dinh (Department: 2124)
C708S204000
active
06785701
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a floating-point operator, and more particularly to an apparatus and method of performing conversion and IEEE rounding in parallel for a floating-point arithmetic logic unit (ALU) using a simultaneous rounding method (SRM).
2. Description of the Related Art
Generally, a floating-point operator is essential in graphics accelerators, digital signal processors (DSPs), and computers requiring high performance, and most floating-point operators are provided with floating-point adders, multipliers, dividers, and square-root extractors.
Since operations using the floating-point adder account for the largest share of the work done by the floating-point operator, the floating-point adder is its most important part and has a great effect on overall floating-point performance.
The floating-point ALU operation includes a comparison, conversion, round, and other simple logical operations in addition to the addition/subtraction performed in the floating-point adder, and these operations are implemented by adding separate hardware to the floating-point adder.
In particular, the conversion and round (CR) operation, which comprises two conversion operations and a round operation, has the most complicated processing structure among the ALU operations and is difficult to implement.
With reference to IEEE Std 754-1985, “IEEE standard for binary floating-point arithmetic,” IEEE, 1985, the CR operation is broadly divided into conversion (ftoi) of a floating-point number to an integer, conversion (itof) of an integer to a floating-point number, and a round operation (rnd) on a floating-point number.
As explained in D. Goldberg, “Computer arithmetic,” Appendix A of J. L. Hennessy and D. A. Patterson, Computer architecture: a quantitative approach, Morgan Kaufmann Publishers Inc., 1996, the fraction processing in floating-point addition is composed of a first stage of alignment, a second stage of addition, a third stage of normalization, and a fourth stage of rounding.
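The four stages named above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the circuit described in this document: operands are positive, normalized, and given as a hypothetical (exponent, 24-bit significand) pair, three extra low-order bits are kept for rounding, and round-to-nearest (ties to even) is assumed. The function name `fp_add_positive` is invented for this sketch.

```python
def fp_add_positive(ea, ma, eb, mb, width=24):
    """Add two positive normalized numbers m * 2**(e - (width - 1)).

    Illustrative model only: no signs, zeros, subnormals, or infinities.
    Returns (exponent, significand) of the rounded sum.
    """
    # Stage 1: alignment. Make (ea, ma) the operand with the larger exponent,
    # then shift the other significand right by the exponent difference.
    if ea < eb:
        ea, ma, eb, mb = eb, mb, ea, ma
    d = ea - eb
    ma_ext = ma << 3                      # three extra bits kept for rounding
    mb_ext = mb << 3
    lost = mb_ext & ((1 << d) - 1)        # bits shifted out beyond the extras
    mb_ext >>= d
    if lost:
        mb_ext |= 1                       # fold lost bits into the lowest bit

    # Stage 2: addition of the aligned significands.
    total = ma_ext + mb_ext
    exp = ea

    # Stage 3: normalization. A carry out needs a one-bit right shift.
    if total >> (width + 3):
        total = (total >> 1) | (total & 1)
        exp += 1

    # Stage 4: rounding (round-to-nearest, ties to even, assumed here).
    lsb = (total >> 3) & 1
    g, r, sy = (total >> 2) & 1, (total >> 1) & 1, total & 1
    result = total >> 3
    if g & (r | sy | lsb):
        result += 1
        if result == 1 << width:          # rounding overflowed the significand,
            result >>= 1                  # so the result is normalized again
            exp += 1
    return exp, result


print(fp_add_positive(0, 1 << 23, 0, 1 << 23))   # 1.0 + 1.0 -> (1, 8388608)
```

In this sequential model the rounding increment is a second addition after normalization, and it can overflow and force one more normalization step.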
Recently, various research results on high-speed floating-point addition have been published: [1] W. C. Park, S. W. Lee, O. Y. Kown, T. D. Han, and S. D. Kim, “Floating-point adder/subtractor performing IEEE rounding and addition/subtraction in parallel,” IEICE Trans. Information and Systems, vol. E79-D, no. 4, pp. 297-305, April 1996; [2] W. C. Park, T. D. Han, and S. D. Kim, “Efficient simultaneous rounding method removing sticky-bit from critical path for floating-point addition,” in Proceedings of AP-ASIC, pp. 223-236, August 2000; [3] A. Beaumont-Smith, N. Burgess, S. Lefrere, and C. C. Lim, “Reduced latency IEEE floating-point standard adder architectures,” in Proceedings of IEEE 14th Symposium on Computer Arithmetic, April 1999; [4] N. Quach and M. Flynn, “Design and implementation of the SNAP floating-point adder,” Technical Report CSL-TR-91-501, Stanford University, December 1991; and [5] Peter-M. Seider and G. Even, “How many logic levels does floating-point addition require?”, in Proceedings of IEEE 14th Symposium on Computer Arithmetic, April 1999. According to this research, the performance of the operator can be improved by performing the fourth stage of rounding prior to the third stage of normalization.
This means that the fourth stage of rounding is performed in the same pipeline stage as the second stage of addition; in the following description, this is called a simultaneous rounding method (SRM).
When such an SRM is used, a separate adder for the rounding operation is not required, and the re-normalization caused by an overflow during a conventional rounding operation does not occur, thereby improving the performance of the operator.
In particular, since the processing structures disclosed in documents [1], [2], and [3] are simple and have greatly reduced complexity, they are efficient in hardware size and performance.
Now, the above-described rounding operation and the CR processes of a general floating-point adder and an SRM type floating-point adder will be explained one by one with reference to the accompanying drawings.
Regarding the rounding operation, the IEEE standard classifies floating-point numbers into a single precision type of 32 bits and a double precision type of 64 bits.
The single precision type is composed of a sign part of 1 bit, an exponent part of 8 bits, and a fraction part of 23 bits. In the normalized form of the fraction part, the most significant bit (MSB) is “1”, and this MSB is omitted from the floating-point representation. This MSB is called a hidden bit.
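As a minimal sketch of this layout, the following Python snippet splits a single precision value into its sign, exponent, and fraction fields and restores the hidden bit. The function name `decode_single` is hypothetical, and the handling of exponent values 0 and 255 (where no hidden bit is assumed) follows the IEEE 754 encoding rather than anything stated in this passage.

```python
import struct


def decode_single(x: float):
    """Return (sign, exponent, fraction, significand) of x as single precision."""
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                    # 1 sign bit
    exponent = (bits >> 23) & 0xFF       # 8-bit biased exponent
    fraction = bits & 0x7FFFFF           # 23 stored fraction bits
    # Normalized numbers carry an implicit leading 1 (the hidden bit).
    hidden = 1 if 0 < exponent < 255 else 0
    significand = (hidden << 23) | fraction
    return sign, exponent, fraction, significand


print(decode_single(1.5))   # (0, 127, 4194304, 12582912)
```

For 1.5 the stored fraction holds only the bits after the binary point, and the 24-bit significand is obtained by prepending the hidden bit.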
Specifically, in the alignment process of the floating-point addition, the fraction part of whichever of the two input floating-point numbers has the smaller (or equal) exponent part is shifted in the least significant bit (LSB) direction by the difference between the two exponent parts; for proper rounding under the IEEE standard, information about the fraction bits that are lost must be retained.
For this purpose, 3 bits, i.e., a guard (G) bit, round (R) bit, and sticky (Sy) bit, are defined as proposed in D. Goldberg, “Computer arithmetic,” Appendix A of J. L. Hennessy and D. A. Patterson, Computer architecture: a quantitative approach, Morgan Kaufmann Publishers Inc., 1996.
Here, the guard (G) bit has a weight smaller than the LSB and is the MSB of the fraction part to be lost. The round (R) bit is the next bit, and the sticky (Sy) bit is the value obtained by ORing all bits of the part to be lost except for the guard (G) bit and the round (R) bit.
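The definitions of the three bits can be illustrated with a short Python sketch. The function name `align_with_grs` and the use of plain integers for the fraction are assumptions made for illustration; hardware would derive these bits inside the shifter rather than with masks.

```python
def align_with_grs(significand: int, shift: int):
    """Shift right by `shift` bits and return (kept, G, R, Sy)."""
    if shift == 0:
        return significand, 0, 0, 0
    kept = significand >> shift
    lost = significand & ((1 << shift) - 1)      # the bits shifted out
    g = (lost >> (shift - 1)) & 1                # MSB of the lost part
    r = (lost >> (shift - 2)) & 1 if shift >= 2 else 0   # next bit down
    rest_mask = (1 << (shift - 2)) - 1 if shift > 2 else 0
    sy = 1 if lost & rest_mask else 0            # OR of all remaining lost bits
    return kept, g, r, sy


print(align_with_grs(0b10110110, 4))   # (11, 0, 1, 1)
```

In the example the lost part is 0110: its MSB gives G = 0, the next bit gives R = 1, and the remaining bits 10 OR together to Sy = 1.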
The above-described IEEE standard provides four rounding modes: round-to-nearest, round-to-zero, round-to-positive-infinity, and round-to-negative-infinity.
Equation 1 represents the rounding result for each rounding mode, given as input the LSB, guard (G), round (R), and sticky (Sy) bits of the fraction part produced after alignment, addition, and normalization in the floating-point addition. Here, a result of “0” from Equation 1 means round down, and a result of “1” means round up.
$\mathrm{Round}_{\mathrm{mode}}(\mathrm{LSB}, G, R, Sy)$ [Equation 1]
In more detail, in the case of round-to-nearest, if the guard (G) bit is “0”, the result is rounded down, while if the guard (G) bit is “1” and the round (R) bit or the sticky (Sy) bit is “1”, the result is rounded up. If the guard bit is “1”, the round bit and the sticky bit are both “0”, and the LSB is “1”, the result is rounded up; otherwise, it is rounded down.
In the case of round-to-zero, the result is always rounded down.
In the case of round-to-positive-infinity, if the sign (S) bit is “0” and the guard (G) bit, round (R) bit, or sticky (Sy) bit is “1”, the result is rounded up. Otherwise, it is rounded down.
In the case of round-to-negative-infinity, if the sign (S) bit is “1” and the guard (G) bit, round (R) bit, or sticky (Sy) bit is “1”, the result is rounded up. Otherwise, it is rounded down.
The above-described four rounding types can be reduced to three modes according to the sign, that is, round-to-nearest, round-to-zero, and round-to-infinity. Hereinafter, “mode” in Equation 1 is taken to denote one of these three modes.
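The rules above can be written as a small decision function. The following Python sketch is one reading of Equation 1 under stated assumptions: the function name `round_decision` and the mode strings are hypothetical, and the two directed modes are folded into round-to-infinity or round-to-zero using the sign bit, as the text describes.

```python
def round_decision(mode: str, sign: int, lsb: int, g: int, r: int, sy: int) -> int:
    """Return 1 for round up and 0 for round down."""
    # Fold the four IEEE modes into three, using the sign bit.
    if mode == "positive-infinity":
        mode = "infinity" if sign == 0 else "zero"
    elif mode == "negative-infinity":
        mode = "infinity" if sign == 1 else "zero"

    if mode == "nearest":
        # Up when G is 1 and any of R, Sy, or LSB is 1 (ties go to even).
        return g & (r | sy | lsb)
    if mode == "infinity":
        # Up whenever any discarded bit is 1.
        return g | r | sy
    if mode == "zero":
        return 0
    raise ValueError(f"unknown rounding mode: {mode}")


print(round_decision("nearest", 0, 1, 1, 0, 0))            # 1 (tie, odd LSB)
print(round_decision("positive-infinity", 1, 0, 1, 1, 1))  # 0 (negative value)
```

After the fold, the sign is no longer needed, which is why the three-mode form can take only the LSB, G, R, and Sy bits as input.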
Now, the process of the general floating-point adder will be explained in detail with reference to the accompanying drawings.
FIG. 1 is a view illustrating pipelines of a conventional floating-point adder.
Exponent parts of two input floating-point numbers are defined as Ea and Eb, and fraction parts thereof are defined as Ma and Mb, respectively.
At this time, since the exponent part is processed by a simple
Han Tack Don
Park Woo Chan
Ngo Chuong Dinh
Sheridan Ross PC
Yonsei University