Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Reexamination Certificate
2002-08-12
2003-11-11
Mai, Tan V (Department: 2124)
Electrical computers: arithmetic processing and calculating
Electrical digital calculating computer
Particular function performed
C708S503000
Reexamination Certificate
active
06647404
ABSTRACT:
FIELD OF THE INVENTION
The invention relates to the field of multiplier hardware for digital computer systems. In particular it relates to multipliers that use multiple passes of single precision hardware to perform double precision multiplication.
BACKGROUND OF THE INVENTION
Array multipliers, such as Wallace-tree multipliers with and without Booth recoding, require a gate count that grows proportionally to the square of the number of bits of each operand. Hence a 53-bit multiplier requires close to three times the area as does a 32-bit multiplier array.
The number of logic levels in the worst-case data path through a 53-bit multiplier array is greater than the number of logic levels in the worst-case data path of a 32-bit multiplier array. This means that, all other factors being equal, a 53-bit multiplier is slower than a 32-bit multiplier array.
Most modern processors have 32-bit integer multiply instructions. Several common specifications for computer floating point operations, including the IEEE 754 and Digital VAX specifications, require a 24-bit (including one hidden bit) mantissa for single precision floating point, and a 53-bit mantissa for double precision floating point. These machines therefore need to be able to quickly multiply pairs of 32 bit operands, pairs of 24-bit operands, and pairs of 53-bit operands in performing their integer and floating point arithmetic. These machines also must be able to quickly add 32-bit integer operands, as well as 53 bit and 24 bit denormalized mantissas.
It is known that a 32-bit hardware multiplier can be used to perform a 53-bit unsigned multiply by performing a sequence of four multiply operations, each generating a partial product, and summing the partial products. A way that this may be done is as follows:
[Bits 52 . . . 32 = A] [Bits 31 . . . 0 = B]
Break down the multiplicand
[Bits 52 . . . 32 = C] [Bits 31 . . . 0 = D]
Break down the multiplier
T1 = B * D
Compute partial products
T2 = D * A * 2**32
and perform shift
T3 = C * B * 2**32
T4 = A * C * 2**64
T6 = T1 + T2 + T3
T5 = T6 + T4
Sum the partial products,
extending T1, T2, T3, and
T4 as required.
Product = top 53 bits (T5)
Drop the least significant bits.
T5 is nominally 106 bits wide. When performing a floating point operation, it is necessary that the product bits be aligned into the correct bit positions for the mantissa of the floating point result. If T1, T2, T3, and T4 are summed in an adder, a left shift of 12 bit positions of T4 relative to T2 and T3 is required for proper alignment.
Some computer programs require more single-precision floating point and integer multiply operations than double-precision floating point operations. Machines running these programs can provide fast single precision and integer multiply operations with slower but adequate double precision multiply using a sequence of four multiply operations in each double-precision floating-point multiply instruction.
The Booth recoding algorithm is commonly used in large multiplier arrays to hold down the number of partial products that must be added during a multiply. While a classic thirty two bit Wallace-tree binary array multiplier generates and adds thirty two partial products, an array multiplier using two-bit Booth recoding need generate and adds only half as many partial products, although the logic required for generating each partial product is somewhat more complex. Reducing the number of partial products not only can reduce the number of gates in the array, but produces a faster multiplier by reducing the number of gate delays in the worst-case critical path through the array.
In the basic binary two-bit Booth algorithm, a pair of bits of the multiplier are considered for each partial product. If those bits are zero, the partial product is zero. If those bits are one, the partial product is the multiplicand. If those bits are two, the partial product is a single-bit-shifted (multiplied by two) multiplicand. If those bits are three, the partial product is minus the multiplicand, with one added in the next partial product—giving a net partial product term of four times the multiplicand minus the multiplicand equaling three times the multiplicand.
A common version of the Booth multiply is the modified booth recoding multiply. In this version, the multiplier is recoded from a binary number, where each digit is a 1 or a 0, to a number having fewer digits each in the range {−2, −1, 0, +1, +2}. Each bit pair {B
n+1
,B
n
} of the multiplier is transformed by a three-input digit-encoder circuit that considers bit {B
n−1
} of next lower significance to the bit pair being encoded according to the formula:
+2 if (B
n−1
& B
n
& ~B
n+1
)
110
+1 if ((B
n−1
xor B
n
) & ~B
n+1
)
010|100
−1 if ((B
n−1
xor B
n
) & B
n+1
101|011
−2 if (~B
n−1
& ~B
n
& B
n+1
)
001
0 otherwise
000|111
The partial products of the multiplier (booth_digit*multiplicand) are generated by taking a zero, a left shift of the multiplicand (+2), the multiplicand (+1), the negation of the multiplicand (−1) or the negation of a left shift of the multiplicand (−2). One such partial product is generated for each bit pair of the multiplier, these partial products are shifted appropriately and summed to generate the product. For modern array multipliers, the product is generated by an array of carry-save adders of structure similar to a Wallace tree.
In the Booth recoding circuit described above, a term Bn−1 is used. In all bit-pairs except the least significant bit pair, this bit is the most significant multiplier bit of the bit pair of less significance than the bit pair being encoded. In the least significant bit pair, this bit is normally a zero. In multipliers that perform a multiply of high precision from a sequence of lower-precision multiplies, it is known that this bit may be used as a carry input to the higher digits of the multiply.
SUMMARY OF THE INVENTION
A multiplier has been constructed that performs a high precision multiply by performing a sequence of four lower-precision multiplies, each of the four generating a partial product, and summing the partial products. This multiplier breaks down the a multiply into a sequence of four multiply operations.
[Bits 52 . . . 32 = A] [Bits 31 . . . 0 = B]
Break down the multiplicand
[Bits 52 . . . 32 = C] [Bits 31 . . . 0 = D]
Break down the multiplier
P1 = (B * D) right shifted by 64 bits
Compute partial products
P2 = (D * A) right shifted by 32 bits
and perform shift
P3 = (C * B) right shifted by 32 bits
P4 = (A left shifted by 6 bits) *
(C left shifted by 6 bits)
Product = (P1 + P2 + P3 + P4)
Sum the partial products.
The multiplier performs three initial cycles to compute the partial products P1, P2, and P3.
The left-shift-by-12 bit positions required in the example to align the resulting 53 bits of product is accomplished by the left shift of each of the A and C operands 6 bit positions, this places the significant bits of P4 at the appropriate inputs of the adder used to sum the partial products. The top 22 bits of partial sums P3, P2, and the effect of P1, are positioned in the 22 least significant bits of the adder by shift operations.
The multiplier used for all four cycles of the multiply is Booth encoded. Each two-bit booth recoder cell requires three inputs, (Bn+1, Bn, Bn−1), so that an add of three times the multiplicand can be represented as a subtract of one times the multiplicand with an add of four times the multiplicand. It may therefore be necessary to force an extra addition when the multiplier is split into fields as in this design.
To get the correct result using the booth encoded multiplier, two techniques have been used.
Chng Choon Ping
Tzeng Tzungren Allan
Gunnison McKay & Hodgson, L.L.P.
Mai Tan V
McKay Philip J.
Sun Microsystems Inc.
LandOfFree
Double precision floating point multiplier having a 32-bit... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Double precision floating point multiplier having a 32-bit..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Double precision floating point multiplier having a 32-bit... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3170647