Method and apparatus for performing division and square root...

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Reexamination Certificate

Rate now

[ 0.00 ] – not rated yet Voters 0 Comments 0

Details Method and apparatus for performing division and square root... Method and apparatus for performing division and square root...

: 2001-06-07
: 2004-08-24
: Mai, Tan V. (Department: 2124)
: Electrical computers: arithmetic processing and calculating
: Electrical digital calculating computer
: Particular function performed

: C708S654000
: Reexamination Certificate
: active
: 06782405
: ABSTRACT:

TECHNICAL FIELD OF THE INVENTION
This invention relates in general to the field of integrated data processing, and more specifically to an improved method and apparatus for performing division and square root functions in a system using a multiplier and a multipartite table.
BACKGROUND OF THE INVENTION
The arithmetic unit is one of the most important components of any integrated data processing system. Arithmetic units perform a wide variety of mathematical functions upon operands that are transmitted from other portions of an integrated system. The basic addition, subtraction, and multiplication functions are quickly and efficiently performed in arithmetic units today. Known techniques for performing division and square root functions, however, have not been completely satisfactory with respect to speed and efficiency.
SUMMARY OF THE INVENTION
The speed of performing basic mathematical functions is significant to the efficiency of a data processing system, since the system may need to perform basic mathematical functions many thousands of times per second for a variety of common applications. Multimedia applications typically require relatively low precision division and square root operations, while scientific applications require much higher precision for these operations. The IEEE floating point standard specifies accuracy of a 24 bit fraction in a 32 bit word for single precision and a 53 bit fraction in a 64 bit word for double precision for all basic arithmetic operations including division and square root. Reducing the number of cycles needed to perform one division or square root operation reduces the overall required number of cycles, increasing efficiency.
Known methods for performing division and square root operations produce a single precision result after several iterations and a double precision result after additional iterations. These methods typically incur an initial delay for generating a table lookup operand in addition to a sequence of delays due to dependent additions and/or multiplications between successive iterations. Additive methods perform an iteration in one or two cycles, but require tens of cycles for single and double precision results. More aggressive multiplicative methods employ only a handful of iterations, but operation dependencies result in three to five cycles of delay per iteration.
A common method of performing division and square root operations employs SRT procedures in dedicated hardware. This method is used in, for example, the Pentium family of microprocessors manufactured by Intel Corporation. The procedure generates digits iteratively in a small radix similar to long division. The method, due to iterative dependencies, is extremely slow, typically taking some fifteen cycles for single precision and some thirty cycles for double precision to complete a division operation.
Pipelined multipliers can be used to reduce the number of required cycles. Pipelined multipliers complete a multiplication to double precision in three to five cycles. More importantly, pipelined multipliers typically allow a new multiply operation to start every cycle in a procedure analogous to an assembly line. Known division methods, however, require four to ten dependent multiplications for the highest precision result, where a second multiply must wait for the first multiply to finish and feed back the result as a new operand for the subsequent multiply operation. An independent multiplication, in contrast, does not need to wait for the result of a previous multiply operation and can begin at the next cycle. The feedback latency of three to five cycles is the delay time cost for dependent multiplies, compared with just one cycle for independent multiplies.
The convergence methods of division and square root are preferred in a pipelined unit, because they require fewer dependent multiplications. Known convergence methods for division and square root start from a table lookup approximation of the reciprocal of the denominator (for division) or of the square root reciprocal of the operand (for square root) and perform multiplications to refine the approximation, doubling the accuracy with each iteration.
A known convergence technique for division uses a single lookup table and employs three iterations to refine the approximation of a quotient x/y. First, a table lookup value t·1/y is determined, then the numerator x and the denominator y of quotient q=x/y are scaled by the table lookup value, yielding:
x
y
·
t
t
=
q
7.5
(
1
-
ϵ
)
where *&egr;* # 2
−7.5
, and q
n
is an approximation of q=x/y to an n-bit accuracy. Here, the accuracy is determined by the table lookup value. Note that the multiplication operation is delayed by the latency of the table lookup value. Second, the result of the first approximation is scaled by (1+&egr;):
q
7.5
(
1
-
ϵ
)
·
(
1
+
ϵ
)
(
1
+
ϵ
)
=
q
15
(
1
-
ϵ
2
)
where (1+&egr;) is the 2's complement of the scaled y value from the previous iteration, that is, (1+&egr;)=2−yt. In the third and final iterations, the approximations are scaled two more times:
q
15
(
1
-
ϵ
2
)
⁢
·
(
1
+
ϵ
2
)
(
1
+
ϵ
2
)
=
⁢
q
30
(
1
-
ϵ
4
)
·
(
1
+
ϵ
4
)
(
1
+
ϵ
4
)
=
⁢
q
60
(
1
-
ϵ
8
)
⁢

where (1+&egr;
2
) and (1+&egr;
4
) are the 2's complement of the scaled y value from the previous iteration, that is:
(1+&egr;
2
)=2
−[yt
(2
−yt
)]
and
(1+&egr;
4
)=2
−[yt
(2
−yt)][
2
−yt(
2
−yt)]
Note that the accuracy, “n”, doubles with each iteration.
The value q
30
determined after the third iteration is sufficient for single precision. The final denominator scaling (1−&egr;
2
)×(1+&egr;
2
)=(1−&egr;
4
) for single precision or (1−&egr;
4
)×(1+&egr;
4
)=(1−&egr;
8
) for double precision need not be computed, but this does not lessen the number of dependent multiplies. Well known rounding procedures are then applied to q
30
or q
60
to determine the final result with several additional cycles delay.
Convergence division implementations typically employ three to four iterations to obtain the desired accuracy, starting with a fairly small table with, for example, a seven to nine bit accuracy. Use of larger tables becomes impractical as the table size grows exponentially with the number of bits. Table computations, such as linear interpolation, allow a larger starting accuracy and a reduced number of iterations at the cost of the multiple cycle delay of the table lookup computations. Known techniques for conversion division still take 15 to 30 cycles due to the succession of pipeline delays for dependent multiplication.
Therefore, a need has arisen for a system that performs division and square root functions and yields approximate quotients and square roots of accuracy sufficient for known single and double precision rounding procedures, but which substantially reduces the number of cycles lost to dependencies between the table lookup and successive multiplications.
In accordance with the present invention, a method and system for performing division and square root functions are provided which substantially eliminate or reduce the disadvantages and problems associated with prior arithmetic techniques used to perform division and square root functions.
The division and square root systems of the present invention comprise a multiplier. The systems also comprise a multipartite table system, a folding inverter, and a complement inverter, each coupled to the multiplier. More specifically, the multipartite lookup table may comprise a tripartite lookup table. The present invention may further comprise data links coupling the parts of the system and a rounder that rounds the final approximation.
According to one embodiment of the present invention, the division an

Affiliated with

Iordache Cristina S.

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Matula David W.

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Also associated with

Baker & Botts L.L.P.

Law Firm

[ 0.00 ] – not rated yet Voters 0 Comments 0

Mai Tan V.

Examiner

[ 0.00 ] – not rated yet Voters 0 Comments 0

Southern Methodist University

Corporate Assignee

[ 0.00 ] – not rated yet Voters 0 Comments 0

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and apparatus for performing division and square root... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and apparatus for performing division and square root..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for performing division and square root... will most certainly appreciate the feedback.

Rate now

Comments { 0 }

Profile ID: LFUS-PAI-O-3334388

All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.

Canada

Charities
Companies
MP Candidates
Patents
Employee Salary Disclosure

World

Places of the World
Scientific Papers

United States

Banks
Companies
Counties
Patents
Employee Salary Disclosure