Block-normalization in multiply-add floating point sequence with

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Patent


Details

708500, 708578, G06F 738

Patent

active

059999608

DESCRIPTION:

BRIEF SUMMARY
FIELD OF THE INVENTION

The invention relates to an arrangement and a method in a pipeline floating-point processor (FLPT) for improving the performance of a multiply-add sequence.


PRIOR ART

Floating-point processors (FLPTs) are functionally added to a main processor (CPU) to run scientific applications. The multiply-add sequence consists of a multiply instruction followed by an add instruction. The multiplication is performed in three cycles: operand read, building of the partial sums, and addition of the partial sums to form the end result. The addition also needs three cycles: operand read, operand alignment, and addition. All instructions are hardware-coded, so that no micro-instructions are needed.
In the entry-level models (e.g., 9221) of the IBM Enterprise System/9000 (ES/9000), (IBM, Enterprise System/9000 and ES/9000 are trademarks of International Business Machines Corporation), the floating-point processor is tightly coupled to the CPU and carries out all IBM System/390 (System/390 is a trademark of International Business Machines Corporation) floating-point instructions.
FIG. 1 shows the data flow of the above-mentioned floating-point processor, which is described in more detail in the IBM Journal of Research and Development, Vol. 36, No. 4, July 1992. While the CPU is based on a four-stage pipeline, the floating-point processor requires a five-stage pipeline to perform its most-used instructions (e.g., add, subtract, and multiply) in one cycle for double-precision operands (see "ESA/390 Architecture", IBM Form No. G580-1017-00, for more detail).
Fast floating-point units require deep pipelining. This has the disadvantage that wait cycles become necessary when source equals target, that is, when the next instruction needs the result of the current instruction as input data.
The CPU resolves operand addresses, provides operands from the cache, and handles all exceptions for the floating-point processor. The five stages of the pipeline are: instruction fetch, which is executed on the CPU; register fetch; operand realignment; addition; and normalization and register store.
To preserve synchronization with the CPU, a floating-point wait signal is raised whenever a floating-point instruction needs more than one cycle. The CPU then waits until this wait signal disappears before it increments its program counter and starts the next sequential instruction, which is kept on the bus.
Because the IBM System/390 architecture requires that interrupts be precise, a wait condition is also invoked whenever an exception may occur. As can further be seen from FIG. 1, many bypass buses are used to avoid wait cycles when the results of preceding instructions are used. A wait cycle is needed only if the result of one instruction is used immediately by the next sequential instruction (NSI), e.g., when an add instruction follows a multiply instruction whose result has to be augmented by the addend of the add instruction.
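The source-equals-target condition described above can be illustrated with a small model. This is a minimal sketch, not the patented circuit: the `Instr` record, the register numbers, the mnemonics, and the one-cycle penalty are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one target register, several sources."""
    op: str
    target: int
    sources: Tuple[int, ...]


def count_wait_cycles(program: Sequence[Instr], penalty: int = 1) -> int:
    """Count wait cycles caused by a result being used by the very next instruction.

    Assumption: bypass buses hide every other dependency, so only a
    back-to-back dependency costs `penalty` cycles.
    """
    waits = 0
    prev = None
    for instr in program:
        if prev is not None and prev.target in instr.sources:
            waits += penalty
        prev = instr
    return waits


program = [
    Instr("MD", target=0, sources=(0, 2)),  # multiply: FPR0 = FPR0 * FPR2
    Instr("AD", target=0, sources=(0, 4)),  # add uses the product immediately
    Instr("AD", target=6, sources=(6, 4)),  # independent add, no wait
]
print(count_wait_cycles(program))  # prints 1
```

Only the add that immediately follows the multiply and reads its target register is charged a wait cycle in this model; the third instruction does not depend on its predecessor and proceeds without delay.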
The data flow shown in FIG. 1 has two parallel paths for fraction processing: an add path, in which all non-multiply/divide instructions are implemented, and a multiply path specially designed for multiply and divide. The add path has a fixed width of 60 bits and consists of an operand switcher, an aligner, an adder, and a normalizer shifter. Instead of using an aligner on each of the two operand paths, a switcher swaps the operands, thereby saving one aligner. The switcher is also needed for other instructions, so overall circuitry is reduced.
The multiply path consists of a Booth encoder for the 58-bit multiplier, a multiplier macro which forms the sum and carry terms of the 58×60-bit product, and a 92-bit adder which delivers the result product. The sign and exponent paths are adjusted to be consistent with the add path. The exponent path resolves all exception and true zero situations, as defined by the earlier cited IBM System/390 architecture.
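The idea of using one switcher in front of a single aligner can be sketched in software. The following is an illustrative model under stated assumptions: fractions are held as unsigned 60-bit integers, exponents count hexadecimal digits (as in System/390 hexadecimal floating point), and sign handling and post-normalization are left out. The function names are hypothetical and do not come from the patent.

```python
FRACTION_BITS = 60  # add-path width stated in the text
DIGIT_BITS = 4      # assumption: alignment shifts by whole hexadecimal digits


def switch_and_align(exp_a, frac_a, exp_b, frac_b):
    """Swap operands so that only the second one ever needs alignment.

    Returns (exponent, larger-exponent fraction, aligned smaller fraction).
    """
    if exp_a < exp_b:
        # Switcher: route the operand with the smaller exponent to the aligner.
        exp_a, frac_a, exp_b, frac_b = exp_b, frac_b, exp_a, frac_a
    shift = (exp_a - exp_b) * DIGIT_BITS
    # Aligner: shift right; everything is shifted out if the gap is too large.
    aligned = frac_b >> shift if shift < FRACTION_BITS else 0
    return exp_a, frac_a, aligned


def add_magnitudes(exp_a, frac_a, exp_b, frac_b):
    """Add two positive operands; on carry-out, shift right one digit."""
    exp, big, small = switch_and_align(exp_a, frac_a, exp_b, frac_b)
    total = big + small
    if total >> FRACTION_BITS:  # carry out of the 60-bit adder
        total >>= DIGIT_BITS
        exp += 1
    return exp, total


# 0.8 (hex) * 16^0 = 0.5 plus 0.1 (hex) * 16^1 = 1.0 gives 0.18 (hex) * 16^1 = 1.5
exp, frac = add_magnitudes(0, 0x8 << 56, 1, 0x1 << 56)
print(exp, hex(frac))
```

Because the switcher guarantees that the operand with the smaller exponent always arrives on the same side, a single right-shifting aligner suffices in this model, which mirrors the saving described in the text.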
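As an illustration of what a Booth encoder does, the sketch below recodes an unsigned multiplier into radix-4 Booth digits and rebuilds the product from them. The text does not state which Booth radix the hardware uses, so radix 4 is an assumption here, as are the function names; the real multiplier macro produces sum and carry terms in hardware rather than summing integers.

```python
def booth_radix4_digits(multiplier: int, width: int = 58) -> list:
    """Recode an unsigned `width`-bit multiplier into digits in {-2, ..., 2}.

    Digit i is b[2i-1] + b[2i] - 2*b[2i+1], with b[-1] = 0 and zero
    extension above the top bit so the unsigned value is preserved.
    """
    if not 0 <= multiplier < (1 << width):
        raise ValueError("multiplier does not fit in the given width")
    padded = multiplier << 1            # append the implicit b[-1] = 0
    n_digits = width // 2 + 1           # one extra digit for zero extension
    digits = []
    for i in range(n_digits):
        triplet = (padded >> (2 * i)) & 0b111
        low = triplet & 1               # b[2i-1]
        mid = (triplet >> 1) & 1        # b[2i]
        high = (triplet >> 2) & 1       # b[2i+1]
        digits.append(low + mid - 2 * high)
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int = 58) -> int:
    """Sum the partial products selected by the Booth digits."""
    digits = booth_radix4_digits(multiplier, width)
    return sum(d * multiplicand * 4 ** i for i, d in enumerate(digits))


print(booth_radix4_digits(6, width=4))  # prints [-2, 2, 0]
print(booth_multiply(12345, 6789) == 12345 * 6789)  # prints True
```

With radix-4 recoding, a 58-bit multiplier yields 30 digits in this sketch, so roughly half as many partial products need to be summed as with one partial product per multiplier bit.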
The implementation of all other instructions is merged into the add path and mult

REFERENCES:
patent: 4562553 (1985-12-01), Colley et al.
patent: 4758974 (1988-07-01), Fields et al.
patent: 4926369 (1990-05-01), Hokenek et al.
patent: 5053631 (1991-10-01), Perlman et al.
patent: 5058048 (1991-10-01), Gupta et al.
patent: 5126963 (1992-06-01), Fukasawa
patent: 5204825 (1993-04-01), Ng
patent: 5267186 (1993-11-01), Gupta et al.
patent: 5317527 (1994-05-01), Britton et al.
patent: 5408426 (1995-04-01), Takewa et al.
patent: 5424968 (1995-06-01), Okamoto
patent: 5493520 (1996-02-01), Schmookleo et al.
patent: 5633819 (1997-05-01), Brashears et al.
patent: 5668984 (1997-09-01), Taborn et al.
patent: 5732007 (1998-03-01), Grushin et al.
patent: 5764549 (1998-06-01), Bjorksten et al.
patent: 5771183 (1998-06-01), Makineni
patent: 5867407 (1999-02-01), Wolrich et al.
Hokenek et al., "Leading-zero anticipator (LZA) in the IBM RISC System/6000 floating-point execution unit", IBM J. Res. Dev., vol. 34, No. 1, Jan. 1990, pp. 71-77.
Suzuki et al., "A 2.4-ns, 16-bit, 0.5-μm CMOS arithmetic logic unit for microprogrammable video signal processor LSIs", May 9, 1993, Proceedings of the Custom Integrated Circuits Conference, San Diego, pp. 12.04.01-12.04.04, IEEE.
