Floating point exception handling in pipelined processor...

Electrical computers and digital processing systems: processing – Processing control – Branching

Reexamination Certificate

Details

C703S027000, C712S222000, C712S227000

active

06826682

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer systems and, more particularly, to methods for accelerating floating point operations in computer systems.
2. History of the Prior Art
Recently, a new microprocessor was developed which combines a simple but very fast host processor (called a “morph host”) and software (referred to as “code morphing software”) to execute application programs designed for a “target” processor having an instruction set different than the instruction set of the morph host processor. The morph host processor executes the code morphing software to translate the application programs into morph host processor instructions which accomplish the purpose of the original target software. As the target instructions are translated, the new host instructions are both executed and stored in a translation buffer where they may be accessed without further translation. Although the initial translation of a program is slow, once translated, many of the steps normally required for hardware to execute a program are eliminated. The new microprocessor has demonstrated that a simple fast processor designed to expend little power is able to execute translated “target” instructions at a rate equivalent to that of the “target” processor for which the programs were designed.
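The translate-once, reuse-afterwards behavior described above can be pictured as a cache keyed by target address. The following is only a minimal sketch under that assumption; the names `TranslationBuffer`, `toy_translate`, and the instruction strings are hypothetical and do not come from the patent.

```python
class TranslationBuffer:
    """Toy model: caches translated host code keyed by target address."""

    def __init__(self, translate):
        self._translate = translate   # slow path: target code -> host code
        self._cache = {}              # target address -> translated host code
        self.translations = 0         # number of times the slow path ran

    def fetch(self, address, target_code):
        # Translate only on the first visit; later visits reuse the stored result.
        if address not in self._cache:
            self._cache[address] = self._translate(target_code)
            self.translations += 1
        return self._cache[address]


def toy_translate(target_code):
    # Hypothetical stand-in for code morphing software.
    return [f"host:{insn}" for insn in target_code]


tbuf = TranslationBuffer(toy_translate)
block = ["fadd", "fmul"]
first = tbuf.fetch(0x1000, block)    # slow: translated and stored
second = tbuf.fetch(0x1000, block)   # fast: served from the buffer
```

The second fetch returns the stored translation without invoking the translator again, which is the sense in which the initial cost is paid only once.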
In order to run programs designed for other processors at a rapid rate, the morph host processor includes a number of hardware enhancements. One of these enhancements is a gated store buffer which resides between the host processor and the translation buffer. A second enhancement is a set of host registers (in addition to the normal working registers) which store the known state of the target processor as it existed prior to any sequence of target instructions being translated. Memory stores generated while sequences of translated morph host instructions execute are placed in the gated store buffer. If the morph host instructions execute without raising an exception, the target state at the beginning of the sequence of instructions is updated to the target state at the point at which the sequence completed, and the memory stores are committed to memory.
On the other hand, if an exception is raised during execution of the morph host instructions, execution stops, the host processor rolls back operation to the last point at which target state was known to be correct, and execution proceeds from that point utilizing a process (an interpreter in one embodiment) which accomplishes step-by-step translation of each of the target instructions. This process essentially single steps through the execution of target instructions. As each target instruction is translated and executed, the state of the target processor is brought up to date. The process continues during the translation and execution of the remainder of the sequence of target instructions until the exception reoccurs. When the exception reoccurs, target state will be correct for handling the exception. The use of these hardware enhancements with the rollback process allows exceptions to be accurately handled while dynamic translation of target instructions is taking place. The improved processor is described in detail in U.S. Pat. No. 5,958,061, entitled “Combining Hardware And Software to Provide An Improved Microprocessor,” R. Cmelik et al., issued Feb. 29, 2000, and assigned to the assignee of the present invention.
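The commit and rollback behavior described in the two preceding paragraphs can be sketched in software. This is a simplified model under stated assumptions, not the hardware itself: the class `MorphHostModel`, its attribute names, and the use of a Python `ArithmeticError` to stand in for a host exception are all hypothetical. The interpreter that single steps after a rollback is omitted.

```python
class MorphHostModel:
    """Toy model of gated stores plus saved target state (hypothetical names)."""

    def __init__(self):
        self.memory = {}            # committed memory
        self.committed_state = {}   # last known-correct target registers
        self.working_state = {}     # registers updated speculatively
        self.gated_stores = []      # stores held back until commit

    def store(self, address, value):
        # Stores do not reach memory directly; they wait in the gated buffer.
        self.gated_stores.append((address, value))

    def commit(self):
        # Sequence finished cleanly: release stores and advance target state.
        for address, value in self.gated_stores:
            self.memory[address] = value
        self.gated_stores.clear()
        self.committed_state = dict(self.working_state)

    def rollback(self):
        # Exception raised: discard stores and restore the saved target state.
        self.gated_stores.clear()
        self.working_state = dict(self.committed_state)

    def run(self, sequence):
        """Run translated steps; return True on commit, False on rollback."""
        try:
            for step in sequence:
                step(self)
        except ArithmeticError:
            self.rollback()
            return False
        self.commit()
        return True


def good(host):
    host.working_state["eax"] = 1
    host.store(0x10, 42)


def faulting(host):
    host.working_state["eax"] = 99
    host.store(0x20, 7)
    raise ZeroDivisionError("simulated floating point exception")


host = MorphHostModel()
ok_first = host.run([good])       # commits register and memory updates
ok_second = host.run([faulting])  # rolls back; nothing from it is visible
```

After the second run, memory and registers look exactly as they did after the first commit, which is the point from which step-by-step execution would resume.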
A problem which has occurred with the new processor relates to the execution of floating point operations translated from instructions originally programmed for a target processor. Floating point processors execute some mathematical operations quite rapidly. For example, multiplication of floating point values requires simply adding exponents consisting of zeroes and ones and multiplying the mantissas by shifting a binary point. On the other hand, addition of mantissas requires a pre-normalization step of aligning binary points, an addition, and finally a post-normalization step of realigning the binary point. Consequently, most floating point operations require a number of clock cycles and are therefore somewhat slow. In fact, all operations other than square root and division require four clock cycles to execute utilizing the new microprocessor. Division and square root operations take an indeterminate amount of time and may require halting the operations of the processor until they complete.
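The difference in work between multiplication and addition can be illustrated by splitting values into a mantissa and a binary exponent. The sketch below uses Python's `math.frexp` and `math.ldexp` on ordinary floats as a stand-in for the hardware datapath; the function names `fp_multiply` and `fp_add` are hypothetical, and rounding, overflow, and special values are ignored.

```python
import math


def fp_multiply(a, b):
    # Multiply mantissas and add exponents; no alignment step is needed.
    ma, ea = math.frexp(a)
    mb, eb = math.frexp(b)
    return math.ldexp(ma * mb, ea + eb)


def fp_add(a, b):
    ma, ea = math.frexp(a)
    mb, eb = math.frexp(b)
    # Pre-normalization: align both mantissas to the larger exponent.
    e = max(ea, eb)
    ma_aligned = math.ldexp(ma, ea - e)
    mb_aligned = math.ldexp(mb, eb - e)
    # Addition of the aligned mantissas.
    total = ma_aligned + mb_aligned
    # Post-normalization: bring the sum back into the standard mantissa range.
    m, shift = math.frexp(total)
    return math.ldexp(m, e + shift)


product = fp_multiply(6.0, 0.75)
total = fp_add(6.0, 0.75)
```

For these inputs every intermediate value is exactly representable, so the results match ordinary floating point arithmetic.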
Because floating point operations require a number of clock cycles to execute, most modern floating point processors (including the floating point processor unit of the new microprocessor) pipeline floating point operations. Pipelining executes a number of floating point operations in parallel and usually starts a new floating point operation on each succeeding clock cycle. The effect of running operations in parallel which start on sequential clocks is to produce one floating point result for each clock cycle during most sequences of floating point operations.
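The throughput benefit of pipelining follows from simple cycle counting. The sketch below assumes the four-cycle latency stated earlier and one new operation issued per clock; the function names are hypothetical.

```python
LATENCY = 4  # cycles per floating point operation, as stated in the text


def cycles_unpipelined(n_ops, latency=LATENCY):
    # Each operation must finish before the next one starts.
    return n_ops * latency


def cycles_pipelined(n_ops, latency=LATENCY):
    # The first result takes the full latency; each later operation
    # adds one cycle because a new operation starts on every clock.
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1)


serial = cycles_unpipelined(10)
overlapped = cycles_pipelined(10)
```

Under these assumptions ten operations take 40 cycles one at a time and 13 cycles when pipelined, approaching one result per cycle as the sequence grows.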
Modern floating point processors not only pipeline operations but also attempt to reorder floating point operations to attain even greater speed. However, floating point operations are difficult to reorder. Not only do floating point processors produce a numerical result as output for each operation, they also typically provide a number of status bits which indicate whether the result should raise an exception. These status bits indicate whether an operation caused an overflow or an underflow, whether an operation was invalid, whether an operand was not in a normal number format (i.e., was “denormal”), whether the operation attempted a divide by zero, and whether the result is inexact. Each of these conditions could require exceptional handling in order for the result to be correct. A user may arm or disarm individual exceptions to produce the results desired. The precise exceptions are defined by the IEEE 754 floating point standard.
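Status bits and armed or disarmed exceptions can be observed in software with Python's standard `decimal` module, whose context carries sticky flags and per-signal traps. This is only an analogy: `decimal` performs decimal rather than binary arithmetic and is not the floating point unit discussed here, but its signals (`Inexact`, `DivisionByZero`, `Overflow`, `Underflow`, `InvalidOperation`, `Subnormal`) correspond loosely to the conditions listed above.

```python
import decimal

# A context with every exception disarmed: operations set flags but never raise.
ctx = decimal.Context(prec=6, traps=[])
one, three, zero = decimal.Decimal(1), decimal.Decimal(3), decimal.Decimal(0)

quotient = ctx.divide(one, three)            # result must be rounded
inexact_seen = ctx.flags[decimal.Inexact]    # status bit recorded with the result

ctx.clear_flags()
infinite = ctx.divide(one, zero)             # disarmed: returns Infinity
div_by_zero_seen = ctx.flags[decimal.DivisionByZero]

# Arm the divide-by-zero exception: the same operation now raises.
ctx.traps[decimal.DivisionByZero] = True
try:
    ctx.divide(one, zero)
    trapped = False
except decimal.DivisionByZero:
    trapped = True
```

The same division either sets a flag silently or raises, depending only on whether the corresponding exception is armed.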
When translating target instructions designed for execution by a target processor, it is necessary to provide instructions which produce the same results as would the target processor. For example, if the target instructions are designed to be executed by an Intel X86 processor, then the translated instructions should produce the same results as would be produced by an X86 processor. The early Intel X86 processors (more particularly, the X87 floating point unit) handled floating point operations one at a time and generated both a result and status bits for that result immediately after each individual floating point operation. X86 processors have continued to function in this manner.
Consequently, it is necessary for the new processor when translating X86 floating point instructions to provide the same status bits which are correct for each result as the result issues.
Providing correct status bits with each result as the result issues is especially difficult when pipelining floating point operations since the status bits for a floating point operation are not known until the floating point operation completes, typically four cycles after commencing. The prior art has found no solution to the problem of producing accurate status bits with each result produced other than to terminate pipelining of floating point operations and handle floating point operations one at a time.
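The timing problem can be made concrete with a small model. The sketch assumes that operation i issues at cycle i and that its status bits exist only from cycle i + 4 onward; the function names are hypothetical.

```python
LATENCY = 4  # cycles from issue until status bits are available


def status_known_at(cycle, n_ops, latency=LATENCY):
    """Indices of operations whose status bits exist by `cycle`."""
    return [i for i in range(n_ops) if i + latency <= cycle]


def in_flight_at(cycle, n_ops, latency=LATENCY):
    """Indices of operations issued but not yet completed at `cycle`."""
    return [i for i in range(n_ops) if i <= cycle < i + latency]


known = status_known_at(5, n_ops=8)
pending = in_flight_at(5, n_ops=8)
```

At cycle 5 in this model, only the first two operations have reported status while four more are still in flight, so any status read at that point cannot reflect the most recently issued operations.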
Providing correct status bits with each result while pipelining operations in the new processor is difficult not only because of the delay in generating status bits; the condition of the status bits also complicates floating point operations which have been reordered to a position in a sequence of operations at which state is to be committed by the new processor. In order to function correctly, the status bits must be correct not only for those floating point operations which have executed in their normal order but also for those floating point operations which have been reordered before state including the status bits can be committed.
Although the prior art has not been able to provide correct status bits without stopping the
