Method and apparatus for a late pipeline enhanced floating...

: 1998-10-28
: 2002-10-01
: Follansbee, John A. (Department: 2156)
: Electrical computers and digital processing systems: processing
: Dynamic instruction dependency checking, monitoring or...
: Reducing an impact of a stall or pipeline bubble

: Reexamination Certificate
: active
: 06460134
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the field of digital electronic processors; more particularly, to processors designed for high-speed pipelined synchronous operation.
2. Description of the Related Art
Advances in digital electronic data processors have included process, circuit and microarchitecture improvements. Microarchitecture improvements have leveraged the fact that improving processes have allowed the integration of millions of devices on individual integrated circuits (ICs). With millions of devices available for very little cost, microarchitectures have evolved to deliver greater performance, even though individual devices contribute less to overall performance than in earlier microarchitectures.
One approach that researchers and scientists have investigated for improving performance is to increase the frequency at which the device operates. This is accomplished by using more aggressive circuit families such as dynamic gates, and by implementing fewer gates in each pipeline stage. Another approach is to implement multiple execution units so that more than one instruction can be executed at a time. Yet another approach is to allow each instruction to proceed at its own pace, beginning execution when encountered but not completing execution until all operands and resources are available. Pipelining all the functional units is known as super pipelining. Executing more than one instruction at a time is known as super scalarity, and allowing instructions to complete at their own pace is known as out-of-order execution. Most modern microarchitectures support some degree of each of these innovations.
An efficient microarchitecture makes efficient use of its circuits. Super scalarity and out-of-order features both reduce circuit efficiency. In the case of super scalarity, this is because multiple redundant units are present. That is, a super scalar design includes two or more ALUs, and each of those ALUs is used less frequently than the single ALU in a uniscalar (non-super scalar) design.
Out-of-order features also reduce circuit utilization efficiency because of the additional control hardware required to track individual instructions and ensure that they appear to complete in the proper order as expected by the executing program. Super pipelining has the potential to increase circuit utilization because it can operate individual circuits at a higher clock rate, but because of the synchronization requirements, i.e., latches, the percentage of each stage doing actual work is reduced, which tends to offset efficiency gains. In other words, increased instruction latency decreases circuit efficiency.
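The latch-overhead argument above can be made concrete with a small calculation. The following is only an illustrative sketch: the function name and the delay figures are hypothetical and do not come from the patent. It assumes a fixed amount of logic delay divided evenly across the stages, with a constant latch overhead added to every stage.

```python
def useful_fraction(total_logic_delay, stages, latch_overhead):
    """Fraction of each clock period spent in logic rather than in latches.

    Assumes (hypothetically) that the logic divides evenly across stages
    and that every stage pays the same latch overhead.
    """
    logic_per_stage = total_logic_delay / stages
    cycle_time = logic_per_stage + latch_overhead
    return logic_per_stage / cycle_time


# Hypothetical numbers: 40 gate delays of logic in total,
# 2 gate delays of latch overhead per stage.
for stages in (4, 8, 20):
    print(stages, round(useful_fraction(40.0, stages, 2.0), 3))
```

Under these assumptions, the deeper the pipeline, the smaller the share of each cycle that does actual work, which is the effect a latchless logic family is meant to avoid.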
Due to the complexity of modern processors, internal synchronization often requires pipeline stalls. This is because it is not necessarily known whether all resources required for an instruction to execute will be present when the instruction is initiated. For example, a store instruction may begin execution, generating an address and translating it from a virtual to a physical address before the register value to be stored is available. If the instruction gets to the point where the register value is required but is not yet available, the instruction will stall until such point as the register value appears. While these stalls are conceptually simple to implement, they often introduce critical paths into a design. Furthermore, the cost of such stalls is often overlooked as they represent additional circuitry present to manage instruction operation but not do any actual work. A stall, therefore, decreases circuit utilization.
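The store example above can be sketched as a simple cycle count. This is an assumed model for illustration only: the function and parameter names are hypothetical, and the address generation and translation work is lumped into a single latency.

```python
def store_stall_cycles(issue_cycle, addr_latency, data_ready_cycle):
    """Cycles a store waits for its data register.

    issue_cycle      -- cycle in which the store begins execution
    addr_latency     -- cycles spent generating and translating the address
    data_ready_cycle -- cycle in which the register value becomes available
    """
    need_data_cycle = issue_cycle + addr_latency
    # The store stalls only if the data arrives after the point where it is needed.
    return max(0, data_ready_cycle - need_data_cycle)


# Data arrives three cycles after the address work finishes: three stall cycles.
print(store_stall_cycles(issue_cycle=0, addr_latency=2, data_ready_cycle=5))
# Data is already available when needed: no stall.
print(store_stall_cycles(issue_cycle=0, addr_latency=2, data_ready_cycle=1))
```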
Circuit utilization is important when designing processors that must balance performance against other issues such as silicon cost, power, heat dissipation and manufacturability. It is not particularly important when the only important criterion is absolute performance. When it is necessary to deliver the greatest performance for the minimum cost, for example, circuit utilization is an important metric. As such, super scalarity and out-of-order features are unattractive, and super pipelining is attractive only when synchronization elements need not be included in the critical path.
The present invention focuses on high circuit utilization by combining a floating point unit with a graphics unit and an integer unit. It does so by implementing super pipelining in a latchless dynamic logic family, which requires no additional logic levels for synchronization. It avoids super scalarity except where functional units are sufficiently different to justify not building a single consolidated unit. It allows operations to occur at their natural latency. In other words, individual instructions deliver their results as soon as their execution unit produces them, limiting the out-of-order nature of the design to where it occurs naturally. And, it simplifies operand bypass logic by having each subsequent stage of the pipeline pass previously generated results unaffected. Finally, by predicting complex operand conditions, operand availability, execution unit availability, and write ordering, it eliminates all non-calculating stall conditions, which eliminates the need for recirculation circuits.
Additionally, the present invention illustrates the need for locating a combination functional unit at a late stage within a pipeline. Late pipeline functional units have implementation cost and simplicity advantages over a traditional, early pipeline location. If properly designed, late pipeline functional units do not need to support partial or complete cancellation, and can avoid all pipeline stalls. A late pipeline functional unit can therefore be designed without recirculating hold paths such that it is free-running, i.e., once an operation is dispatched, it will proceed in a regular, predictable fashion. There are some costs to placing units late in a pipeline, but these are considered minor.
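A free-running unit of the kind described above behaves like a plain shift register: there is no hold input, so whatever is dispatched leaves a fixed number of cycles later. The following is a minimal behavioral sketch under that assumption; the class name and interface are hypothetical and are not taken from the patent.

```python
from collections import deque


class FreeRunningPipe:
    """Fixed-depth pipeline with no hold (recirculation) path.

    Every cycle each stage hands its contents to the next stage
    unconditionally; there is no stall or cancel input.
    """

    def __init__(self, depth):
        self.stages = deque([None] * depth, maxlen=depth)

    def clock(self, dispatched=None):
        """Advance one cycle and return whatever leaves the last stage."""
        leaving = self.stages[-1]
        # With maxlen set, appendleft drops the entry at the far end,
        # which models the unconditional shift.
        self.stages.appendleft(dispatched)
        return leaving


pipe = FreeRunningPipe(depth=3)
outputs = [pipe.clock(op) for op in ["a", "b", None, None, None]]
print(outputs)
```

Because nothing can hold a stage, an operation dispatched on one call to `clock` always emerges exactly `depth` calls later, which is what makes the timing predictable.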
SUMMARY
The present invention comprises a method and apparatus for an enhanced floating point unit that supports floating point, integer, and graphics operations by combining the units into a single functional unit. The enhanced floating point unit comprises a register file coupled to a plurality of bypass multiplexers. Coupled to the bypass multiplexers are an aligner and a multiplier. And, coupled to the multiplier is an adder that further couples to a normalizer/rounder unit. The normalizer/rounder unit may comprise a normalizer and a rounder coupled in series and/or a parallel normalizer/rounder. The enhanced floating point unit of the present invention supports both integer operations and graphics operations with one functional unit.
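The order of the blocks named above (multiplier, aligner, adder, normalizer/rounder) can be illustrated with a toy software model. This is a simplified sketch under stated assumptions, not the patented datapath: values are hypothetical (mantissa, exponent) pairs meaning mantissa * 2**exponent, the exponent range is unbounded, alignment is done by exact left shifts, and rounding is round-half-to-even. Register file, bypass multiplexers, and graphics operations are not modeled.

```python
import math


def multiply(a, b):
    """Multiplier: multiply mantissas, add exponents."""
    return (a[0] * b[0], a[1] + b[1])


def align(x, exp):
    """Aligner: re-express x at a smaller-or-equal exponent (exact left shift)."""
    mant, x_exp = x
    assert exp <= x_exp
    return (mant << (x_exp - exp), exp)


def normalize_round(x, bits=24):
    """Normalizer/rounder: scale the mantissa to exactly `bits` bits, ties to even."""
    mant, exp = x
    if mant == 0:
        return (0, 0)
    sign = -1 if mant < 0 else 1
    mag = abs(mant)
    excess = mag.bit_length() - bits
    if excess < 0:                      # too few bits: shift left, value unchanged
        mag <<= -excess
        exp += excess
    elif excess > 0:                    # too many bits: shift right and round
        half = 1 << (excess - 1)
        rem = mag & ((1 << excess) - 1)
        mag >>= excess
        exp += excess
        if rem > half or (rem == half and mag & 1):
            mag += 1
            if mag.bit_length() > bits:  # rounding carried into a new bit
                mag >>= 1
                exp += 1
    return (sign * mag, exp)


def multiply_add(a, b, c, bits=24):
    """Compute a * b + c through multiplier, aligner, adder, normalizer/rounder."""
    product = multiply(a, b)
    common = min(product[1], c[1])
    total = align(product, common)[0] + align(c, common)[0]   # adder
    return normalize_round((total, common), bits)


def to_float(x):
    return math.ldexp(float(x[0]), x[1])


# Floating point: 3 * 2.5 + 0.25
print(to_float(multiply_add((3, 0), (5, -1), (1, -2))))
# Integer operands reuse the same path with all exponents set to zero: 6 * 7 + 8
print(to_float(multiply_add((6, 0), (7, 0), (8, 0))))
```

In this model an integer operation is simply the same multiply and add with zero exponents, which gives some intuition for why a single multiplier and adder might serve more than one operation type.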
Additionally, the present invention comprises a method and apparatus for a pipeline of functional units with a late pipe functional unit that executes instructions without stalling until the result is available. The present invention comprises one or more earlier functional units coupled to a late pipe functional unit. The late pipe functional unit does not begin executing instructions until all of the input operands are or will be available for execution so that the late pipe functional unit will execute instructions without stalling until the result will be available in a fixed number of cycles. The present invention further comprises a late pipe functional unit that may comprise a floating point unit, a graphics unit, or an enhanced floating point unit. And finally, the late pipe functional unit is non-stalling and/or non-cancelable.
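The dispatch rule above, starting an instruction only once its operands are or will be available, can be sketched as a small scheduling calculation. The model below is an assumption made for illustration: each operand is described by a hypothetical pair of the cycle in which it becomes available and the number of cycles after dispatch at which the unit consumes it. Neither the names nor the numbers come from the patent.

```python
def earliest_dispatch(now, operands):
    """Earliest cycle >= now at which the instruction can enter the unit.

    operands -- list of (ready_cycle, needed_offset) pairs, where
                ready_cycle is when the operand is (or is predicted to be)
                available and needed_offset is how many cycles after
                dispatch the unit first consumes it.
    """
    candidates = [now]
    for ready_cycle, needed_offset in operands:
        # Dispatching at ready_cycle - needed_offset makes the operand
        # arrive exactly when the unit needs it.
        candidates.append(ready_cycle - needed_offset)
    return max(candidates)


def result_cycle(dispatch_cycle, latency):
    """With no stalls, the result appears a fixed latency after dispatch."""
    return dispatch_cycle + latency


# One operand is already available; the other arrives in cycle 13
# but is not consumed until one cycle after dispatch.
d = earliest_dispatch(now=10, operands=[(9, 0), (13, 1)])
print(d, result_cycle(d, latency=4))
```

Once the dispatch cycle is chosen this way, no later stage ever has to wait, so the completion time follows directly from the fixed latency.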

REFERENCES:
patent: 4879676 (1989-11-01), Hansen
patent: 5424734 (1995-06-01), Hirahara et al.
patent: 5633819 (1997-05-01), Brashears et al.
patent: 5768478 (1998-06-01), Batten, Jr.
patent: 5841298 (1998-11-01), Huang
patent: 6052770 (2000-04-01), Fant
patent: 6069497 (2000-05-01), Blomgren et al.
patent: 6118304 (2000-09-01), Potter et al.

Affiliated with

Blomgren James S.

Inventor

Brooks Jeffrey S.

Inventor

Potter Terence M.

Inventor

Also associated with

Booth Matthew J.

Attorney

Booth & Wright LLP

Law Firm

Follansbee John A.

Examiner

Intrinsity, Inc.

Corporate Assignee
