Electrical computers and digital processing systems: processing – Processing architecture – Microprocessor or multichip or multimodule processor having...
Reexamination Certificate
1998-01-14
2002-06-25
Maung, Zarni (Department: 2154)
C712S023000, C712S200000, C712S215000, C712S216000, C712S217000, C712S218000, C712S219000, C712S220000, C326S093000, C326S097000
active
06412061
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to processor design, and more particularly to a dynamic pipeline for executing instructions where the number of stages of the pipeline is dynamically modified depending upon the instruction or operation being executed.
DESCRIPTION OF THE RELATED ART
Pipelining is used in microprocessors to improve performance by overlapping multiple instructions in a pipeline structure to decrease overall execution time. Each instruction is broken down into one or more common elemental operations that are performed sequentially to complete that instruction. The pipeline structure is formed of a plurality of pipe segments or stages, where each stage performs one of the elemental operations. Thus the pipeline is similar to an assembly line where each of the elemental operations is performed in a corresponding stage of the pipeline. The instruction begins at one end of the pipeline and is completed at the other end. The stages of the pipeline are separated by registers or latches, and thus a new instruction enters the first stage of the pipeline while one or more previous instructions are being executed within subsequent stages of the pipeline. In this manner, although the time required to execute each instruction is not changed substantially, the overall execution time for a plurality of instructions is decreased.
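The effect of overlapping can be illustrated with a short cycle count. The following Python sketch assumes an idealized pipeline in which every stage takes exactly one clock cycle and no stalls occur; the function names and the example figures are hypothetical and do not come from the patent.

```python
def unpipelined_cycles(num_stages: int, num_instructions: int) -> int:
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return num_stages * num_instructions


def pipelined_cycles(num_stages: int, num_instructions: int) -> int:
    """Cycles needed when a new instruction enters the first stage every cycle.

    The first instruction takes num_stages cycles to pass through the pipeline;
    each later instruction completes one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)
```

Under these assumptions, 100 instructions on a 5-stage pipeline take 104 cycles rather than 500, while each individual instruction still spends 5 cycles in flight.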
Previously, the design of pipelines generally conformed to a few simple rules. First, the number of stages in a pipeline was determined by the most complex instruction to be performed by the processor, i.e., the number of stages was fixed to that number of stages needed to perform the most complex instruction of the processor. Thus, each instruction propagated through a fixed number of stages of the pipeline, regardless of how simple or complex that instruction was. Also, each stage was executed in a single clock cycle, and thus the speed of the clock was based on the slowest stage of the pipeline. With each edge of the clock signal, the data associated with an instruction was advanced to the next stage to perform the next elemental operation.
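The two rules above (a fixed stage count and a clock set by the slowest stage) can be sketched numerically. This Python example is only an illustration: the stage delays are invented, are given in picoseconds to keep the arithmetic exact, and ignore latch overhead.

```python
from typing import Sequence


def clock_period_ps(stage_delays_ps: Sequence[int]) -> int:
    """The clock can run no faster than the slowest stage allows."""
    return max(stage_delays_ps)


def instruction_latency_ps(stage_delays_ps: Sequence[int]) -> int:
    """In a fixed pipeline every instruction passes through every stage,
    spending one full clock period in each, however simple it is."""
    return len(stage_delays_ps) * clock_period_ps(stage_delays_ps)


# Hypothetical delays for a four-stage pipeline.
EXAMPLE_DELAYS_PS = [300, 450, 500, 350]
```

With these invented delays the clock period is 500 ps and every instruction takes 2000 ps from entry to completion, although the stage logic itself only needs 1600 ps in total.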
Pipelining has been a useful technique for improving the performance of processors for many applications. A processor using RISC (reduced instruction-set computer) principles is a prime candidate for a pipelined architecture. In a RISC processor, the instruction set is generally limited to a small number of simple functions, and thus the pipeline can be optimized to execute each of the simple instructions very quickly. Pipelining is also advantageous for use in graphics processors for the same reason. A graphics processor uses a relatively small instruction set to perform a variety of graphic data transfer operations and to execute a plurality of graphics equations. Although the present invention is not limited to any particular processor application, the preferred embodiment described below is incorporated into a graphics processor, and thus background on graphics processors is deemed appropriate.
The advent of substantial hardware improvements combined with standardized graphics languages has allowed the use of complex graphics functions in even the most common applications. For example, word processors, spreadsheets, and desktop publishing packages are now beginning to take full advantage of the improvements in graphics capabilities to improve the user interface. Although sophisticated graphics packages have been available for computer-aided drafting, design and simulation for years, three-dimensional graphic displays are now common in games, animation and multimedia communication designed for personal computers.
The architecture of the personal computer system has advanced to handle the sophisticated graphic capabilities required by modern software applications. In the simplest of designs, a single CPU handled all data functions, including graphics functions. In more complicated architectures, a separate graphics processor is provided to perform all graphic functions in order to relieve the primary CPU of this duty and to free up the CPU to perform other operations. Generally, the graphics processor is connected between a computer system bus and the video or frame buffer. The frame buffer is the memory which stores the video data that is actually displayed on the video screen. A video controller is connected to the frame buffer to convert the digital rasterized data from the frame buffer to the analog signals needed by the display device. In other more sophisticated architectures, the frame buffer is directly connected to the system bus, either separately or as part of the main memory, and thus the main CPU as well as the graphics processor can access the frame buffer memory across the system bus.
A graphics processor generally performs data transfer operations and functions for drawing points, lines, polylines, text, string text, triangles, and polygons to the frame buffer. Furthermore, the graphics processor performs many graphics functions on the data within the frame buffer, such as patterning, depth cueing, color compare, alpha blending, accumulation, texture assist, anti-aliasing, supersampling, color masking, stenciling, panning and zooming, error correction, as well as depth and color interpolation, among other functions.
It is evident that the demand for greater graphic capabilities has increased dramatically, and that computer architectures have been improved to partially meet this demand. Also, graphics processors must be capable of performing more sophisticated functions in less time in order to process the increasingly greater amounts of graphical data required by modern software applications. Although graphics processors typically use a pipelined architecture to improve speed and performance, the ever-increasing demand for more sophisticated operations has required a greater amount of time for a given stage to execute, thereby reducing performance. As processing demands increase, there is a greater need for a processor with the capability to perform more sophisticated functions in a shorter amount of time. Therefore, there is a need for improved pipelining architectures to increase processor performance, both for graphics processors and for general purpose microprocessors.
SUMMARY OF THE INVENTION
In a processor incorporating a dynamic pipeline according to the present invention, the number of stages of the pipeline is varied depending upon the complexity of the instruction being performed. The dynamic pipeline includes a set of latches to separate the stages of the pipeline. The dynamic pipeline also includes a plurality of multiplexers which dynamically alter the data path to bypass corresponding latches based on the instruction. In this manner, the number of stages is reduced for simpler instructions, i.e., the pipeline is collapsed to perform the simpler instructions in fewer clock cycles. Therefore, collapsing the pipeline to perform the simpler instructions with fewer stages results in increased speed and performance of the processor. The maximum number of stages is used for more complex operations, such as alpha-blending in a graphics application processor, while fewer stages are used for simpler operations.
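The idea of an instruction-dependent stage count can be sketched as a simple lookup. The Python example below is a hedged illustration only: the pipeline depth of four, the "copy" operation, and the number of latches each operation bypasses are all assumptions made for the example, and the result is a sum of per-instruction latencies rather than a throughput figure.

```python
from typing import Iterable

MAX_STAGES = 4  # hypothetical depth of the full pipeline

# Hypothetical number of latches bypassed for each operation.
# Alpha blending is named in the text as a complex operation that uses
# the maximum number of stages; "copy" is an invented simple operation.
LATCHES_BYPASSED = {"alpha_blend": 0, "copy": 2}


def stages_used(operation: str) -> int:
    """Each bypassed latch removes one stage from the path of this operation."""
    return MAX_STAGES - LATCHES_BYPASSED[operation]


def latency_cycles_saved(operations: Iterable[str]) -> int:
    """Cycles of per-instruction latency saved relative to a fixed
    pipeline in which every operation takes MAX_STAGES cycles."""
    return sum(MAX_STAGES - stages_used(op) for op in operations)
```

Under these assumptions, a sequence of two copies and one alpha blend saves four cycles of latency compared with a fixed four-stage pipeline.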
In the preferred embodiment, a circuit provides data to a first latch, which provides the latched data to a first operation element. The first operation element is preferably a multiplier for alpha blending. A data selector, which is preferably a multiplexer (mux), selects between the data from the circuit and the output of the first operation element and provides an output to a second latch. The second latch provides data to a second operation element. Control logic receives the instruction currently being executed and controls the data selector based on the instruction. In this manner, depending on the instruction currently being executed, the data selector can collapse the pipeline by bypassing the first latch and the multiplier.
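The data path described in this paragraph can be modeled behaviorally. The following Python sketch is a simplified software model, not the patented circuit: the class name, the fixed multiplier coefficient, the use of an increment as the second operation element, and the use of None for an empty latch are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CollapsiblePipeline:
    alpha: int = 2                 # hypothetical multiplier coefficient
    latch1: Optional[int] = None   # first latch, feeds the multiplier
    latch2: Optional[int] = None   # second latch, feeds the second element

    def multiplier(self, value: Optional[int]) -> Optional[int]:
        """First operation element (a multiplier, per the text)."""
        return None if value is None else value * self.alpha

    def second_element(self, value: Optional[int]) -> Optional[int]:
        """Second operation element; an increment is assumed here."""
        return None if value is None else value + 1

    def clock(self, circuit_data: Optional[int], bypass: bool) -> Optional[int]:
        """Advance one clock edge and return the second element's output.

        The mux picks the raw circuit data (bypassing the first latch and
        the multiplier) or the multiplier output. Both latches are updated
        together from values computed before the edge.
        """
        mux_out = circuit_data if bypass else self.multiplier(self.latch1)
        self.latch1, self.latch2 = circuit_data, mux_out
        return self.second_element(self.latch2)
```

In this model a value of 5 sent with the bypass selected reaches the second operation element after one clock edge (output 6), whereas the same value sent through the multiplier needs two edges (output 11).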
The first and second latches are preferably formed of two aligned latches. Thus, the second latch may
Cirrus Logic Inc.
El-Hady Nabil
Maung Zarni
Murphy, Esq. James J.
Winstead Sechrest & Minick