Method and apparatus for jump control in a pipelined processor

Computer-aided design and analysis of circuits and semiconductor – Nanotechnology related integrated circuit design

Reexamination Certificate

Details

C716S030000, C716S030000

active

06560754

ABSTRACT:

COPYRIGHT
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
RELATED APPLICATIONS
This application is related to co-pending U.S. patent application Ser. No. 09/523,877 filed Mar. 13, 2000 and entitled “Method and Apparatus for Jump Delay Slot Control in a Pipelined Processor”, U.S. patent application Ser. No. 09/524,179 filed Mar. 13, 2000 and entitled “Method and Apparatus for Processor Pipeline Segmentation and Re-Assembly”, U.S. patent application Ser. No. 09/524,178 filed Mar. 13, 2000 and entitled “Method and Apparatus for Loose Register Encoding Within a Pipelined Processor”, and U.S. patent application Ser. No. 09/418,663 filed Oct. 14, 1999, entitled “Method and Apparatus for Managing the Configuration and Functionality of a Semiconductor Design”.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the field of integrated circuit design, specifically to the use of a hardware description language (HDL) for implementing instructions in a pipelined central processing unit (CPU) or user-customizable microprocessor.
2. Description of Related Technology
RISC (or reduced instruction set computer) processors are well known in the computing arts. RISC processors generally have the fundamental characteristic of utilizing a substantially reduced instruction set as compared to non-RISC (commonly known as “CISC”) processors. Typically, RISC processor machine instructions are not all micro-coded, but rather may be executed immediately without decoding, thereby affording significant economies in terms of processing speed. This “streamlined” instruction handling capability furthermore allows greater simplicity in the design of the processor (as compared to non-RISC devices), thereby allowing smaller silicon and reduced cost of fabrication.
RISC processors are also typically characterized by (i) load/store memory architecture (i.e., only the load and store instructions have access to memory; other instructions operate via internal registers within the processor); (ii) unity of processor and compiler; and (iii) pipelining.
Pipelining is a technique for increasing the performance of a processor by dividing the sequence of operations within the processor into segments which are effectively executed in parallel when possible. In the typical pipelined processor, the arithmetic units associated with processor arithmetic operations (such as ADD, MULTIPLY, DIVIDE, etc.) are usually “segmented”, so that a specific portion of the operation is performed in a given segment of the unit during any clock cycle.
FIG. 1 illustrates a typical processor architecture having such segmented arithmetic units. Hence, these units can operate on the results of a different calculation at any given clock cycle. As an example, in the first clock cycle two numbers A and B are fed to the multiplier unit 10 and partially processed by the first segment 12 of the unit. In the second clock cycle, the partial results from multiplying A and B are passed to the second segment 14 while the first segment 12 receives two new numbers (say C and D) to start processing. The net result is that after an initial startup period, one multiplication operation is performed by the arithmetic unit 10 every clock cycle.
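The two-segment behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not part of the disclosure: the function name `run_two_segment_multiplier` is hypothetical, and the first segment is modeled simply as a latch that holds the operands for one cycle rather than as real partial-product logic.

```python
def run_two_segment_multiplier(operand_pairs):
    """Simulate a two-segment pipelined unit, one loop pass per clock cycle.

    Returns a list of (cycle, result) tuples; cycle numbering starts at 1.
    """
    segment1 = None              # operands latched by the first segment
    pending = list(operand_pairs)
    results = []
    cycle = 0
    while pending or segment1 is not None:
        cycle += 1
        # Second segment completes the work latched in the previous cycle.
        if segment1 is not None:
            a, b = segment1
            results.append((cycle, a * b))
        # First segment accepts a new operand pair in the same cycle.
        segment1 = pending.pop(0) if pending else None
    return results


results = run_two_segment_multiplier([(2, 3), (4, 5), (6, 7)])
```

In this sketch the first product appears in cycle 2 (the startup period), and each later cycle delivers one further product, which matches the one-result-per-cycle behavior described for the arithmetic unit.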
The depth of the pipeline may vary from one architecture to another. In the present context, the term “depth” refers to the number of discrete stages present in the pipeline. In general, a pipeline with more stages executes programs faster but may be more difficult to program if the pipeline effects are visible to the programmer. Most pipelined processors have either three stages (instruction fetch, decode, and execute) or four stages (such as instruction fetch, decode, operand fetch, and execute, or alternatively instruction fetch, decode/operand fetch, execute, and writeback), although more or fewer stages may be used.
When developing the instruction set of a pipelined processor, several different types of “hazards” must be considered. For example, so-called “structural” or “resource contention” hazards arise from overlapping instructions competing for the same resources (such as busses, registers, or other functional units); these are typically resolved using one or more pipeline stalls. So-called “data” pipeline hazards occur in the case of read/write conflicts which may change the order of memory or register accesses. “Control” hazards are generally produced by branches or similar changes in program flow.
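As an illustration of the data hazards mentioned above, the following sketch classifies the read/write conflicts between two consecutive instructions. The labels RAW, WAR, and WAW (read-after-write, write-after-read, write-after-write) are standard textbook terminology and do not appear in this disclosure; the function name `data_hazards` and the `(dest, srcs)` instruction encoding are assumptions made for the example.

```python
def data_hazards(first, second):
    """Classify register conflicts when `second` follows `first`.

    Each instruction is a (dest, srcs) tuple: `dest` is the register
    written (or None), `srcs` is a tuple of registers read.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    hazards = set()
    if dest1 is not None and dest1 in srcs2:
        hazards.add("RAW")   # second reads what first writes
    if dest2 is not None and dest2 in srcs1:
        hazards.add("WAR")   # second writes what first reads
    if dest1 is not None and dest1 == dest2:
        hazards.add("WAW")   # both write the same register
    return hazards


# r1 = r2 + r3 followed by r4 = r1 - r5: the second reads r1.
conflict = data_hazards(("r1", ("r2", "r3")), ("r4", ("r1", "r5")))
```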
Interlocks are generally necessary with pipelined architectures to address many of these hazards. For example, consider the case where a following instruction (n+1) in an earlier pipeline stage needs the result of the instruction n from a later stage. A simple solution to the aforementioned problem is to delay the operand calculation in the instruction decoding phase by one or more clock cycles. A result of such delay, however, is that the execution time of a given instruction on the processor is in part determined by the instructions surrounding it within the pipeline. This complicates optimization of the code for the processor, since it is often difficult for the programmer to spot interlock situations within the code.
“Scoreboarding” may be used in the processor to implement interlocks; in this approach, a bit is attached to each processor register to act as an indicator of the register content; specifically, whether (i) the contents of the register have been updated and are therefore ready for use, or (ii) the contents are undergoing modification such as being written to by another process. This scoreboard is also used in some architectures to generate interlocks which prevent instructions which are dependent upon the contents of the scoreboarded register from executing until the scoreboard indicates that the register is ready. This type of approach is referred to as “hardware” interlocking, since the interlock is invoked purely through examination of the scoreboard via hardware within the processor. Such interlocks generate “stalls” which preclude the data dependent instruction from executing (thereby stalling the pipeline) until the register is ready.
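A minimal sketch of the scoreboard described above follows, with one flag per register. The class and method names (`Scoreboard`, `mark_pending`, `mark_ready`, `must_stall`) are hypothetical and chosen only for illustration; a real design would implement this in hardware rather than software.

```python
class Scoreboard:
    """One flag per register: True while the register awaits a write."""

    def __init__(self, num_registers):
        self.pending = [False] * num_registers

    def mark_pending(self, reg):
        # Contents are being modified and are not yet ready for use.
        self.pending[reg] = True

    def mark_ready(self, reg):
        # Contents have been updated and may be read.
        self.pending[reg] = False

    def must_stall(self, source_regs):
        # A dependent instruction stalls while any source is pending.
        return any(self.pending[r] for r in source_regs)


sb = Scoreboard(8)
sb.mark_pending(3)                  # e.g. a load into r3 is in flight
stalled = sb.must_stall([1, 3])     # instruction reading r1 and r3
sb.mark_ready(3)                    # the write to r3 completes
released = not sb.must_stall([1, 3])
```

The stall check here corresponds to the hardware interlock: the dependent instruction is held until the flag for every source register indicates ready.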
Alternatively, NOPs (no-operation opcodes) may be inserted in the code so as to delay the appropriate pipeline stage when desired. This latter approach has been referred to as “software” interlocking, and has the disadvantage of increasing the code size and complexity of programs that employ instructions that require interlocking. Heavily software interlocked designs also tend not to be fully optimized in terms of their code structures.
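Software interlocking can be pictured as a pass over the instruction stream that pads each dependency with NOPs. The sketch below assumes a hypothetical `Instr` record and a single `latency` parameter giving the number of following instruction slots in which a result is not yet available; neither is taken from the disclosure.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str] = None
    srcs: Tuple[str, ...] = ()


NOP = Instr("nop")


def insert_nops(program: List[Instr], latency: int = 1) -> List[Instr]:
    """Pad `program` so that no instruction reads a register written
    by any of the `latency` instructions emitted immediately before it."""
    out: List[Instr] = []
    for instr in program:
        while True:
            # Only the most recent `latency` slots can still conflict.
            window = out[max(0, len(out) - latency):]
            if any(p.dest is not None and p.dest in instr.srcs for p in window):
                out.append(NOP)
            else:
                break
        out.append(instr)
    return out


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r5")),   # depends on the add's result
]
padded = insert_nops(program, latency=1)
```

With `latency=1` the padded sequence is add, nop, sub: one extra opcode for a two-instruction program, which illustrates the code-size cost noted above.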
Another important consideration in processor design is program branching or “jumps”. All processors support some type of branching instructions. Simply stated, branching refers to the condition where program flow is interrupted or altered. Other operations such as loop setup and subroutine call instructions also interrupt or alter program flow in a similar fashion. The term “jump delay slot” is often used to refer to the slot within a pipeline subsequent to a branching or jump instruction being decoded. The instruction after the branch (or load) is executed while awaiting completion of the branch/load instruction. Branching may be conditional (i.e., based on the truth or value of one or more parameters) or unconditional. It may also be absolute (e.g., based on an absolute memory address), or relative (e.g., based on relative addresses and independent of any particular memory address).
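The effect of a jump delay slot on execution order can be sketched as follows. The program encoding (tuples of label, kind, and target index), the function name `run`, and the `delay_slot` switch are all assumptions made for this example; the model also assumes the instruction in the delay slot is an ordinary operation rather than another jump.

```python
def run(program, delay_slot=True, max_steps=100):
    """Return the labels of instructions in the order they execute.

    `program` is a list of (label, kind, target) tuples, where kind is
    "op", "jump", or "halt" and target is a list index for jumps.
    """
    trace = []
    pc = 0
    while pc < len(program) and len(trace) < max_steps:
        label, kind, target = program[pc]
        trace.append(label)
        if kind == "halt":
            break
        if kind == "jump":
            if delay_slot and pc + 1 < len(program):
                # The instruction following the jump executes while
                # the jump itself is being completed.
                trace.append(program[pc + 1][0])
            pc = target
        else:
            pc += 1
    return trace


program = [
    ("A", "op", None),
    ("J", "jump", 4),     # unconditional, absolute jump to index 4
    ("B", "op", None),    # sits in the jump delay slot
    ("C", "op", None),    # skipped in both cases
    ("D", "op", None),
    ("H", "halt", None),
]
with_slot = run(program, delay_slot=True)
without_slot = run(program, delay_slot=False)
```

With the delay slot enabled the order is A, J, B, D, H; with it disabled, B is not executed and the order is A, J, D, H.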
Branching can have a profound effect on pipelined systems. By the time a branch instruction is inserted and decoded by the processor's instruction decode stage (indicating that the processor must begin executing a different address), the next instruction word in the instruction sequence has b
