Reexamination Certificate
1999-10-27
2001-02-20
Treat, William M. (Department: 2783)
Electrical computers and digital processing systems: processing
Processing control
Detecting end or completion of microprogram
C712S207000, C712S213000
active
06192468
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to microcode instruction mechanisms within microprocessors.
2. Description of the Relevant Art
Superscalar microprocessors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term “clock cycle” refers to an interval of time accorded to various stages of an instruction processing pipeline within the microprocessor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term “instruction processing pipeline” is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Although the pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
Microprocessor designers often design their products in accordance with the x86 microprocessor architecture in order to take advantage of its widespread acceptance in the computer industry. Because the x86 microprocessor architecture is pervasive, many computer programs are written in accordance with the architecture. x86-compatible microprocessors may execute these computer programs, thereby becoming more attractive to computer system designers who desire x86-capable computer systems. Such computer systems are often well received within the industry due to the wide range of available computer programs.
The x86 microprocessor architecture specifies a variable length instruction set (i.e. an instruction set in which various instructions employ differing numbers of bytes to specify that instruction). For example, the 80386 and later versions of x86 microprocessors employ between 1 and 15 bytes to specify a particular instruction. Instructions have an opcode, which may be 1-2 bytes, and additional bytes may be added to specify addressing modes, operands, and additional details regarding the instruction to be executed. Certain instructions within the x86 instruction set are quite complex, specifying multiple operations to be performed. For example, the PUSHA instruction specifies that each of the x86 registers be pushed onto a stack defined by the value in the ESP register. The corresponding operations are a store operation for each register, and decrements of the ESP register between each store operation to generate the address for the next store operation.
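The decomposition of PUSHA described above can be sketched as follows. This is only an illustrative model, not the microprocessor's actual microcode: the function name `expand_pusha`, the dictionary-based register and memory model, and the operation tuples are all assumptions made for this example. The register order is the one the x86 architecture defines for the 32-bit form of PUSHA, in which the ESP slot receives the value ESP held before the instruction began. The sketch decrements ESP before each store, as an x86 push does.

```python
# Push order for the 32-bit form of PUSHA; the ESP slot holds the value
# ESP had before the instruction started.
PUSHA_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def expand_pusha(regs, memory, word_size=4):
    """Perform PUSHA as a series of decrement-and-store steps.

    regs and memory are plain dicts (a hypothetical model, not hardware state).
    Returns the list of simple operations that were performed.
    """
    original_esp = regs["ESP"]
    ops = []
    for name in PUSHA_ORDER:
        # Decrement ESP to form the address for the next store.
        regs["ESP"] -= word_size
        value = original_esp if name == "ESP" else regs[name]
        # Store one register at the new top of stack.
        memory[regs["ESP"]] = value
        ops.append(("dec", "ESP", word_size))
        ops.append(("store", name, regs["ESP"]))
    return ops


regs = {"EAX": 1, "ECX": 2, "EDX": 3, "EBX": 4,
        "ESP": 0x1000, "EBP": 5, "ESI": 6, "EDI": 7}
memory = {}
ops = expand_pusha(regs, memory)
print(len(ops), hex(regs["ESP"]))
```

One complex instruction thus becomes sixteen simple operations in this model: eight stores and eight decrements.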
Often, complex instructions are classified as microcode instructions. Microcode instructions are transmitted to a microcode unit within the microprocessor, which decodes the complex microcode instruction and produces two or more simpler instructions for execution by the microprocessor. The simpler instructions corresponding to the microcode instruction are typically stored in a read-only memory (ROM) within the microcode unit. The microcode unit determines an address within the ROM at which the simpler instructions are stored, and transfers the instructions out of the ROM beginning at that address. Multiple clock cycles may be used to transfer the entire set of instructions within the ROM that correspond to the microcode instruction. Different instructions may require differing numbers of simpler instructions to effectuate their corresponding functions. Additionally, the number of simpler instructions corresponding to a particular microcode instruction may vary according to the addressing mode of the instruction, the operand values, and/or the options included with the instruction. The microcode unit issues the simpler instructions into the instruction processing pipeline of the microprocessor. The simpler instructions are thereafter executed in a similar fashion to other instructions. It is noted that the simpler instructions may be instructions defined within the instruction set, or may be custom instructions defined for the particular microprocessor.
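A minimal sketch of the ROM lookup described above follows. Everything named here is hypothetical: the ROM contents, the `ENTRY_POINTS` table, the instruction names, and the assumption that three simpler instructions are read per clock cycle. In particular, recording a routine's length in the entry-point table is a simplification chosen for this example. The sketch shows only the two steps named in the text: determining a start address, then transferring instructions out of the ROM over one or more clock cycles.

```python
# Hypothetical ROM contents: a flat list of simpler instructions.
ROM = ["load_a", "add_b", "store_c", "dec_sp", "store_d", "cmp_e", "set_flag"]

# Hypothetical entry-point table: microcode instruction -> (start address, length).
ENTRY_POINTS = {"COMPLEX_A": (0, 5), "COMPLEX_B": (5, 2)}


def read_routine(name, per_cycle=3):
    """Return the routine for `name`, grouped by the clock cycle in which it is read."""
    start, length = ENTRY_POINTS[name]
    routine = ROM[start:start + length]
    # Each inner list is what leaves the ROM in one clock cycle.
    return [routine[i:i + per_cycle] for i in range(0, length, per_cycle)]


for cycle, group in enumerate(read_routine("COMPLEX_A")):
    print(cycle, group)
```

In this model, a five-instruction routine takes two cycles to read, and the second cycle delivers only two instructions.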
Conversely, less complex instructions are decoded by hardware decode units within the microprocessor, without intervention by the microcode unit. The terms “directly-decoded instruction” and “fastpath instruction” will be used herein to refer to instructions which are decoded and executed by the microprocessor without the aid of a microcode unit. As opposed to microcode instructions which are reduced to simpler instructions which may be handled by the microprocessor, directly-decoded instructions are decoded and executed via hardware decode and functional units included within the microprocessor.
Unfortunately, translating microcode instructions into an arbitrary number of simpler instructions creates numerous problems. Because the microcode instructions may contain branch instructions, the next address of the microcode instruction is generated after a microcode instruction is read. In high frequency microprocessors, the time to read a microcode instruction and generate the address of the next microcode instruction may exceed the period of a clock cycle. This can create stalls in instruction dispatch and thereby reduce the throughput of the microprocessor. Additionally, if multiple microcode instructions are stored in one microcode line, the last microcode line may not contain as many microcode instructions as functional units. In this situation, functional units are not utilized in the last clock cycle, and processor throughput is reduced.
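The under-utilization in the last microcode line can be made concrete with a small calculation. The sketch assumes that one microcode line holds as many instructions as there are functional units and that every line except the last is full. Neither assumption is stated in the text, and the function name `idle_slots` is hypothetical.

```python
def idle_slots(num_microcode_instructions, num_functional_units):
    """Issue positions left empty in the last microcode line of a routine.

    Assumes at least one instruction, and that all lines but the last are full.
    """
    remainder = num_microcode_instructions % num_functional_units
    return 0 if remainder == 0 else num_functional_units - remainder


# A five-instruction routine on three functional units fills one line
# and leaves one position unused in the second line.
print(idle_slots(5, 3))
```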
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by an MROM instruction unit in accordance with the present invention. The MROM instruction unit stores the next microcode instruction address in a sequence control field appended to a previous instruction. Therefore, the next microcode instruction address is available one cycle earlier than the microcode instruction. This allows the next address to be generated in parallel with accessing the microcode instruction and thereby reduces the time delay to generate the next address and access the next microcode instruction. Additionally, the sequence control field indicates that the subsequent line is the last line in a microcode sequence and the number of microcode instructions in that line. This information is used to pack additional instruction(s) after the microcode instruction(s) to thereby fully utilize the functional units.
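The packing decision can be sketched as below. The function name `fastpath_to_pack`, its parameters, and the default of three functional units are assumptions made for illustration; this excerpt does not give the actual encoding of the sequence control field. The sketch shows only that, once the control field has reported that the upcoming line is the last one and how many microcode instructions it holds, the number of additional instructions that fit beside them follows directly.

```python
def fastpath_to_pack(next_is_last, next_count, fastpath_waiting, num_units=3):
    """Number of additional instructions to issue with the upcoming line.

    next_is_last     -- control field says the upcoming line ends the sequence
    next_count       -- microcode instructions in that last line
    fastpath_waiting -- instructions available to follow the microcode instruction
    """
    if not next_is_last:
        return 0  # only the last line is packed in this sketch
    return min(num_units - next_count, fastpath_waiting)


# The last line holds two microcode instructions, so one more instruction fits.
print(fastpath_to_pack(True, 2, fastpath_waiting=5))
```

Because the control field arrives with the preceding line, this decision can be made before the last line itself is read.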
Broadly speaking, the present invention contemplates a microcode instruction unit including an address generator and a storage device. The address generator generates an address of a first microcode instruction of a microcode sequence. The storage device is coupled to the address generator and includes a plurality of lines for storing a plurality of microcode instructions and a plurality of control fields. The plurality of control fields are associated with the plurality of microcode instructions. The control fields are configured to identify a subsequent microcode instruction in the microcode sequence, and the control fields are appended to microcode instructions previous in the microcode sequence to the microcode instructions associated with the control fields.
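The storage arrangement described above, in which the control field for a line is stored with the line that precedes it in the sequence, might be modeled as follows. This is a sketch under stated assumptions, not the claimed implementation. The class names `Control` and `Line`, the field names, and the `walk` function are hypothetical, and the excerpt does not say how the control field for the entry line is obtained, so it is simply passed in here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Control:
    """Control field for one line; stored with the PREVIOUS line in the sequence."""
    branch_target: Optional[int] = None  # None means continue at the next address
    last: bool = False                   # True when the described line ends the sequence


@dataclass
class Line:
    ops: List[str]
    control_for_next: Optional[Control]  # describes the line read after this one


def walk(rom: Dict[int, Line], entry: int, entry_control: Control) -> List[int]:
    """Return the addresses visited; each successor is chosen before its line is read."""
    visited = []
    addr, control = entry, entry_control
    while True:
        # `control` arrived one step earlier, so the successor of `addr`
        # is known without inspecting the contents of the line at `addr`.
        if control.last:
            successor = None
        elif control.branch_target is not None:
            successor = control.branch_target
        else:
            successor = addr + 1
        line = rom[addr]
        visited.append(addr)
        if successor is None:
            return visited
        # Non-last lines are expected to carry a control field for their successor.
        addr, control = successor, line.control_for_next


rom = {
    10: Line(["a", "b"], Control(branch_target=20)),  # describes line 11
    11: Line(["c"], Control(last=True)),              # describes line 20
    20: Line(["d"], None),
}
print(walk(rom, 10, Control()))
```

In this example, line 11 branches to address 20, and that fact is already recorded in the control field read together with line 10.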
The present invention further contemplates a method of controlling a microcode instruction unit comprising the steps of accessing a first microcode line and a first control field from a storage device and decoding the first control field to determine if a subsequent microcode line includes a branch instruction. If the subsequent microcode line includes a branch instruction, a first branch address is generated and stored. If the subsequent microcode line does not include a branch instruction, an address is incremented and store
Mahalingaiah Rupaka
Miller Paul K.
Advanced Micro Devices, Inc.
Conley Rose & Tayon PC
Merkel Lawrence J.
Treat William M.
Apparatus and method for detecting microbranches early