Electrical computers and digital processing systems: processing – Instruction alignment
Reexamination Certificate
1999-06-01
2001-03-13
Treat, William M. (Department: 2783)
active
06202142
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the field of superscalar microprocessors and, more particularly, to instruction dispatch mechanisms within superscalar microprocessors.
2. Description of the Relevant Art
Superscalar microprocessors are capable of attaining performance characteristics which surpass those of conventional scalar processors by allowing the concurrent execution of multiple instructions. Due to the widespread acceptance of the x86 family of microprocessors, efforts have been undertaken by microprocessor manufacturers to develop superscalar microprocessors which execute x86 instructions. Such superscalar microprocessors achieve relatively high performance characteristics while advantageously maintaining backwards compatibility with the vast amount of existing software developed for previous microprocessor generations such as the 8086, 80286, 80386, and 80486.
The x86 instruction set is relatively complex and is characterized by a plurality of variable byte length instructions. An x86 instruction includes from one to five optional prefix bytes followed by an operation code (opcode) field, an optional addressing mode (Mod R/M) byte, an optional scale-index-base (SIB) byte, an optional displacement field, and an optional immediate data field.
The opcode field defines the basic operation for a particular instruction. The default operation of a particular opcode may be modified by one or more prefix bytes. For example, a prefix byte may be used to change the address or operand size for an instruction, to override the default segment used in memory addressing, or to instruct the processor to repeat a string operation a number of times. The opcode field may be one or two bytes in length. The addressing mode (Mod R/M) byte specifies the registers used as well as memory addressing modes used by the instruction. The SIB byte is used only in 32-bit base-relative addressing using scale and index factors. A base field of the SIB byte specifies which register contains the base value for the address calculation, and an index field specifies which register contains the index value. A scale field specifies the power of two by which the index value will be multiplied before being added, along with any displacement, to the base value. The next instruction field is the optional displacement field, which may be from one to four bytes in length. The displacement field contains a constant used in address calculations. The optional immediate field, which may also be from one to four bytes in length, contains a constant used as an instruction operand. The shortest x86 instructions are only one byte long, and comprise a single opcode byte. The 80286 sets a maximum length for an instruction at 10 bytes, while the 80386 and 80486 both allow instruction lengths of up to 15 bytes.
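The address calculation described for the SIB byte can be sketched as follows. This is a minimal illustration, not text from the patent: the function names and the `regs` list are hypothetical, the bit positions follow the standard x86 SIB layout (scale in bits 7-6, index in bits 5-3, base in bits 2-0), and the special "no index" and "no base" encodings are ignored.

```python
def split_sib(sib):
    """Split an SIB byte into its (scale, index, base) fields."""
    if not 0 <= sib <= 0xFF:
        raise ValueError("SIB is a single byte")
    scale = (sib >> 6) & 0x3  # power of two applied to the index value
    index = (sib >> 3) & 0x7  # number of the register holding the index value
    base = sib & 0x7          # number of the register holding the base value
    return scale, index, base


def effective_address(regs, sib, displacement=0):
    """Compute base + index * 2**scale + displacement, wrapped to 32 bits.

    `regs` is a hypothetical list of eight register values indexed by
    register number; special-case encodings are not modeled.
    """
    scale, index, base = split_sib(sib)
    return (regs[base] + (regs[index] << scale) + displacement) & 0xFFFFFFFF
```

For example, with register 0 holding 0x1000 and register 1 holding 4, an SIB byte of 0x88 selects base register 0, index register 1, and a scale of 2, so a displacement of 8 yields 0x1000 + 4 * 4 + 8.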
The complexity of the x86 instruction set poses difficulties in implementing high performance x86 compatible superscalar microprocessors. One difficulty arises from the fact that instructions must be aligned with respect to the parallel-coupled instruction decoders of such processors before proper decode can be effectuated. In contrast to most RISC instruction formats, the x86 instruction set consists of variable byte length instructions. The variable byte length nature implies that the start bytes of successive instructions within a line are not necessarily equally spaced, and the number of instructions per line is not fixed. As a result, employment of simple, fixed-length shifting logic cannot in itself solve the problem of instruction alignment. Although scanning logic has been proposed to dynamically find the boundaries of instructions during the decode pipeline stage (or stages) of the processor, such a solution typically requires that the decode pipeline stage of the processor be implemented with a relatively large number of cascaded levels of logic gates and/or the allocation of several clock cycles to perform the scanning operation.
Another problem related to the detection of variable byte length instructions is incurred by microprocessors which define certain complex instructions as microcode instructions. “Microcode instructions”, as used herein, are instructions which are not directly decoded by the parallel-coupled instruction decoders of the superscalar microprocessor. Instead, microcode instructions are routed to a microcode unit which decomposes the microcode instructions into simpler operations which may be decoded by the parallel-coupled instruction decoders. The microcode instructions, therefore, must be detected prior to decode of the instructions and routed to the microcode unit.
Certain microprocessors may employ predecoding as a method for locating variable byte length instructions. However, particularly if a cache line may be partially predecoded, the predecode data may be invalid for a given cache line fetched for dispatch within the microprocessor. A method for validating the predecode data is therefore needed.
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by a microprocessor employing an instruction scanning unit in accordance with the present invention. The microprocessor employs predecoding, in which predecode information is generated for a set of instruction bytes prior to storing the instruction bytes into an instruction cache. In particular, the start and end of instructions are indicated. Additionally, a set of functional bits are defined which indicate the opcode byte of the instruction as well as the microcode/non-microcode nature of each instruction, among other things. When the instructions are fetched, the corresponding predecode data is fetched as well. The instruction scanning unit receives the predecode data, and scans the predecode data to locate the beginning and end of each instruction. The predecode data is independently scanned within multiple regions of the set of bytes, thereby increasing the number of instructions which may be located in a given clock cycle.
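As a rough illustration of the start and end predecode data and of dividing a fetched set of bytes into regions, consider the sketch below. The 16-byte line, the 8-byte region size, and the function names are assumptions made for this example; the text above does not fix any of them.

```python
def predecode_bits(lengths, line_size=16):
    """Build per-byte start and end bits from a list of instruction lengths.

    `line_size` is an assumed size; instructions that would run past the
    end of the line are simply not marked in this sketch.
    """
    start = [0] * line_size
    end = [0] * line_size
    pos = 0
    for n in lengths:
        if pos + n > line_size:
            break
        start[pos] = 1        # first byte of the instruction
        end[pos + n - 1] = 1  # last byte of the instruction
        pos += n
    return start, end


def split_regions(bits, region_size=8):
    """Cut a per-byte bit list into fixed-size regions (size is assumed)."""
    return [bits[i:i + region_size] for i in range(0, len(bits), region_size)]


# Five instructions of 2, 1, 4, 3 and 6 bytes fill a 16-byte line.
start, end = predecode_bits([2, 1, 4, 3, 6])
start_regions = split_regions(start)
end_regions = split_regions(end)
```

Each region's start and end bits can then be handed to a separate scanner, which is what lets the regions be examined independently.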
The instruction scanning unit speculatively generates instruction valid masks based upon the predecode data defining the start of instructions. A mask is generated for each byte within a particular region, assuming that that byte is an end byte of an instruction. In parallel, the predecode data defining the ends of instructions is scanned. The number of instructions ending prior to each byte in the region is counted. Subsequently, certain ones of the instruction valid masks are selected via the instruction end counts and the predecode data defining the end of instructions. If a byte is the end of an instruction and there are no instructions ending prior to that byte within the region, then the instruction valid mask corresponding to that byte is selected as the first instruction from the region. Similarly, if a second byte is the end of an instruction and there is one instruction ending prior to that second byte within the region, then the instruction valid mask corresponding to that second byte is selected as the second instruction from the region, etc. By processing the start and end predecode data separately, a faster scanning of the predecode data may be realized. The instructions identified by the instruction scanning unit are selected for dispatch into the instruction processing pipeline of the microprocessor.
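The mask generation and selection steps described above can be modeled in software as follows. This is a sequential sketch of logic that the text describes as operating in parallel; the function names are hypothetical, and the treatment of a region whose first instruction began in an earlier region (the mask simply starts at byte 0) is an assumption of this example.

```python
def speculative_masks(start_bits):
    """For each byte i, build the valid mask of an instruction ending at i."""
    n = len(start_bits)
    masks = []
    for i in range(n):
        # Walk back to the nearest start bit at or before byte i. If none is
        # found, assume the instruction began before this region.
        s = i
        while s > 0 and not start_bits[s]:
            s -= 1
        masks.append([1 if s <= j <= i else 0 for j in range(n)])
    return masks


def ends_before(end_bits):
    """Count the instructions ending prior to each byte of the region."""
    counts, seen = [], 0
    for e in end_bits:
        counts.append(seen)
        seen += e
    return counts


def select_instructions(start_bits, end_bits):
    """Pick the mask of each end byte, ordered by its end count."""
    masks = speculative_masks(start_bits)
    counts = ends_before(end_bits)
    selected = {}
    for i, e in enumerate(end_bits):
        if e:  # byte i really is an end byte, so its mask is valid
            selected[counts[i]] = masks[i]
    return [selected[k] for k in sorted(selected)]
```

For an 8-byte region with start bits `[1, 0, 1, 1, 0, 0, 0, 1]` and end bits `[0, 1, 1, 0, 0, 0, 1, 0]`, the end counts are 0 at byte 1, 1 at byte 2, and 2 at byte 6, so the masks for those bytes are selected as the first, second, and third instructions. The instruction starting at byte 7 has no end byte in the region and is not selected.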
In parallel with scanning the predecode data to identify instructions for dispatch into the instruction processing pipeline, the instruction scanning unit scans the predecode data to locate microcode instructions within the set of instruction bytes. Microcode instructions so identified are dispatched to a microcode unit as well as into the instruction processing pipeline of the microprocessor. By identifying the microcode instructions during instruction scanning, the microcode unit may begin processing the microcode instructions earlier in the instruction processing pipeline. The execution time of the microcode instructions may thereby be improved over microprocessors which identify microcode instructions at a later point in the
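The routing of microcode instructions during scanning might be sketched as below. The encoding used here, in which a set functional bit on an instruction's end byte marks it as a microcode instruction, is an assumption for illustration only; the text above says only that the functional bits indicate the microcode or non-microcode nature of each instruction. The function name and the `(first, last)` tuple representation are likewise hypothetical.

```python
def route_instructions(instructions, functional_bits):
    """Split located instructions between the pipeline and the microcode unit.

    `instructions` is a list of (first_byte, last_byte) index pairs.
    Assumed encoding: the functional bit of the end byte is 1 for a
    microcode instruction and 0 otherwise.
    """
    pipeline = []
    microcode_unit = []
    for first, last in instructions:
        # Every located instruction enters the instruction processing pipeline.
        pipeline.append((first, last))
        # Microcode instructions are additionally sent to the microcode unit.
        if functional_bits[last]:
            microcode_unit.append((first, last))
    return pipeline, microcode_unit
```

In this model a microcode instruction appears in both lists, reflecting the statement that such instructions are dispatched to the microcode unit as well as into the pipeline.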
Narayan Rammohan
Southard Shane A.
Tran Thang M.
Advanced Micro Devices, Inc.
Conley Rose & Tayon PC
Merkel Lawrence J.
Treat William M.