Fixed shift amount variable length instruction stream...

Electrical computers and digital processing systems: processing – Instruction decoding – Decoding instruction to accommodate variable length...

Reexamination Certificate

Details

C712S204000, C712S213000

active

06260134

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to superscalar microprocessors and, more particularly, to the predecoding of variable byte-length computer instructions within high performance and high frequency superscalar microprocessors.
2. Description of the Relevant Art
Superscalar microprocessors are capable of attaining performance characteristics which surpass those of conventional scalar processors by allowing the concurrent execution of multiple instructions. Due to the widespread acceptance of the x86 family of microprocessors, efforts have been undertaken by microprocessor manufacturers to develop superscalar microprocessors that execute x86 instructions. Such superscalar microprocessors achieve relatively high performance characteristics while advantageously maintaining backwards compatibility with the vast amount of existing software developed for previous microprocessor generations such as the 8086, 80286, 80386, 80486, Pentium™, K5™, Pentium II™, and K6™.
The x86 instruction set is relatively complex and is characterized by a plurality of variable byte-length instructions. A generic format illustrative of the x86 instruction set is shown in FIG. 1A. As illustrated in the figure, an x86 instruction consists of from one to five optional prefix bytes 102, followed by an operation code (opcode) field 104, an optional addressing mode (ModR/M) byte 106, an optional scale-index-base (SIB) byte 108, an optional displacement field 110, and an optional immediate data field 112.
The opcode field 104 defines the basic operation for a particular instruction. The default operation of a particular opcode may be modified by one or more prefix bytes. For example, a prefix byte may be used to change the address or operand size for an instruction, to override the default segment used in memory addressing, or to instruct the processor to repeat a string operation a number of times. The opcode field 104 follows the prefix bytes 102, if any, and may be one or two bytes in length. The addressing mode (ModR/M) byte 106 specifies the registers used as well as memory addressing modes. The register field of the ModR/M byte alternatively may be used as an opcode extension, or sub-opcode. The scale-index-base (SIB) byte 108 is used only in 32-bit base-relative addressing using scale and index factors. A base field of the SIB byte specifies which register contains the base value for the address calculation, and an index field specifies which register contains the index value. A scale field specifies the power of two by which the index value will be multiplied before being added, along with any displacement, to the base value. The next instruction field is the optional displacement field 110, which may be from one to four bytes in length. The displacement field 110 contains a constant used in address calculations. The optional immediate field 112, which may also be from one to four bytes in length, contains a constant used as an instruction operand. The 80286 sets a maximum length for an instruction at 10 bytes, while the 80386, 80486, Pentium™, K5™, Pentium II™, and K6™ allow instruction lengths of up to 15 bytes.
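The address calculation described for the SIB byte can be sketched in a few lines. This is only an illustration of the arithmetic in the paragraph above, not circuitry from the patent; the function name `effective_address` and its parameters are hypothetical, and the register contents are passed in as plain integers.

```python
def effective_address(base, index, scale, displacement=0):
    """Return base + index * 2**scale + displacement, wrapped to 32 bits.

    base, index  : values held in the registers named by the SIB byte
    scale        : the SIB scale field, a power-of-two exponent (0..3)
    displacement : the optional constant from the displacement field
    """
    if not 0 <= scale <= 3:
        raise ValueError("the SIB scale field is two bits wide (0..3)")
    # Shifting left by `scale` multiplies the index by 2**scale.
    return (base + (index << scale) + displacement) & 0xFFFFFFFF


# Example values (chosen for illustration only): the index is multiplied by 4.
address = effective_address(base=0x1000, index=4, scale=2, displacement=8)
```

With these example values the index contributes 16 and the displacement 8, so `address` is 0x1018.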
Referring now to FIG. 1B, several different variable byte-length x86 instruction formats are shown. The shortest x86 instruction is only one byte long, and comprises a single opcode byte as shown in format (a). For certain instructions, the byte containing the opcode field also contains a register field as shown in formats (b), (c) and (e). Format (j) shows an instruction with two opcode bytes. An optional ModR/M byte follows opcode bytes in formats (d), (f), (h), and (j). Immediate data follows opcode bytes in formats (e), (g), (i), and (k), and follows a ModR/M byte in formats (f) and (h).
FIG. 1C illustrates several possible addressing mode formats (a)-(h). Formats (c), (d), (e), (g), and (h) contain ModR/M bytes with offset (i.e., displacement) information. An SIB byte is used in formats (f), (g), and (h).
The complexity of the x86 instruction set poses difficulties in implementing high performance x86 compatible superscalar microprocessors. One difficulty arises from the fact that instructions must be aligned with respect to the parallel-coupled instruction decoders of such processors before proper decode can be effectuated. In contrast to most RISC instruction formats, since the x86 instruction set consists of variable byte-length instructions, the start bytes of successive instructions within a line are not necessarily equally spaced, and the number of instructions per line is not fixed. As a result, employment of simple, fixed-length shifting logic cannot in itself solve the problem of instruction alignment.
Superscalar microprocessors have been proposed that employ instruction predecoding techniques to help solve the problem of quickly aligning, decoding and executing a plurality of variable byte-length instructions in parallel. In one such superscalar microprocessor, when instructions are written within the instruction cache from an external main memory, a predecoder appends several predecode bits (referred to collectively as a predecode tag) to each byte. These bits may indicate whether the byte is the start and/or end byte of an x86 instruction, the number of microinstructions required to implement the x86 instruction, and the location of opcodes and prefixes.
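As a rough software illustration of the per-byte predecode tag, the sketch below builds only the start and end bits for a run of instructions whose lengths are already known. The function name `predecode_tags` is hypothetical, and the other tag fields mentioned above (microinstruction count, opcode and prefix location) are left out.

```python
def predecode_tags(instruction_lengths):
    """Return (start_bits, end_bits), one entry per instruction byte.

    instruction_lengths: the byte length of each instruction, in program order.
    """
    start_bits, end_bits = [], []
    for length in instruction_lengths:
        if length < 1:
            raise ValueError("an instruction is at least one byte long")
        for offset in range(length):
            # The first byte of an instruction gets the start bit,
            # the last byte gets the end bit (both for a 1-byte instruction).
            start_bits.append(1 if offset == 0 else 0)
            end_bits.append(1 if offset == length - 1 else 0)
    return start_bits, end_bits


# Three instructions of 1, 3 and 2 bytes occupy six consecutive bytes.
starts, ends = predecode_tags([1, 3, 2])
```

Here `starts` is `[1, 1, 0, 0, 1, 0]` and `ends` is `[1, 0, 0, 1, 0, 1]`, so a one-byte instruction carries both bits on the same byte.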
Unfortunately, predecode units experience the same difficulties in aligning instructions as decode units. In one implementation, a predecode unit attempts to predecode one instruction per clock cycle. A multiplexer routes instruction bytes to the predecode unit, which determines the length of the instruction and routes that length back to the multiplexer. The multiplexer then routes the bytes subsequent to the previously predecoded instruction to the predecode unit to be predecoded in the next clock cycle. Because the instruction length is variable, the multiplexer must be able to shift the instruction bytes by anywhere from 1 to 15 byte positions, which increases the complexity of the multiplexer. Further, detecting the length of the instruction, routing the instruction length to the multiplexer, and shifting the instruction bytes by the appropriate number of positions is a time-consuming sequence that may limit the performance of the predecode unit and consequently limit the performance of the microprocessor.
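The one-instruction-per-cycle scheme can be modeled as a loop in which each iteration stands for one clock cycle. This is a simplified software sketch, not the hardware: `toy_length` is a made-up encoding (the low four bits of the first byte give the length) standing in for real x86 length decoding, and `serial_predecode` is a hypothetical name.

```python
def toy_length(first_byte):
    """Hypothetical encoding, NOT x86: low 4 bits give the length (0 means 1)."""
    return max(1, first_byte & 0x0F)


def serial_predecode(code):
    """Model one instruction per cycle; return (start_offsets, shift_amounts)."""
    starts, shifts = [], []
    position = 0
    while position < len(code):
        length = toy_length(code[position])  # length must be decoded first...
        starts.append(position)
        shifts.append(length)                # ...and only then is the shift known
        position += length
    return starts, shifts


# Toy stream: a 2-byte, a 1-byte and a 3-byte instruction.
code = bytes([0x02, 0xAA, 0x01, 0x03, 0xBB, 0xCC])
starts, shifts = serial_predecode(code)
```

For this stream the shift amounts are `[2, 1, 3]`: a different value each cycle, and each one available only after the length has been computed, which is the dependency the paragraph above identifies as the bottleneck.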
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by a predecode unit configured to predecode a fixed number of instruction bytes of variable length instructions per clock cycle. The predecode unit outputs predecode bits which identify whether any of the predecoded instruction bytes are the start byte of an instruction. An instruction alignment unit then uses the start bits to dispatch the variable byte-length instructions to a plurality of decode units that form fixed issue positions within a processor. By predecoding a fixed number of instruction bytes per clock cycle, the multiplexer that routes instruction bytes to the predecode unit shifts instruction bytes by a fixed number, which greatly simplifies the multiplexer. Furthermore, the multiplexing operation may be performed in parallel with the predecode operation because the number of byte positions by which the multiplexer shifts the instruction bytes is independent of the predecode operation. Both of these features accommodate very high frequencies of operation.
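A minimal software model of the fixed-shift idea is sketched below. It is an illustration under stated assumptions rather than the patented circuit: the window size of 4 bytes is an arbitrary choice (the text above does not give the number), `toy_length` is a made-up length encoding rather than x86 decoding, and the inner loop walks the window byte by byte where hardware would resolve the window in parallel.

```python
def toy_length(first_byte):
    """Hypothetical encoding, NOT x86: low 4 bits give the length (0 means 1)."""
    return max(1, first_byte & 0x0F)


def fixed_shift_predecode(code, window=4):
    """Predecode `window` bytes per modeled cycle; return (start_bits, cycles)."""
    start_bits = [0] * len(code)
    next_start = 0   # absolute offset of the next instruction start
    cycles = 0
    # The window always advances by the same amount, whatever the instruction
    # lengths turn out to be, so the shift does not wait on the predecode result.
    for base in range(0, len(code), window):
        cycles += 1
        for position in range(base, min(base + window, len(code))):
            if position == next_start:
                start_bits[position] = 1
                next_start = position + toy_length(code[position])
    return start_bits, cycles


# Toy stream: a 2-byte, a 1-byte and a 3-byte instruction.
code = bytes([0x02, 0xAA, 0x01, 0x03, 0xBB, 0xCC])
start_bits, cycles = fixed_shift_predecode(code)
```

For this stream `start_bits` is `[1, 0, 1, 1, 0, 0]` after two modeled cycles. The only state carried from one window to the next is `next_start`, since an instruction may begin in one window and end in a later one.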
In one embodiment, the predecode unit identifies a plurality of length vectors. Each length vector is associated with one of the instruction bytes predecoded in a clock cycle. The length vector identifies the length of an instruction if an instruction starts at the instruction byte corresponding to the length vector. A tree circuit determines in which instruction bytes instructions start. The length vector corresponding to the instruction byte in which an instruction starts identifies the instruction byte in which a subsequent instruction starts (i.
