Electrical computers and digital processing systems: processing – Instruction alignment
Reexamination Certificate
1999-10-14
2003-04-08
Kim, Kenneth S. (Department: 2183)
C712S210000, C712S213000, C712S237000
active
06546478
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention is related to the field of processors and, more particularly, to instruction fetching mechanisms within processors.
2. Description of the Related Art
Superscalar processors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term “clock cycle” refers to an interval of time accorded to various stages of an instruction processing pipeline within the processor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term “instruction processing pipeline” is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Although the pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
A popular instruction set architecture is the x86 instruction set architecture. Due to the widespread acceptance of the x86 instruction set architecture in the computer industry, superscalar processors designed in accordance with this architecture are becoming increasingly common. The x86 instruction set architecture specifies a variable byte-length instruction set in which different instructions may occupy differing numbers of bytes. For example, the 80386 and 80486 processors allow a particular instruction to occupy a number of bytes between 1 and 15. The number of bytes occupied depends upon the particular instruction as well as various addressing mode options for the instruction.
Because instructions are variable-length, locating instruction boundaries is complicated. The length of a first instruction must be determined prior to locating a second instruction subsequent to the first instruction within an instruction stream. However, the ability to locate multiple instructions within an instruction stream during a particular clock cycle is crucial to superscalar processor operation. As operating frequencies increase (i.e. as clock cycles shorten), it becomes increasingly difficult to locate multiple instructions simultaneously.
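The serial dependency described above can be illustrated with a minimal sketch. The length encoding here is a toy assumption (the low four bits of the first byte give the instruction length); it is not x86 decoding, and the names `toy_length` and `locate_sequentially` are hypothetical.

```python
from typing import List

def toy_length(first_byte: int) -> int:
    """Toy assumption: the low 4 bits of the first byte hold the length (1-15)."""
    length = first_byte & 0x0F
    if length == 0:
        raise ValueError("toy encoding does not allow zero-length instructions")
    return length

def locate_sequentially(code: bytes) -> List[int]:
    """Return the start offset of each complete instruction in `code`.

    Each iteration needs the previous instruction's length before it can
    find the next start, which is the serial dependency described above.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        length = toy_length(code[pos])
        if pos + length > len(code):
            break  # instruction runs past the available bytes
        offsets.append(pos)
        pos += length
    return offsets

stream = bytes([0x02, 0xAA, 0x01, 0x03, 0xBB, 0xCC, 0x01])
print(locate_sequentially(stream))  # [0, 2, 3, 6]
```

Because the start of instruction N depends on the lengths of instructions 0 through N-1, the loop cannot be trivially parallelized, which is why locating several instructions in one short clock cycle is hard.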
Various predecode schemes have been proposed in which a predecoder appends information regarding each instruction byte to the instruction byte as the instruction is stored into the cache. As used herein, the term “predecoding” is used to refer to generating instruction decode information prior to storing the corresponding instruction bytes into an instruction cache of a processor. The generated information may be stored with the instruction bytes in the instruction cache. For example, an instruction byte may be indicated to be the beginning or end of an instruction. By scanning the predecode information when the corresponding instruction bytes are fetched, instructions may be located without actually attempting to decode the instruction bytes. The predecode information may be used to decrease the amount of logic needed to locate multiple variable-length instructions simultaneously. Unfortunately, these schemes become insufficient at high clock frequencies as well. A method for locating multiple instructions during a clock cycle at high frequencies is needed.
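As a rough illustration of byte-wise predecode, the sketch below marks each byte with a start bit at cache-fill time and later scans those bits at fetch time. The instruction lengths are assumed to be known to the predecoder, and the function names are hypothetical; real predecode formats differ.

```python
from typing import List

def predecode_start_bits(num_bytes: int, lengths: List[int]) -> List[bool]:
    """Set one start bit per instruction (done once, when bytes enter the cache)."""
    start_bits = [False] * num_bytes
    pos = 0
    for length in lengths:
        if pos + length > num_bytes:
            break  # instruction does not fit in this group of bytes
        start_bits[pos] = True
        pos += length
    return start_bits

def scan_for_starts(start_bits: List[bool], max_instructions: int) -> List[int]:
    """Scan the per-byte bits at fetch time; no instruction decoding is needed."""
    starts = []
    for offset, is_start in enumerate(start_bits):
        if is_start:
            starts.append(offset)
            if len(starts) == max_instructions:
                break
    return starts

bits = predecode_start_bits(7, [2, 1, 3, 1])
print(scan_for_starts(bits, 3))  # [0, 2, 3]
```

The scan avoids decoding, but it still touches every byte position, which is the work the text says becomes insufficient at high clock frequencies.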
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by a line predictor as described herein. The line predictor caches alignment information for instructions. In response to each fetch address, the line predictor provides alignment information for the instruction beginning at the fetch address, as well as for one or more additional instructions subsequent to that instruction. The alignment information may be, for example, instruction pointers, each of which directly locates a corresponding instruction within a plurality of instruction bytes fetched in response to the fetch address. Since instructions are located by the pointers, the alignment of instructions to decode units may be a low-latency, high-frequency operation. The alignment information is stored on a per-instruction basis, based on the fetch address, so predecode data stored on a byte-by-byte basis need not be scanned. In this manner, instructions may be more easily extracted from the fetched instruction bytes.
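The pointer-based extraction described above might look like the following sketch. It assumes each pointer is a byte offset into the fetched bytes, that an invalid pointer is represented as `None`, that valid pointers are in ascending order, and that an `end` offset for the last instruction is available; none of these encodings is specified in this section.

```python
from typing import List, Optional

def extract_instructions(fetched: bytes,
                         pointers: List[Optional[int]],
                         end: int) -> List[bytes]:
    """Slice instructions out of `fetched` using precomputed start offsets.

    No scanning or length decoding happens here: each valid pointer
    directly gives the start of one instruction (assumed representation).
    """
    valid = [p for p in pointers if p is not None]
    bounds = valid + [end]
    return [fetched[bounds[i]:bounds[i + 1]] for i in range(len(valid))]

fetched = bytes([0x02, 0xAA, 0x01, 0x03, 0xBB, 0xCC, 0x01, 0x00])
pointers = [0, 2, 3, None]  # three valid pointers, one unused slot
for instr in extract_instructions(fetched, pointers, end=6):
    print(instr.hex())
```

Each slice depends only on stored offsets, so in hardware the equivalent selections could proceed in parallel rather than one after another.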
The line predictor may include a memory having multiple entries, each entry storing up to a predefined maximum number of instruction pointers and a fetch address corresponding to the instruction identified by a first one of the instruction pointers. Furthermore, each entry may store additional information regarding the terminating instruction within the entry. The information may be used to aid in the processing of the instructions within the entry. In one embodiment, the additional information includes an indication of the branch displacement when the terminating instruction is a branch instruction. The branch displacement may be rapidly located and used to generate a branch target address. In another embodiment, the additional information includes the entry point for a microcode instruction when the terminating instruction is a microcode instruction. Furthermore, the microcode instruction may be identified by an instruction pointer corresponding to a particular decode unit which is coupled to the microcode unit. In this manner, the hardware for supporting microcode instructions within the decode units may be reduced.
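One way to picture such an entry is the data-structure sketch below. The field names, the maximum of four pointers, the 32-bit address width, and the `next_sequential` field are all assumptions made for illustration; the text only states that an entry holds up to a predefined maximum number of pointers, a fetch address, and information about the terminating instruction.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

MAX_POINTERS = 4  # assumed; the text says only "a predefined maximum number"

class Terminator(Enum):
    SEQUENTIAL = "sequential"
    BRANCH = "branch"
    MICROCODE = "microcode"

@dataclass(frozen=True)
class LineEntry:
    fetch_address: int                 # address of the first located instruction
    pointers: Tuple[int, ...]          # byte offsets of the located instructions
    terminator: Terminator = Terminator.SEQUENTIAL
    branch_displacement: Optional[int] = None    # used when terminator is BRANCH
    next_sequential: Optional[int] = None        # hypothetical: address after the branch
    microcode_entry_point: Optional[int] = None  # used when terminator is MICROCODE

    def __post_init__(self):
        if not 1 <= len(self.pointers) <= MAX_POINTERS:
            raise ValueError("entry must hold between 1 and MAX_POINTERS pointers")

def branch_target(entry: LineEntry) -> int:
    """Form a relative branch target from the stored displacement (32-bit wrap assumed)."""
    if entry.terminator is not Terminator.BRANCH:
        raise ValueError("entry does not terminate in a branch")
    return (entry.next_sequential + entry.branch_displacement) & 0xFFFFFFFF

entry = LineEntry(fetch_address=0x1000, pointers=(0, 2, 5),
                  terminator=Terminator.BRANCH,
                  branch_displacement=-0x10, next_sequential=0x1007)
print(hex(branch_target(entry)))  # 0xff7
```

Keeping the displacement in the entry means the target can be formed with a single addition, without first extracting the displacement from the instruction bytes.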
Broadly speaking, a processor is contemplated. The processor comprises a fetch address generation unit configured to generate a fetch address and a line predictor coupled to the fetch address generation unit. The line predictor includes a first memory comprising a plurality of entries, each entry storing a plurality of instruction pointers and control information. The line predictor is configured to select a first entry (of the plurality of entries) corresponding to the fetch address. Each of a first plurality of instruction pointers within the first entry, if valid, directly locates an instruction within a plurality of instruction bytes fetched in response to the fetch address. The control information identifies a type of a last one of a plurality of instructions located by the plurality of instruction pointers. Additionally, a computer system is contemplated including the processor and an input/output (I/O) device configured to communicate between the computer system and another computer system to which the I/O device is couplable.
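The entry selection described above can be sketched as a small lookup table. The direct-mapped organization, the table size, the full-address tag comparison, and the `train` method are assumptions for illustration only; this section does not specify how the memory is indexed or filled.

```python
from typing import List, Optional, Tuple

NUM_ENTRIES = 8  # assumed table size

# (fetch address, instruction pointers, type of the last instruction)
Entry = Tuple[int, Tuple[int, ...], str]

class LinePredictor:
    def __init__(self) -> None:
        self.table: List[Optional[Entry]] = [None] * NUM_ENTRIES

    def train(self, fetch_address: int, pointers, last_type: str) -> None:
        """Hypothetical fill path: record alignment info for a fetch address."""
        slot = fetch_address % NUM_ENTRIES
        self.table[slot] = (fetch_address, tuple(pointers), last_type)

    def lookup(self, fetch_address: int):
        """Return (pointers, last_type) on a hit, or None on a miss."""
        entry = self.table[fetch_address % NUM_ENTRIES]
        if entry is None or entry[0] != fetch_address:
            return None  # empty slot, or slot holds a different fetch address
        return entry[1], entry[2]

lp = LinePredictor()
lp.train(0x1000, [0, 2, 5], "branch")
print(lp.lookup(0x1000))  # ((0, 2, 5), 'branch')
print(lp.lookup(0x1008))  # None: same slot, different fetch address
```

On a miss, a real design would need some slower path to locate the instructions and fill the entry; that path is outside the scope of this sketch.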
Moreover, a method is contemplated. A fetch address is generated. A plurality of instruction pointers and control information are selected from a line predictor. Each of the plurality of instruction pointers, if valid, directly locates an instruction within a plurality of instruction bytes fetched in response to the fetch address. The control information identifies a type of a last one of a plurality of instructions identified by the plurality of instruction pointers.
Keller James B.
Matus Francis M.
Schakel Keith R.
Sharma Puneet
Advanced Micro Devices , Inc.
Kim Kenneth S.
Merkel Lawrence J.
Meyertons Hood Kivlin Kowert & Goetzel P.C.