Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2003-03-04
2004-04-13
Sparks, Donald (Department: 2187)
C711S125000, C711S137000, C711S141000, C711S213000, C711S220000
active
06721849
ABSTRACT:
BACKGROUND
The present invention relates to an instruction synchronization scheme in a processing agent.
Instruction decoding can involve many different processes. For the purposes of this discussion, two different processes shall be distinguished from one another. “Instruction synchronization” refers to the act of identifying the locations of instructions within a string of instruction data. As is known, many processors operate upon variable-length instructions. The length of instructions from the Intel x86 instruction set, for example, may be from one to fifteen bytes. The instructions are often byte-aligned within a memory. A processor typically determines the location at which a first instruction begins and determines the location of other instructions iteratively, by determining the length of a current instruction and identifying the start of a subsequent instruction at the next byte following the conclusion of the current instruction. Within the processor, a “pre-decoder” may perform instruction synchronization. All other decoding operations, such as decoding of instruction type, registers and immediate values from instruction data, shall be referred to as “decoding” herein, to be performed by a “decoder.”
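The iterative procedure described above can be illustrated with a minimal sketch. It does not use the x86 encoding, whose length rules are far more involved. Instead it assumes a hypothetical toy encoding in which the low four bits of an instruction's first byte give its total length in bytes (1 to 15). The names `instruction_length` and `synchronize` are illustrative only and do not come from the patent.

```python
from typing import List


def instruction_length(first_byte: int) -> int:
    """Toy length rule (an assumption, not x86): low nibble = total length in bytes."""
    length = first_byte & 0x0F
    if length == 0:
        raise ValueError("toy encoding does not allow zero-length instructions")
    return length


def synchronize(data: bytes, start: int) -> List[int]:
    """Return the start offsets of the instructions found from `start` onward.

    Each instruction's length determines where the next one begins, so the
    walk is inherently sequential.
    """
    starts = []
    pos = start
    while pos < len(data):
        length = instruction_length(data[pos])
        if pos + length > len(data):
            break  # the instruction runs past the available data
        starts.append(pos)
        pos += length  # next instruction begins right after this one ends
    return starts


# Four instructions of lengths 2, 1, 3 and 2 bytes, byte-aligned back to back.
stream = bytes([0x02, 0xAA, 0x01, 0x03, 0xBB, 0xCC, 0x12, 0xDD])
boundaries = synchronize(stream, 0)
print(boundaries)
```

Because the start of each instruction depends on the length of the one before it, the boundaries can only be found by walking the data from a known starting point.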
FIG. 1 is a block diagram illustrating the process of program execution in a conventional processor. Program execution may include three stages: front end 110, execution 120 and memory 130. The front-end stage 110 performs instruction pre-processing. Front end processing is designed with the goal of supplying valid decoded instructions to an execution core with low latency and high bandwidth. Front-end processing can include instruction synchronization, decoding, branch prediction and renaming. As the name implies, the execution stage 120 performs instruction execution. The execution stage 120 typically communicates with a memory 130 to operate upon data stored therein.
Instruction synchronization is known per se. Typically, instruction synchronization is performed when instruction data is stored in a memory in the front-end stage. Given an instruction pointer (“IP”), the front-end stage 110 may retrieve a predetermined length of data (called a “chunk” herein) that contains the instruction referenced by the IP. The instruction itself may be located at any position within the chunk. Instruction synchronization examines all data from the location of the referenced instruction to the end of the chunk and identifies instructions therein. When the chunk is stored in a memory in the front-end stage, instruction markers also may be stored in the memory to identify the position of the instructions for later use.
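The chunk-based scheme described above can be sketched as follows. The sketch rests on several assumptions that the text does not state: a 16-byte chunk aligned to its own size, one Boolean marker per byte, and a hypothetical toy encoding in which the low four bits of an instruction's first byte give its length. `fetch_chunk` and `synchronize_chunk` are illustrative names.

```python
CHUNK_SIZE = 16  # assumed chunk length in bytes; the text does not specify one


def toy_length(first_byte):
    """Hypothetical length rule (not x86): low nibble = length in bytes."""
    length = first_byte & 0x0F
    if length == 0:
        raise ValueError("toy encoding does not allow zero-length instructions")
    return length


def fetch_chunk(memory, ip):
    """Return (base address, chunk bytes) for the aligned chunk that holds `ip`."""
    base = ip - (ip % CHUNK_SIZE)
    return base, memory[base:base + CHUNK_SIZE]


def synchronize_chunk(chunk, offset):
    """Mark instruction starts from `offset` to the end of the chunk.

    Bytes before `offset` are never examined, so they carry no markers.
    An instruction that begins in this chunk is marked even if it spills
    into the next one.
    """
    markers = [False] * len(chunk)
    pos = offset
    while pos < len(chunk):
        markers[pos] = True
        pos += toy_length(chunk[pos])
    return markers


# Two chunks of instruction data; the second holds instructions of mixed length.
memory = bytes([0x01] * 16 + [
    0x01, 0x01, 0x01, 0x01,   # four 1-byte instructions (never examined below)
    0x03, 0x00, 0x00,         # 3-byte instruction at chunk offset 4
    0x02, 0x00,               # 2-byte instruction at chunk offset 7
    0x01,                     # 1-byte instruction at chunk offset 9
    0x04, 0x00, 0x00, 0x00,   # 4-byte instruction at chunk offset 10
    0x02, 0x00,               # 2-byte instruction at chunk offset 14
])

ip = 20
base, chunk = fetch_chunk(memory, ip)
markers = synchronize_chunk(chunk, ip - base)
marked_offsets = [i for i, is_start in enumerate(markers) if is_start]
print(base, marked_offsets)
```

In this sketch the bytes at chunk offsets 0 to 3 are stored along with the rest of the chunk but receive no markers, because synchronization began at the instruction referenced by the IP.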
Prior instruction synchronization schemes suffer from some performance drawbacks. First, instruction synchronization adds latency because the process must be performed on all data from the requested instruction to the end of the chunk before the requested instruction may be used otherwise. The requested instruction is available to the execution stage 120 only after the delay introduced by the synchronization process. Second, instructions in a partially synchronized chunk may not be available even though they may be present in the front-end memory. A front-end memory may not hit on a request for an instruction in a non-synchronized portion of such a chunk. In response, although the front-end memory may store the requested instruction, the front end 110 may cause the chunk to be re-retrieved from another source and may perform instruction synchronization upon it.
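The second drawback can be illustrated with a small model of a front-end memory. This is a sketch only, not the structure of any actual processor. It assumes that a lookup hits only when the requested address falls on a stored instruction marker, and it assumes a 16-byte aligned chunk. The class `FrontEndMemory` and its methods are hypothetical.

```python
CHUNK_SIZE = 16  # assumed chunk length in bytes


class FrontEndMemory:
    """Toy model: stores chunks together with their instruction markers."""

    def __init__(self):
        # chunk base address -> (chunk bytes, set of marked byte offsets)
        self.lines = {}

    def fill(self, base, chunk, marked_offsets):
        """Store a chunk and the offsets at which instructions were identified."""
        self.lines[base] = (chunk, set(marked_offsets))

    def holds_byte(self, ip):
        """True if the byte at `ip` is physically present in the memory."""
        return (ip - (ip % CHUNK_SIZE)) in self.lines

    def lookup(self, ip):
        """Hit only if `ip` lands on a marked instruction start."""
        base = ip - (ip % CHUNK_SIZE)
        line = self.lines.get(base)
        if line is None:
            return False
        _, marked = line
        return (ip - base) in marked


fem = FrontEndMemory()
# A chunk at address 16 that was synchronized starting from offset 4 only.
fem.fill(16, bytes(16), [4, 7, 9, 10, 14])

hit_synchronized = fem.lookup(20)     # offset 4: marked, so the request hits
hit_unsynchronized = fem.lookup(17)   # offset 1: stored but unmarked, so it misses
print(hit_synchronized, hit_unsynchronized, fem.holds_byte(17))
```

In this model the request for address 17 misses even though the byte is stored, which corresponds to the case in which the chunk is retrieved and synchronized again.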
Accordingly, there is a need in the art for an instruction synchronization scheme that avoids unnecessary latency in the synchronization process.
Jourdan Stephan J.
Kyker Alan
Peugh Brian R.
Sparks Donald