Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1999-12-20
2003-07-15
Bragdon, Reginald G. (Department: 2188)
C711S207000, C711S144000, C711S125000
active
06594734
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to software backward compatibility in advanced microprocessor hardware, and specifically to cache coherency and self modifying code detection using an instruction translation lookaside buffer.
2. Background Information
When a new computer is introduced, it is often desirable to operate older application or operating system software in conjunction with new hardware designs. Previously, dynamic memory was relatively expensive, so older computer systems, upon which older application and operating system software would execute, had limited memory sizes available. Thus the older software applications and operating systems would use techniques to maximize the use of the available memory space. Additionally, most computers share a common memory space for storing both instructions and data. This shared memory space allows data stores into memory to modify previously stored instructions, which are generally referred to as code. Under strict timing requirements, this code modification can occur actively while a program is being executed. This occurrence of the program itself modifying an instruction within program memory is commonly referred to as self modifying code. In some cases, a data store that modifies the next instruction which is to be executed by the computer may require that the modification become effective before the next instruction can be executed. In other cases, the code modification may be the result of the computer system copying a new program into the physical memory that previously contained another program. In modern computer systems, multiple agents may be responsible for updating memory. In multiprocessor systems, each processor can perform stores to memory which could modify instructions to be executed by any or all processors in the system. Direct Memory Access (DMA) agents, such as disk controllers, can also perform stores to memory and thereby modify code. These code modifications are commonly referred to as cross modifying code.
Hereinafter, all forms of code modification of previously stored instructions within memory of a computer system during program execution, regardless of whether it includes a single processor, a multi processor, or a DMA agent, are referred to as self modifying code (SMC). The definition of self modifying code as used herein includes the more common references to self modifying code and cross modifying code.
In order to speed up program execution, cache memory was introduced into computers and microprocessors. An instruction cache memory is a fast memory of relatively small size used to store a large number of instructions from program memory. Typically, an instruction cache has cache lines of between 32 and 128 bytes in which to store instructions. Program memory, more commonly referred to simply as memory, is usually a semiconductor memory such as dynamic random access memory or DRAM. In a computer without an instruction pipeline or instruction cache memory to store instructions mirroring a portion of the program memory, self modifying code posed no significant problem. With the introduction of instruction pipelines and cache memory into computers and their microprocessors, self modifying code poses a problem. To avoid executing an old instruction stored within an instruction pipeline or an instruction cache memory, it is necessary to detect a self modifying code condition which updates program memory. This problem can be referred to as cache coherency or pipeline coherency, where the instruction cache or pipeline becomes incoherent (or stale) as compared with program memory after self modifying code occurs. This is in contrast to the problem of memory coherency, where the cache is updated and memory is stale or incoherent.
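The stale-cache condition described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name InstructionCache, the 32 byte line size, the addresses and the byte values are all hypothetical choices made for illustration and do not come from the patent.

```python
# Toy model of an instruction cache going stale after a data store.
# All names, sizes and values are illustrative assumptions.
LINE_SIZE = 32  # bytes per cache line (assumed; the text cites 32 to 128)


class InstructionCache:
    def __init__(self, memory):
        self.memory = memory  # shared program memory (a bytearray)
        self.lines = {}       # line base address -> copied bytes

    def fetch(self, addr):
        """Return the instruction byte at addr, filling the line on a miss."""
        base = addr - (addr % LINE_SIZE)
        if base not in self.lines:
            # Miss: copy the whole line from program memory.
            self.lines[base] = bytes(self.memory[base:base + LINE_SIZE])
        return self.lines[base][addr - base]

    def is_stale(self, addr):
        """True if the cached line no longer matches program memory."""
        base = addr - (addr % LINE_SIZE)
        line = self.lines.get(base)
        return line is not None and line != bytes(self.memory[base:base + LINE_SIZE])


memory = bytearray(128)
memory[40] = 0x90                 # original instruction byte
icache = InstructionCache(memory)

first = icache.fetch(40)          # miss: line at base 32 is copied into the cache
memory[40] = 0xC3                 # a data store modifies the code in memory
second = icache.fetch(40)         # hit: the cache still returns the old byte
stale = icache.is_stale(40)       # the cached line now disagrees with memory
```

Without some detection mechanism, the second fetch silently returns the old instruction, which is the incoherency that SMC detection is meant to prevent.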
In previous microprocessors manufactured by Intel Corporation, such as the Intel 80486 processor family, instructions from program memory were stored within an instruction pipeline to be executed “In-Order”. In these “In-Order” processors, SMC detection was performed by comparing the physical address of all stores to program memory against the addresses of all instructions stored within the instruction pipeline. This comparison was relatively easy because the number of instructions in the instruction pipeline was typically limited to four or five. An address match indicated that a memory location had been modified, that an instruction in the instruction pipeline was invalid, and that the present instruction pipeline should be flushed (that is, disregarded or ignored) and new instructions fetched from program memory to overwrite the flushed instructions. This comparison of addresses is generally referred to as a snoop. With a deeper instruction pipeline, snoops require additional hardware because of the additional instructions having additional addresses requiring comparison.
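The snoop described above can be sketched as a comparison of each store address against the addresses held in a short pipeline. The sketch below is a simplified software model, not the hardware of any Intel processor: the class InOrderPipeline, its methods and the addresses are hypothetical, and each instruction is assumed to occupy a single address.

```python
from collections import deque


class InOrderPipeline:
    """Simplified model of a short in-order pipeline with store snooping."""

    def __init__(self, depth=5):
        self.depth = depth      # assumed pipeline depth (text: four or five)
        self.entries = deque()  # physical addresses of instructions in flight
        self.flushes = 0

    def issue(self, phys_addr):
        """Enter an instruction; the oldest one leaves when the pipeline is full."""
        if len(self.entries) == self.depth:
            self.entries.popleft()
        self.entries.append(phys_addr)

    def snoop_store(self, store_addr):
        """Compare a store address against every instruction in the pipeline.

        On a match the whole pipeline is flushed so that the modified
        instruction is fetched again from program memory.
        """
        if any(addr == store_addr for addr in self.entries):
            self.entries.clear()
            self.flushes += 1
            return True
        return False


pipe = InOrderPipeline(depth=4)
for addr in (0x100, 0x104, 0x108):
    pipe.issue(addr)

hit = pipe.snoop_store(0x104)   # matches an in-flight instruction: flush
miss = pipe.snoop_store(0x200)  # no match: nothing to do
```

Each snoop costs one comparison per pipeline entry, which is why the approach scales poorly as pipelines grow deeper.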
In another previous microprocessor manufactured by Intel Corporation, such as the Intel P6 or Pentium™ II processor family, instructions from program memory were stored within an instruction cache memory for execution by an “Out of Order” core execution unit. “Out of Order” instruction execution is preferable in order to provide more parallelism in instruction processing. Referring now to FIG. 1, a block diagram of a prior art microprocessor 101 coupled to memory 104 is illustrated. The Next Instruction process (IP) 110, also referred to as an instruction sequencer, is a state machine and branch prediction unit that builds the flow of execution of the microprocessor 101. To support page table virtual memory accesses, the microprocessor 101 includes an instruction translation lookaside buffer (ITLB) 112. The ITLB 112 includes page table entries of linear to physical address translations into memory 104. Usually the page table entries represent the most recently used pages of memory 104 which point to a page of memory in the instruction cache 114. Instructions are fetched over the memory bus 124 by the memory controller 115 from memory 104 for storage into the instruction cache 114. In the prior art, the instruction cache 114 is physically addressed. A physical address is the lowest level of address translation and points to an actual physical location associated with physical hardware. In contrast, a linear address is an address associated with a program or other information that does not directly point into a memory, cache memory or other physical hardware. A linear address is linear relative to the program or other information. Copies of instructions within memory 104 are stored within the instruction cache 114. Instructions are taken from the instruction cache 114, decoded by the instruction decoder 116 and input into an instruction pipeline within the out of order core execution unit 118. Upon completion by the out of order core execution unit 118, an instruction is retired by the retirement unit 120. The retirement unit 120 processes instructions in program order after they have completed execution. Retirement processing includes checking for excepting conditions (such as an occurrence of self modifying code) and committing changes to architectural state. That is, the out of order core execution unit 118 executes instructions which can be completely undone before being output by the microprocessor if some excepting condition has occurred which the retirement unit has recognized.
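The linear to physical translation performed by an ITLB can be sketched as a lookup keyed by page number. This is a minimal illustration only: the class ITLB, the 4096 byte page size and the page numbers are assumptions chosen for the example, since the text above does not specify a page size or an entry format.

```python
PAGE_SIZE = 4096  # assumed page size; not stated in the text


class ITLB:
    """Toy translation lookaside buffer: linear page -> physical page."""

    def __init__(self):
        self.entries = {}

    def insert(self, linear_page, physical_page):
        """Record a translation, as a page table walk would after a miss."""
        self.entries[linear_page] = physical_page

    def translate(self, linear_addr):
        """Return the physical address, or None on an ITLB miss."""
        page, offset = divmod(linear_addr, PAGE_SIZE)
        if page not in self.entries:
            return None  # miss: the page tables would have to be consulted
        # The offset within the page is unchanged by translation.
        return self.entries[page] * PAGE_SIZE + offset


itlb = ITLB()
itlb.insert(linear_page=0x10, physical_page=0x7A)

phys = itlb.translate(0x10 * PAGE_SIZE + 0x24)  # hit: maps into physical page 0x7A
missing = itlb.translate(0x11 * PAGE_SIZE)      # miss: no entry for this page
```

Because the instruction cache in the prior art design is physically addressed, the resulting physical address is what would be used to look up the cache line.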
In “Out-Of-Order” processors, such as microprocessor 101, the number of instructions in the instruction pipeline is so great that it is impractical to compare all instructions in the pipeline of the microprocessor 101 with all modifications of program memory to be certain no changes have occurred. To do so would require too much hardware. In the prior art microprocessor 101, this problem was solved by having all store instructions executed by the out of order execution unit 118, which would execute a store instruction into the memory 104 or into a data cache within the execution unit 118, trigg
Fernando Roshan
Kyker Alan
Lee Chan
Pandya Vihang D.