Mechanism for selectively imposing interference order...

Patent type: Reexamination Certificate
Date filed: 1998-05-26
Date issued: 2001-09-04
Examiner: Kim, Matthew (Department: 2186)
Classification: Electrical computers and digital processing systems: memory – Address formation – Address mapping
Other classes: C711S206000, C711S152000, C711S208000, C711S154000
Status: active
Patent number: 06286090
FIELD OF THE INVENTION
The invention relates to multiprocessor systems and, more particularly, to the efficient and selective ordering of memory reference operations issued by a processor of a multiprocessor system.
BACKGROUND OF THE INVENTION
Multiprocessor systems, such as symmetric multi-processors, provide a computer environment wherein software applications may operate on a plurality of processors using a single address space or shared memory abstraction. In a shared memory system, each processor can access any data item without a programmer having to worry about where the data is or how to obtain its value; this frees the programmer to focus on program development, e.g., algorithms, rather than managing partitioned data sets and communicating values. In such a shared memory system, interprocessor synchronization is typically accomplished by processors performing read and write operations to “synchronization variables” before and after accesses to “data variables”.
For instance, consider computer program example #1, wherein a processor P1 updates a data structure and processor P2 reads the updated structure after synchronization. Typically, this is accomplished by P1 updating data values and subsequently setting a semaphore or flag variable to indicate to P2 that the data values have been updated. P2 checks the value of the flag variable and, if set, subsequently issues read operations (requests) to retrieve the new data values. Note the significance of the term “subsequently” used above; if P1 sets the flag before it completes the data updates, or if P2 retrieves the data before it checks the value of the flag, synchronization is not achieved. The key is that each processor must individually impose an order on its memory references for such synchronization techniques to work. The order described above is referred to as a processor's inter-reference order. Commonly used synchronization techniques require that each processor be capable of imposing an inter-reference order on its memory reference operations.
Computer program example #1

    P1                          P2
    Store   Data, New-value     L1:  Load   Flag
    Store   Flag, 0                  BNZ    L1
                                     Load   Data
The inter-reference order imposed by a processor is defined by its memory reference ordering model or, more commonly, its consistency model. The consistency model for a processor architecture specifies, in part, a means by which the inter-reference order is expressed. Typically, the means is realized by inserting a special memory reference ordering instruction, such as a Memory Barrier (MB) or “fence”, between sets of memory reference instructions. Alternatively, the means may be implicit in other opcodes, such as in “test-and-set”. In addition, the model specifies the precise semantics (meaning) of the means. Two commonly used consistency models are sequential consistency and weak ordering, although those skilled in the art will recognize that there are other models, such as release consistency, that may be employed.
In a sequentially consistent system, the order in which memory reference operations appear in an execution path of the program (herein referred to as the “I-stream order”) is the inter-reference order. Additional instructions are not required to denote the order simply because each load or store instruction is considered ordered before its succeeding operation in the I-stream order. Consider computer program example #1 above. The program performs as expected on a sequentially consistent system because the system imposes the necessary inter-reference order. That is, P1's first store instruction is ordered before P1's store-to-flag instruction. Similarly, P2's load flag instruction is ordered before P2's load data instruction. Thus, if the system imposes the correct inter-reference ordering and P2 retrieves the value 0 for the flag, P2 will also retrieve the new value for data.
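For illustration only, the flag-based synchronization of computer program example #1 might be expressed in C11 as in the following sketch; the variable and function names (data, flag, producer, consumer), the values, and the initial flag state are assumptions, and memory_order_seq_cst is used here merely to approximate the behavior of a sequentially consistent system.

    #include <stdatomic.h>

    /* Hypothetical shared variables mirroring example #1. */
    static int data;                /* the data variable                   */
    static atomic_int flag = 1;     /* assumed nonzero until data is ready */

    /* P1: update the data value, then clear the flag. */
    void producer(void)
    {
        data = 42;                                         /* Store Data, New-value */
        atomic_store_explicit(&flag, 0,
                              memory_order_seq_cst);       /* Store Flag, 0         */
    }

    /* P2: spin until the flag reads 0, then read the data. */
    int consumer(void)
    {
        while (atomic_load_explicit(&flag,
                                    memory_order_seq_cst)) /* L1: Load Flag; BNZ L1 */
            ;
        return data;                                       /* Load Data             */
    }

Because the flag accesses here are sequentially consistent, the data update is ordered before the flag update and becomes visible to the consumer once it observes the flag value 0.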
In a weakly-ordered system, an order is imposed between selected sets of memory reference operations, while other operations are considered unordered. One or more MB instructions are used to indicate the required order. In the case of an MB instruction defined by the Alpha® 21264 processor instruction set, the MB denotes that all memory reference instructions above the MB (i.e., pre-MB instructions) are ordered before all reference instructions after the MB (i.e., post-MB instructions). However, no order is required between reference instructions that are not separated by an MB, except in specific circumstances such as when two references are directed to the same address.
Computer program example #2

    P1:                             P2:
    Store   Data1, New-value1       L1:  Load   Flag
    Store   Data2, New-value2            BNZ    L1
    MB                                   MB
    Store   Flag, 0                      Load   Data1
                                         Load   Data2
In the above example, the MB instruction implies that each of P1's two pre-MB store instructions is ordered before P1's store-to-flag instruction. However, there is no logical order required between the two pre-MB store instructions. Similarly, P2's two post-MB load instructions are ordered after the Load Flag instruction; yet, there is no order required between the two post-MB loads. It can thus be appreciated that weak ordering reduces the constraints on logical ordering of memory references, thereby allowing a processor to gain higher performance by potentially executing the unordered sets concurrently.
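As a rough analogue only, the effect of the MB instructions in computer program example #2 can be sketched in C11 with atomic_thread_fence; the names below are hypothetical, and the C11 release/acquire fences stand in for the Alpha MB rather than reproducing its exact semantics.

    #include <stdatomic.h>

    static int data1, data2;            /* pre-MB data variables               */
    static atomic_int flag = 1;         /* assumed nonzero until data is ready */

    /* P1: the two pre-MB stores are not ordered with respect to each other. */
    void producer(void)
    {
        data1 = 1;                                      /* Store Data1, New-value1 */
        data2 = 2;                                      /* Store Data2, New-value2 */
        atomic_thread_fence(memory_order_release);      /* MB                      */
        atomic_store_explicit(&flag, 0,
                              memory_order_relaxed);    /* Store Flag, 0           */
    }

    /* P2: the two post-MB loads are not ordered with respect to each other. */
    void consumer(int *out1, int *out2)
    {
        while (atomic_load_explicit(&flag,
                                    memory_order_relaxed)) /* L1: Load Flag; BNZ L1 */
            ;
        atomic_thread_fence(memory_order_acquire);         /* MB                    */
        *out1 = data1;                                      /* Load Data1            */
        *out2 = data2;                                      /* Load Data2            */
    }

The fences order the data accesses relative to the flag accesses but leave the two stores (and the two loads) unordered with respect to each other, mirroring the reduced constraints of the weakly-ordered model.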
Most computer systems use virtual memory to effectively manage physical memory of the systems. In a virtual memory system, programs use virtual addresses to address memory space allocated to them. The virtual addresses are translated to physical addresses which denote the actual locations in physical memory. A common process for managing virtual memory is to divide the virtual and physical memory into equal-sized pages. A system disk participates in the implementation of virtual memory by storing pages of the program not currently in physical memory. The loading of pages from the disk to physical memory is managed by the operating system.
When a program references an address in virtual memory, the processor calculates the corresponding main memory physical address in order to access data at that address. The processor typically includes a memory management unit that performs the translation of the virtual address to a physical address. Specifically, for each program there is a page table containing a list of mapping entries, i.e., page table entries (PTEs), which, in turn, contain the physical address of each virtual page of the program.
FIG. 1 is a schematic diagram of a prior art page table 100 containing a plurality of PTEs 110. An upper portion, i.e., the virtual page number (VPN 122), of a virtual address 120 is used to index into the page table 100 to access a particular PTE 110; the PTE contains a page frame number (PFN) 112 identifying the location of the page in main memory. A lower portion, i.e., the page offset 124, of the virtual address 120 is concatenated to the PFN 112 to form the physical address 130 corresponding to the virtual address. Because of its large size, the page table is generally stored in main memory; thus, every program reference to access data in the system typically requires an additional memory access to obtain the physical address, which increases the time to perform the address translation.
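A minimal sketch of the translation described above, assuming for illustration a single-level page table as in FIG. 1 and 8 KB pages; the type and field names, and the page size, are assumptions rather than details taken from the patent.

    #include <stdint.h>

    #define PAGE_SHIFT        13u                        /* assumed 8 KB pages */
    #define PAGE_OFFSET_MASK  ((1ull << PAGE_SHIFT) - 1)

    /* Illustrative page table entry (PTE 110): holds the page frame number. */
    typedef struct {
        uint64_t pfn;                                    /* PFN 112 */
    } pte_t;

    /* Translate a virtual address (120) via a page table (100):
     * the VPN (122) indexes the table, and the PFN (112) is concatenated
     * with the page offset (124) to form the physical address (130). */
    uint64_t translate(const pte_t *page_table, uint64_t vaddr)
    {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;
        uint64_t offset = vaddr & PAGE_OFFSET_MASK;
        pte_t    pte    = page_table[vpn];               /* the extra memory access */

        return (pte.pfn << PAGE_SHIFT) | offset;
    }

The load of page_table[vpn] is the additional memory access that motivates caching recently used PTEs in a translation buffer, as described next.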
To reduce address translation time, a translation buffer (TB) is used to store translation maps of recently accessed virtual addresses. The TB is similar to a data cache in that the TB contains a plurality of entries, each of which includes a tag field for holding portions of the virtual address and a data field for holding a PFN; thus, the TB functions as a cache for storing most-recently-used PTEs. When the processor requires data, the virtual address is provided to the TB and, if there is a match with the contents of the tag field, the virtual address is translated into a physical address which is used to access the data cache. If there is not a match between the virtual address and contents of the tag field, a TB miss occurs. In response to
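For illustration, a hit/miss check in a small direct-mapped translation buffer might look like the following sketch; the TB size, page size, and names are assumptions, not details of the patent.

    #include <stdbool.h>
    #include <stdint.h>

    #define TB_ENTRIES  64u                    /* assumed number of TB entries */
    #define PAGE_SHIFT  13u                    /* assumed 8 KB pages           */

    /* Illustrative TB entry: a tag field (upper VPN bits) and a data field (PFN). */
    typedef struct {
        uint64_t tag;
        uint64_t pfn;
        bool     valid;
    } tb_entry_t;

    static tb_entry_t tb[TB_ENTRIES];

    /* Return true and set *pfn on a TB hit; a false return is a TB miss,
     * which would require fetching the PTE from the in-memory page table. */
    bool tb_lookup(uint64_t vaddr, uint64_t *pfn)
    {
        uint64_t vpn   = vaddr >> PAGE_SHIFT;
        uint64_t index = vpn % TB_ENTRIES;     /* direct-mapped placement         */
        uint64_t tag   = vpn / TB_ENTRIES;     /* remaining VPN bits form the tag */

        if (tb[index].valid && tb[index].tag == tag) {
            *pfn = tb[index].pfn;              /* hit: PFN comes from the TB */
            return true;
        }
        return false;                          /* miss: walk the page table  */
    }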
Inventors: Gharachorloo, Kourosh; Sharma, Madhumitra; Steely, Simon C., Jr.; Van Doren, Stephen R.
Agents: Anderson, Matthew D.; Cesari and McKenna
Assignee: Compaq Computer Corporation
Examiner: Kim, Matthew