Method and apparatus for linking translation lookaside...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S135000, C711S136000, C711S143000, C711S146000, C711S121000, C711S205000, C711S206000, C711S207000, C711S141000, C711S145000

active

06263403

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to the field of high performance computing. More particularly, the present invention relates to the need to maintain consistency between virtual-to-physical translations that are stored in a plurality of translation lookaside buffers (TLBs).
DESCRIPTION OF THE RELATED ART
Conventional computer systems use a technique called virtual memory to simulate more logical memory than actually exists, and to allow the computer to run several programs concurrently. Concurrent user programs access main memory addresses via virtual addresses assigned by the operating system. The mapping of the virtual addresses to the physical addresses of the main memory is a process known in the art as virtual address translation. Virtual address translation can be accomplished by any number of techniques, thereby allowing the processor (or alternatively, CPU) to access the desired information in main memory. Note that many computer systems also use virtual address translation when performing I/O operations. While the discussion below relates to virtual address translation performed by processors, those skilled in the art will recognize that the discussion is also applicable to any module in a computer system that translates virtual addresses to physical addresses, such as an I/O module.
The virtual address and physical address spaces are typically divided into equal size blocks of memory called pages, and a page table (PT) provides the translation between virtual addresses and physical addresses. Each page table entry (PTE) typically includes the virtual address, and protection and status information concerning the page. Status information typically includes information about the type of accesses the page has undergone. For example, a dirty bit indicates there has been a modification to data in the page. Because the page tables are usually large, they are stored in main memory. Therefore, each regular memory access can actually require at least two accesses, one to obtain the translation and a second to access the physical memory location.
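The two-access cost described above can be illustrated with a minimal sketch. The 4 KiB page size, the PageTableEntry fields, and the Memory and load names are assumptions introduced for illustration; the text does not specify them.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096                          # assumed page size in bytes
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 low-order bits select a byte in the page


@dataclass
class PageTableEntry:
    physical_page: int
    writable: bool = True   # protection information (hypothetical field)
    dirty: bool = False     # status information: the page has been modified


class Memory:
    """Toy main memory that holds the page table and counts every access."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> PageTableEntry
        self.data = {}
        self.accesses = 0

    def read_pte(self, vpn):
        self.accesses += 1            # first access: fetch the translation
        return self.page_table[vpn]

    def read(self, physical_address):
        self.accesses += 1            # second access: fetch the data itself
        return self.data.get(physical_address, 0)


def load(memory, virtual_address):
    vpn = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    pte = memory.read_pte(vpn)
    physical_address = (pte.physical_page << OFFSET_BITS) | offset
    return memory.read(physical_address), physical_address


memory = Memory({2: PageTableEntry(physical_page=7)})
value, physical_address = load(memory, 0x2ABC)
print(hex(physical_address), memory.accesses)  # 0x7abc 2
```

In this sketch a single load touches main memory twice, once for the PTE and once for the data, which is the overhead a TLB is intended to avoid.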
Many computer systems that support virtual address translation use a translation lookaside buffer (TLB). The TLB is typically a small, fast memory that is usually situated on or in close proximity to the processor unit (or other module) and stores recently used pairs of virtual-to-physical address translations in the form of PTEs. In a multiprocessor system, a TLB is typically provided for each processor. In addition, TLBs are often provided at the interface between I/O devices and main memory to facilitate fast virtual-to-physical translations, as discussed above.
The TLB contains a subset of the PTEs in the page table, and can be accessed much more quickly than the page table. When the processor needs information from main memory, it sends the virtual address to the TLB. The TLB accepts the virtual address page number and returns a physical page number. The physical page number is combined with low order address information to access the desired byte or word in main memory.
Typically the TLB cannot contain the entire page table, so various procedures are required to update the TLB. When a virtual address is generated by the processor, and the translation is not in the TLB, the page table is accessed to determine the translation of the virtual page number contained in the virtual address to a physical page number, and this information is entered in the TLB. Access to the page table can take twenty times longer than access to the TLB, and therefore program execution speed is optimized by keeping the translations being utilized in the TLB.
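The lookup and miss handling described above might be sketched as follows. The two-entry capacity, the least-recently-used replacement policy, and the 4 KiB page size are assumptions chosen for illustration; the text does not prescribe a replacement policy.

```python
from collections import OrderedDict

OFFSET_BITS = 12  # assumed 4 KiB pages


class TLB:
    """Toy TLB holding a small subset of the page table's translations."""

    def __init__(self, page_table, capacity=2):
        self.page_table = page_table   # virtual page number -> physical page number
        self.capacity = capacity
        self.entries = OrderedDict()   # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        vpn = virtual_address >> OFFSET_BITS
        offset = virtual_address & ((1 << OFFSET_BITS) - 1)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            # Miss: consult the (slow) page table and enter the translation.
            self.misses += 1
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[vpn] = self.page_table[vpn]
        # Combine the physical page number with the low-order offset bits.
        return (self.entries[vpn] << OFFSET_BITS) | offset


tlb = TLB({0: 5, 1: 9, 2: 7})
addresses = [0x0010, 0x0020, 0x1000, 0x2004, 0x0030]
physical = [tlb.translate(a) for a in addresses]
print([hex(p) for p in physical], tlb.hits, tlb.misses)
```

Only the second access hits in this trace; the final access to page 0 misses again because the two-entry TLB has already evicted that translation, which illustrates why keeping active translations resident matters.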
The entries of all TLBs must be kept consistent with the corresponding entries in the page table. TLB consistency is typically a software responsibility in current computer systems. Usually, software must modify a PTE in the page table, then explicitly issue a special instruction sequence to cause the appropriate TLBs in the system to invalidate any prior copies of the updated PTE. Software must then wait until all the TLBs have completed the invalidate request before continuing operation. The following pseudocode sequence illustrates a typical method of ensuring TLB consistency after a PTE has been updated in the page table:
1: // Update PTE in page table (PT)
2: PT[PTE]=newPTE;
3: // Invalidate old PTE in all TLBs
4: PURGE_TLB[VirtualPageNum]; // This is an ordered operation
5: // CPU spins while waiting for PURGE_TLB to complete
6: // All TLBs observe, execute, and acknowledge the PURGE_TLB request;
7: // CPUs continue operation
8: . . .
The pause between the PURGE_TLB command at line 4 and the continuation of operation at line 7 in the above example may delay processing for hundreds or thousands of CPU cycles in large multiprocessor systems. Such systems must multicast the PURGE_TLB command to many TLBs in tightly coupled multiprocessor systems, or unicast the PURGE_TLB commands in loosely coupled multiprocessor systems to individual TLBs that may be coupled to the main memory by a high latency interconnection fabric.
Operations which require frequent manipulation of the page table may lose many hundreds or even thousands of CPU cycles to TLB manipulation. One example of such an operation may be I/O buffer manipulation in a web or file server which employs an I/O page table for translating I/O addresses to physical memory addresses. In such a system, many PTEs may be manipulated, with each manipulation requiring a PURGE_TLB operation for each I/O request.
An added complication not shown in the above example is that many systems limit the number of PURGE_TLB requests that may be outstanding at any time, forcing the CPUs and other modules in a multiprocessor system to coordinate their page table manipulation activities, even when such coordination would not otherwise be necessary. This coordination adds software complexity and communication overhead.
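The software-managed sequence in the pseudocode (update the PTE, purge, wait for acknowledgments) can be simulated in a few lines. This is a single-threaded sketch under stated assumptions: the TLB class, the update_pte function, and the module names "cpu0", "cpu1", and "io0" are hypothetical, and the waiting that real hardware performs is reduced here to a check that no acknowledgment is outstanding.

```python
class TLB:
    """Toy TLB that caches virtual-page -> physical-page translations."""

    def __init__(self, name):
        self.name = name
        self.entries = {}

    def lookup(self, vpn, page_table):
        # On a miss, fetch the translation from the page table and cache it.
        if vpn not in self.entries:
            self.entries[vpn] = page_table[vpn]
        return self.entries[vpn]

    def purge(self, vpn):
        # Invalidate any cached copy, then acknowledge the request.
        self.entries.pop(vpn, None)
        return self.name


def update_pte(page_table, tlbs, vpn, new_ppn):
    page_table[vpn] = new_ppn                  # pseudocode line 2
    pending = {tlb.name for tlb in tlbs}
    for tlb in tlbs:                           # pseudocode line 4
        pending.discard(tlb.purge(vpn))        # pseudocode line 6
    # Pseudocode line 5: real hardware would spin here until pending is empty.
    assert not pending
    return len(tlbs)


page_table = {3: 10}
tlbs = [TLB("cpu0"), TLB("cpu1"), TLB("io0")]
for tlb in tlbs:
    tlb.lookup(3, page_table)          # every TLB now caches 3 -> 10

page_table[3] = 11                     # page table changed with no purge
stale = [tlb.lookup(3, page_table) for tlb in tlbs]

acks = update_pte(page_table, tlbs, 3, 12)
fresh = [tlb.lookup(3, page_table) for tlb in tlbs]
print(stale, acks, fresh)  # [10, 10, 10] 3 [12, 12, 12]
```

The stale list shows what goes wrong when the purge is skipped: every TLB keeps returning the old translation. The initiator in update_pte cannot proceed until all three acknowledgments have arrived, which is the pause the text attributes to lines 4 through 7.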
In a paper entitled “DEC OSF/1 Symmetric Multiprocessing” by Jeffrey M. Denham, Paula Long, and James A. Woodward, which appeared in the Digital Technical Journal (Vol. 6, No. 3) in 1994, Denham et al. disclosed a method of eliminating the broadcast operations commonly used in bus-based systems and reducing synchronization complexity. Denham et al. proposed that the CPU that intends to modify the page table use interrupts to cause software executing on other CPUs to issue their own PURGE_TLB commands to their local TLBs. By directing interrupts only to CPUs which share PTEs, the broadcast nature of normal PURGE_TLB operations can be avoided, thereby allowing a solution which scales with the number of CPUs sharing a common page table. Synchronization is simplified since interrupts are normally serialized by hardware. The method disclosed by Denham et al. still has the disadvantage of costing hundreds or thousands of CPU cycles, as the CPU modifying the page table cannot continue until the other interrupted CPUs observe, process, and acknowledge the interrupt. Also, the interrupted CPUs lose hundreds of cycles since they must service the interrupt, purge their local TLBs, and spin while waiting for the operation to complete.
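The idea of directing interrupts only to CPUs that share PTEs, as summarized above, might look like the following sketch. It is not the DEC OSF/1 implementation; the CPU class, the shootdown function, and the address-space labels "A" and "B" are hypothetical names used only to show which processors are interrupted.

```python
class CPU:
    """Toy CPU with a local TLB and a set of address spaces it is using."""

    def __init__(self, cpu_id, address_spaces):
        self.cpu_id = cpu_id
        self.address_spaces = set(address_spaces)
        self.tlb = {}                   # (address space, vpn) -> physical page
        self.interrupts_serviced = 0

    def handle_purge_interrupt(self, space, vpn):
        # Software on the interrupted CPU purges its own local TLB entry.
        self.interrupts_serviced += 1
        self.tlb.pop((space, vpn), None)
        return True                     # acknowledgment to the initiator


def shootdown(cpus, initiator, space, vpn):
    initiator.tlb.pop((space, vpn), None)
    # Interrupt only the CPUs that share the affected page table.
    targets = [c for c in cpus
               if c is not initiator and space in c.address_spaces]
    acks = [c.handle_purge_interrupt(space, vpn) for c in targets]
    assert all(acks)                    # initiator waits for every acknowledgment
    return [c.cpu_id for c in targets]


cpus = [CPU(0, {"A"}), CPU(1, {"A", "B"}), CPU(2, {"B"}), CPU(3, {"A"})]
for cpu in cpus:
    if "A" in cpu.address_spaces:
        cpu.tlb[("A", 5)] = 40          # sharers cache the translation

interrupted = shootdown(cpus, cpus[0], "A", 5)
serviced = [c.interrupts_serviced for c in cpus]
print(interrupted, serviced)  # [1, 3] [0, 1, 0, 1]
```

CPU 2 does not use address space "A" and is never interrupted, so the cost grows with the number of sharers rather than the total number of CPUs. The initiator still waits for every acknowledgment, which is the remaining disadvantage the text identifies.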
In a paper entitled “FUGU: Implementing Translation and Protection in a Multiuser, Multimodel Multiprocessor” by Kenneth Mackenzie, John Kubiatowicz, Anant Agarwal, and Frans Kaashoek, which was published on Oct. 24, 1994, Mackenzie, et al. describe a related approach for tying hardware cache coherency with TLB consistency. In their approach, when memory controller hardware observes that a PTE is likely to be modified, the memory controller invokes a CPU interrupt that causes software to send invalidate requests to the appropriate TLBs within the system. This approach appears to require more overhead than traditional approaches because software needs to perform a context switch to service the interrupt, in addition to performing the PURGE_TLB operations described above. However, the advantage of this approach over the traditional approach and the method disclosed by Denham et al. appears to be
