Electrical computers and digital processing systems: memory – Address formation – Address mapping
Reexamination Certificate
2001-05-24
2004-08-03
Nguyen, T. (Department: 2187)
Electrical computers and digital processing systems: memory
Address formation
Address mapping
C711S117000, C711S118000, C711S202000, C711S203000, C711S204000, C711S205000, C711S206000
Reexamination Certificate
active
06772315
ABSTRACT:
BACKGROUND
Microprocessors, including those of the x86 and Pentium families of processors available from Intel Corporation, execute instructions and manipulate data stored in a main memory, typically some amount of dynamic random-access memory, or DRAM. Modern processors execute instructions far faster than instructions and data can be made available by reasonably priced DRAM. DRAM access times thus adversely affect processor performance.
Cache memory offers the most common solution to the DRAM bottleneck. Modern processors still use relatively slow and inexpensive DRAM for main memory, but also include a smaller amount of fast, expensive static RAM (SRAM) cache memory. The SRAM cache maintains copies of frequently accessed information read from DRAM. The processor then looks for instructions and data in the cache memory before resorting to the slower main memory.
Modern computer systems must typically reference a large number of stored programs and associated program information. The size of this information necessitates an economical mass storage system, which typically comprises magnetic disk storage. The access time of this mass storage is very long compared to access times of semiconductor memories such as SRAM or DRAM, motivating the use of a memory hierarchy. The concept of virtual memory was created to simplify addressability of information within the memory hierarchy and sharing of information between programs. The following is a formal definition of the term “virtual memory,” provided in a classic text on the subject:
“Virtual memory is a hierarchical storage system of at least two levels, managed by an operating system (OS) to appear to the user as a single, large, directly-addressable main memory.”
(Computer Organization, 3rd ed., V. C. Hamacher, Z. G. Vranesic, S. G. Zacky, McGraw-Hill, New York, 1990.) Further elaboration is provided in another commonly referenced text:
“The main memory can act as a ‘cache’ for the secondary storage, usually implemented with magnetic disks. This technique is called virtual memory. There are two major motivations for virtual memory: to allow efficient and safe sharing of memory among multiple programs and to remove the programming burden of a small, limited amount of main memory.”
Computer Organization and Design: The Hardware/Software Interface, 2nd ed., David A. Patterson and John L. Hennessy, Morgan Kaufmann Publishers, Inc., San Francisco, Calif., 1998.
The upper levels of modern memory hierarchies typically include cache and main memory. “Cache is the name first chosen to represent the level of memory hierarchy between the CPU and main memory” (Computer Architecture, a Quantitative Approach, by Hennessy and Patterson, 1990, p. 408). “Main memory satisfies the demands of caches and vector units, and serves as the I/O interface as it is the destination of input as well as the source for output” (Id. at p. 425). Most main memories are composed of dynamic random-access memories, or DRAMs, while most caches are relatively faster static random-access memories, or SRAMs (Id. at p. 426). Most modern systems subdivide the memory into pages (commonly 4 KB in size), and the OS swaps pages between the main memory and the disk storage system based on an appropriate page allocation and replacement scheme.
The virtual memory is addressed by virtual addresses, which must be translated to physical addresses before cache or main-memory accesses can occur. The translation is typically performed by an address translation unit in the processor, which accesses address translation information stored in the main memory. In an x86 architecture processor, the address translation information is stored hierarchically in the form of a Page Directory consisting of multiple Page Directory Entries (or PDEs). Each PDE, in turn, references a Page Table consisting of multiple Page Table Entries (or PTEs). Each PTE, in turn, contains the physical address and attribute bits of the referenced page or Page Frame. For the specification of this invention, the translation information will be referred to herein generically as “address translation information” (ATI) and the structures used to store this information will be referred to herein as “address translation tables.” The terms “page tables,” “page directories,” or “page tables and page directories” may be used interchangeably with “address translation tables.”
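The two-level walk just described (a PDE selects a Page Table, whose PTE supplies the Page Frame address) can be sketched as follows. The dict representation and the table contents are hypothetical stand-ins for the in-memory structures an x86 processor would actually walk; attribute bits are omitted:

```python
def translate(vaddr: int, page_directory: dict) -> int:
    """Two-level x86-style translation: Page Directory -> Page Table -> frame."""
    pde_index = (vaddr >> 22) & 0x3FF  # top 10 bits index the Page Directory
    pte_index = (vaddr >> 12) & 0x3FF  # next 10 bits index the Page Table
    offset = vaddr & 0xFFF             # low 12 bits: offset within the 4 KB page

    page_table = page_directory[pde_index]  # PDE references a Page Table
    frame_base = page_table[pte_index]      # PTE holds the physical page address
    return frame_base | offset

# Hypothetical mapping: directory entry 1, table entry 1 -> physical frame 0x7A9000.
page_directory = {0x001: {0x001: 0x7A9000}}
```

With this mapping, `translate(0x00401A2C, page_directory)` yields `0x7A9A2C`: the frame base from the PTE combined with the untranslated page offset.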
Address translation tables are stored in main memory. Address translations that must reference this information thus suffer the same speed penalty as other references to main memory: namely, the CPU must wait many clock cycles while the system produces the physical address associated with a corresponding virtual address. Once again, cache memory offers the most common solution to the DRAM bottleneck. In this case, however, the cache is an address translation cache that stores the most commonly referenced set of virtual page addresses and the physical page address associated with each stored virtual page address. Using this scheme, the vast majority of address translations can be accomplished without the speed penalty associated with a request from main memory by providing the required physical address directly from the address translation cache after a small lookup time. Address translation caches are commonly referred to as translation look-aside buffers (TLBs), page translation caches (PTCs) or “translation buffers” (TBs). The term TLB will be used throughout the remainder of this specification to represent the aforementioned type of address translation cache. Many CPUs include more than one TLB for a variety of reasons related to performance and implementation complexity.
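A TLB of the kind described can be modeled as a small fixed-capacity map from virtual page numbers to physical frame numbers. The sketch below is a simplified software model, not a hardware description: it uses FIFO eviction as a stand-in, whereas real TLBs are typically set-associative with LRU-like replacement:

```python
class TLB:
    """Minimal model of a translation look-aside buffer: a small cache of
    recent virtual-page-number -> physical-frame-number translations."""

    def __init__(self, capacity: int = 64):
        self.capacity = capacity
        self.entries: dict[int, int] = {}  # VPN -> PFN

    def lookup(self, vpn: int):
        """Return the cached frame number, or None on a TLB miss."""
        return self.entries.get(vpn)

    def fill(self, vpn: int, pfn: int):
        """Install a translation after a page-table walk resolves a miss."""
        if len(self.entries) >= self.capacity:
            # Evict the oldest entry (FIFO stand-in for hardware replacement).
            self.entries.pop(next(iter(self.entries)))
        self.entries[vpn] = pfn
```

On a hit, the physical address is formed from the cached frame number and the page offset with no main-memory access; on a miss, the processor walks the address translation tables and installs the result with `fill()`.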
Conventional microprocessor/main memory combinations are well understood by those of skill in the art. The operation of one such combination is nevertheless described below to provide context for a discussion of the invention.
FIG. 1 depicts a portion of a conventional computer system 100, including a central processing unit (CPU) 102 connected to a memory controller device 104 via a system bus 106. The memory controller device 104 acts as a bridge between CPU 102 and main memory 108. Other terms are often used in the computer industry to describe this type of bridge device, including “north bridge,” “memory controller hub,” or simply “memory controller.” This device is often sold as part of a set of devices, commonly referred to as the system “chip set.” Throughout this specification, the term “memory controller device” will be used to refer to the device that serves as the main memory bridge, while the term “memory controller” will refer more narrowly to the block of logic which controls main memory access.
Memory controller device 104 is connected to a main memory 108 via a communication port 110 and to an IO controller 132 via an IO controller interface 150. Other interfaces may be optionally provided as part of memory controller device 104, but those interfaces are beyond the scope of this specification. System bus 106 conventionally includes address lines 140, data lines 142, and control lines 144. Communication port 110 likewise includes main-memory address lines, data lines, and control lines. Most interfaces also include a synchronization mechanism consisting of one or more clocks or strobes, although, for simplicity, these clocks are not shown in the figures herein.
IO controller 132 interfaces to peripherals 112 via one or more peripheral interfaces 114. Peripherals might comprise one or more of the following: a keyboard or keyboard controller, hard disk drive(s), floppy disk drive(s), mouse, joystick, serial I/O, audio system, modem, or Local Area Network (LAN). Peripherals are mentioned here for clarification purposes although the specific set of peripherals supported and means of interfacing to them are omitted for brevity.
CPU 102 includes a CPU core 116, which includes an address generation unit 118, an address translation unit 122, a bus unit 124, and a cache memory 126. Address generation unit 118
Behiel Arthur J.
Nguyen T.
Rambus Inc
Silicon Edge Law Group LLP
Translation lookaside buffer extended to provide physical...