Electrical computers and digital processing systems: memory – Address formation – Address mapping
Reexamination Certificate
2000-08-31
2003-10-14
Kim, Hong (Department: 2186)
C711S210000, C710S003000
active
06633967
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer architectures and, more specifically, to mechanisms for translating memory addresses in distributed, shared memory multiprocessor computer systems.
2. Background Information
Distributed shared memory computer systems, such as symmetric multiprocessor (SMP) systems, support high-performance application processing. Conventional SMP systems include a plurality of processors coupled together by a bus. One characteristic of SMP systems is that memory space is typically shared among all of the processors. That is, each processor accesses programs in the shared memory, and processors communicate with each other via that memory (e.g., through messages and status information left in shared address spaces). In some SMP systems, the processors may also be able to exchange signals directly. One or more operating systems are typically stored in the shared memory. These operating systems control the distribution of processes or threads among the various processors. The operating system kernels may execute on any processor, and may even execute in parallel. By allowing many different processors to execute different processes or threads simultaneously, the execution speed of a given application may be greatly increased.
FIG. 1 is a block diagram of a conventional SMP system 100. System 100 includes a plurality of processors 102a-e, each connected to a system bus 104. A memory 106 and an input/output (I/O) bridge 108 are also connected to the system bus 104. The I/O bridge 108 is also coupled to one or more I/O busses 110a-c. The I/O bridge 108 basically provides a “bridging” function between the system bus 104 and the I/O busses 110a-c. Various I/O devices 112, such as disk drives, data collection devices, keyboards, CD-ROM drives, etc., may be attached to the I/O busses 110a-c. Each processor 102a-e can access memory 106 and/or various input/output devices 112 via the system bus 104. Each processor 102a-e has at least one level of cache memory 114a-e that is private to the respective processor 102a-e.
The cache memories 114a-e typically contain an image of data from memory 106 that is being utilized by the respective processor 102a-e. Since the cache memories of two processors (e.g., caches 114b and 114e) may contain overlapping or identical images of data from main memory 106, if one processor (e.g., processor 102b) were to alter the data in its cache (e.g., cache 114b), the data in the other cache (e.g., cache 114e) would become invalid or stale. To prevent the other processor (e.g., processor 102e) from acting on invalid or stale data, SMP systems, such as system 100, typically include some type of cache coherency protocol.
In general, cache coherency protocols cause other processors to be notified when an update (e.g., a write) is about to take place at some processor's cache. Other processors, to the extent they also have copies of this same data in their caches, may then invalidate their copies of the data. The write is typically broadcast to the processors which then update the copies of the data in their local caches. Protocols or algorithms, some of which may be relatively complex, are often used to determine which entries in a cache should be overwritten when more data than can be stored in the cache is received.
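By way of illustration only, the notify-and-invalidate step described above might be sketched in Python as follows. The sketch is not taken from the specification: the `Cache` class, the `coherent_write` function, the write-through policy, and the example address `0x40` are all assumptions chosen for brevity, and it models only the invalidate variant (not the broadcast-update variant) of such a protocol.

```python
class Cache:
    """Hypothetical private per-processor cache: address -> cached value."""

    def __init__(self):
        self.lines = {}

    def read(self, memory, addr):
        # On a miss, fill the line from shared memory.
        if addr not in self.lines:
            self.lines[addr] = memory[addr]
        return self.lines[addr]


def coherent_write(writer, all_caches, memory, addr, value):
    """Write-invalidate: other caches drop their copy before the write lands."""
    for cache in all_caches:
        if cache is not writer:
            cache.lines.pop(addr, None)  # discard the stale copy, if any
    writer.lines[addr] = value
    memory[addr] = value  # write-through (an assumption to keep the sketch short)


memory = {0x40: 1}
cache_b, cache_e = Cache(), Cache()   # stand-ins for caches 114b and 114e
caches = [cache_b, cache_e]

cache_b.read(memory, 0x40)
cache_e.read(memory, 0x40)            # both caches now hold the same image
coherent_write(cache_b, caches, memory, 0x40, 2)

stale_gone = 0x40 not in cache_e.lines  # the other copy was invalidated
fresh = cache_e.read(memory, 0x40)      # the next read re-fetches the new value
```

After the write, the second cache no longer holds the old value, so its processor cannot act on stale data; the next read misses and fetches the updated value from memory.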
Processors, such as processors 102, typically refer to program instructions and data by their “logical addresses”, which are independent of that information's location in memory 106. Accordingly, as information is loaded into memory 106 (e.g., from disks or tape drives), logical addresses from the processors 102 must be translated to “physical addresses” that specify the actual locations of the respective information within memory 106. Accordingly, each processor 102 also includes an address translation device, typically a translation look-aside buffer (TLB) 116a-e. The TLBs 116 translate logical addresses to physical addresses. As information is brought into and moved around within memory 106, the information in the TLBs 116a-e must be updated. Typically, when the information in one or more TLBs 116 needs to be updated, the operating system executes a translation buffer invalidate all (TBIA) function or instruction sequence. A TLB, e.g., TLB 116c, needs to be updated each time its processor, e.g., processor 102c, changes context from one thread to another or when a new page is mapped to or removed from the specific process context. As part of the TBIA, which is specifically software initiated, the processors 102 flush the entire contents of their TLBs 116, and return acknowledgments to the operating system. When all processors 102 have acknowledged the flushing of their TLBs 116, the new data is copied into the TLBs. All TLB entries are flushed in order to simplify the operating system software charged with executing the TBIA. For example, TBIAs do not need to specify the address of any TLB entries to be invalidated; they are all invalidated.
In addition to the TBIA, some systems are capable of executing a translation buffer invalidate single (TBIS) function. Here, only a single TLB entry is invalidated. However, execution of the TBIS function is generally more complicated than the TBIA, as the TBIS must identify and specify the TLB entry to be invalidated.
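The difference between the TBIA and TBIS functions described above can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the `TLB` class, the `tbia` and `tbis` helpers, the boolean acknowledgment, and the page numbers are hypothetical and do not represent any actual operating system or processor interface.

```python
class TLB:
    """Toy translation look-aside buffer: logical page -> physical page."""

    def __init__(self):
        self.entries = {}

    def load(self, logical_page, physical_page):
        self.entries[logical_page] = physical_page

    def invalidate_all(self):
        # TBIA: no address is supplied; every entry is discarded.
        self.entries.clear()
        return True  # acknowledgment returned to the operating system

    def invalidate_single(self, logical_page):
        # TBIS: the caller must identify the one entry to discard.
        self.entries.pop(logical_page, None)
        return True


def tbia(tlbs):
    """Flush every TLB; report whether all of them acknowledged."""
    return all([tlb.invalidate_all() for tlb in tlbs])


def tbis(tlbs, logical_page):
    """Invalidate one named entry in every TLB; report the acknowledgments."""
    return all([tlb.invalidate_single(logical_page) for tlb in tlbs])


tlbs = [TLB() for _ in range(3)]      # one TLB per hypothetical processor
for tlb in tlbs:
    tlb.load(0x10, 0x7A)
    tlb.load(0x11, 0x7B)

tbis_ok = tbis(tlbs, 0x10)
after_tbis = [sorted(t.entries) for t in tlbs]  # only page 0x11 survives
tbia_ok = tbia(tlbs)
after_tbia = [len(t.entries) for t in tlbs]     # every TLB is now empty
```

Note that `tbia` takes no address argument, which mirrors the simplification described above, whereas `tbis` cannot be issued without first determining which entry is affected.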
I/O bridge 108 may also include a TLB 118. The I/O TLB 118 is used to translate addresses from the I/O domain (i.e., addresses specified by I/O devices 112) to physical addresses of memory 106 (i.e., system addresses). There are basically two ways of translating I/O domain addresses to system addresses. First, I/O addresses may be “direct mapped” to system addresses. With direct mapping, there is a one to one linear mapping of a region of I/O address space to a contiguous address space of the same size within memory 106. The translation of a direct mapped I/O domain address to a system address is relatively straightforward. In particular, a base address, which specifies where in memory 106 the direct mapped I/O space begins, is typically concatenated with some portion of the I/O domain address itself (i.e., an “offset”) to generate the translated system address. In addition to direct mapping, I/O domain addresses may be “scatter gather” mapped, which is sometimes also called graphics address relocation table (GART) mapping. With scatter gather or GART mapping, the I/O address space is broken up (typically into blocks or pages) and distributed or “scattered” about the memory space of memory 106. To translate an I/O domain address that is scatter gather mapped, the I/O TLB 118 is used. More specifically, the I/O TLB 118 keeps track of where the I/O addressed blocks are located within the space of memory 106 so that any selected I/O addressed block may be “gathered” upon request by an I/O device 112.
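The two translation schemes described above can be illustrated with a short Python sketch. It rests on several assumptions that do not come from the specification: an 8 KB page size (`PAGE_BITS = 13`), a 1 MB direct-mapped window, a base address aligned to that window, and a plain dictionary standing in for the I/O TLB. The function names are hypothetical.

```python
PAGE_BITS = 13                     # assumed 8 KB pages; not given in the source
PAGE_MASK = (1 << PAGE_BITS) - 1


def direct_map(io_addr, base, window_bits):
    """Concatenate a base address with the low-order offset bits."""
    offset = io_addr & ((1 << window_bits) - 1)
    return base | offset           # assumes base is aligned to the window size


def scatter_gather_map(io_addr, io_tlb):
    """Look up the I/O page in the TLB and reattach the in-page offset."""
    io_page = io_addr >> PAGE_BITS
    offset = io_addr & PAGE_MASK
    phys_page = io_tlb[io_page]    # a KeyError here would model a TLB miss
    return (phys_page << PAGE_BITS) | offset


# Direct mapping: a 1 MB window of I/O space placed at an assumed base.
direct = direct_map(0x00123456, base=0x80000000, window_bits=20)

# Scatter gather: consecutive I/O pages land on unrelated physical pages.
io_tlb = {0: 0x5, 1: 0x2}          # hypothetical I/O page -> physical page
sg = scatter_gather_map((1 << PAGE_BITS) | 0x1A4, io_tlb)
```

In the direct-mapped case the translation needs only the base and the offset, while the scatter gather case depends on a per-page table lookup, which is why its cached entries must be kept current.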
To keep the contents of the I/O TLB 118 up-to-date, it may also be subject to a TBIA instruction sequence from the operating system. That is, when the contents of I/O TLB 118 need to be updated, an I/O TBIA is initiated. The contents of all I/O TLBs 118 are flushed and replaced with current information. It is also possible for software to be configured in order to execute a TBIS function on I/O TLBs. These software initiated coherency protocols have generally proven sufficient for computer systems having relatively few I/O bridges 108, and thus relatively few I/O TLBs 118. As the number of I/O bridges and thus the number of I/O TLBs increases, however (so as to support additional I/O devices 112 by the system 100, for example), the processing of TBIA and/or TBIS instruction sequences for I/O TLBs begins to consume significant processing and memory resources. It may also take some time to complete the I/O TBIA and/or TBIS functions if there are many I/O bridges 108. While the I/O TBIA and/or TBIS are in process, I/O devices 112 whose memory space has been scatter gat
Choi Woo H.
Hewlett-Packard Development Company, L.P.
Kim Hong