Electrical computers and digital data processing systems: input/ – Input/output data processing – Peripheral configuration
Reexamination Certificate
2002-04-17
2004-04-20
Gaffin, Jeffrey (Department: 2182)
C718S001000, C710S002000, C710S003000, C711S002000, C711S100000, C711S152000, C711S165000, C711S200000, C711S202000, C711S203000, C711S205000, C711S206000, C711S207000, C711S208000, C711S209000, C711S220000, C711S221000
active
06725289
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the field of memory management in computers, in particular in the context of address mapping in order to improve I/O speed.
2. Description of the Related Art
Many computer systems depend for their speed and efficiency on the ability to rapidly transfer data between devices and system memory. In many cases, however, addressing conventions and restrictions make it necessary to perform intermediate copies of data to be transferred before the final transfer can actually take place. Such copying can severely slow down the transfer rate.
One widely used method for increasing the input/output (“I/O”—either or both) speed between certain devices (or other processes) and memory is known as “direct memory access” (DMA). DMA is a capability provided by some computer bus architectures that allows data to be sent directly from an attached device (such as a disk drive) to system memory, without intermediate action by the processor. In order to implement DMA, a portion of system memory is usually designated as an area to be used specifically for DMA operations. Obviously, time is lost whenever a block of data (such as a “page” that is not already in the designated memory portion) must be copied to or from the designated memory portion to perform a DMA transfer.
As a concrete example, modern Intel x86 processors support a physical address extension (PAE) mode that allows the hardware to address up to 64 GB of memory using 36-bit addresses. Unfortunately, many devices that directly access memory to perform I/O operations can address only a subset of this memory. For example, network interface cards with the common 32-bit PCI (Peripheral Component Interconnect) interface can address memory residing in only the lowest 4 GB of memory, even on systems that support up to 64 GB of memory. Other 32-bit PCI devices can access memory above 4 GB using a technique known as DAC (Dual Address Cycle), but this technique requires two address transfers—one for the low 32 bits and another for the high 32 bits.
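The address arithmetic behind this example can be checked with a short sketch. The 4 KB page size and the helper names below are assumptions made for illustration; they are not taken from the specification.

```python
# Illustrative only: PAGE_SIZE and both helper names are assumptions.
PAGE_SIZE = 4096          # assumed 4 KB pages
GB = 1 << 30

def addressable_bytes(address_bits):
    """Number of bytes reachable with an address of the given width."""
    return 1 << address_bits

def device_can_reach(page_number, device_address_bits=32, page_size=PAGE_SIZE):
    """True if the whole page lies within the device's addressable range."""
    last_byte = (page_number + 1) * page_size - 1
    return last_byte < addressable_bytes(device_address_bits)

print(addressable_bytes(36) // GB)   # 36-bit PAE addresses, in GB
print(addressable_bytes(32) // GB)   # 32-bit PCI addresses, in GB
print(device_can_reach(1 << 20))     # first page at the 4 GB boundary
```

With these assumptions, any page whose number is 2^20 or higher lies outside the reach of a device limited to 32-bit addresses.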
One known way to support output to “high” memory (that is, memory above 4 GB) is to copy the data from high memory to a temporary buffer in “low” memory for the DMA operation. For input operations, a portion of low memory in the temporary buffer is allocated for storage of the input data, which can then be copied to high memory. This technique is employed, for example, by the Linux 2.4 kernel, which uses the term “bounce buffer” to describe the temporary buffering and copying process. Unfortunately, copying can impose significant overhead, which results in turn in increased latency, reduced throughput, and/or increased CPU load when performing I/O.
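The copy-based approach described above can be modeled roughly as follows. This is a simplified sketch, not the Linux 2.4 implementation: the `Memory` class, the `LOW_LIMIT` constant, and the function names are hypothetical, and the device is modeled as a plain list.

```python
# Hypothetical model of bounce buffering; all names are assumptions.
LOW_LIMIT = 1 << 20   # page number of the first page at 4 GB, assuming 4 KB pages

class Memory:
    def __init__(self):
        self.pages = {}    # page number -> page contents
        self.copies = 0    # number of page copies performed

    def copy_page(self, src, dst):
        self.pages[dst] = bytes(self.pages[src])
        self.copies += 1

def dma_output(mem, page, bounce_page, device):
    """Send a page to the device, copying through low memory if needed."""
    if page >= LOW_LIMIT:
        mem.copy_page(page, bounce_page)   # high -> low copy before the DMA
        page = bounce_page
    device.append(mem.pages[page])

def dma_input(mem, page, bounce_page, data):
    """Receive data from the device, copying up to high memory if needed."""
    if page >= LOW_LIMIT:
        mem.pages[bounce_page] = data      # the device can only write low memory
        mem.copy_page(bounce_page, page)   # low -> high copy after the DMA
    else:
        mem.pages[page] = data
```

In this model every transfer to or from a high page costs one extra page copy, which is the overhead the paragraph above refers to.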
Another known technique is the remapping of memory regions (in particular, pages) as described in U.S. Pat. No. 6,075,938, Bugnion, et al., “Virtual Machine Monitors for Scalable Multiprocessors,” issued Jun. 13, 2000 (“Bugnion '938”). The basic idea of this system, which operates in the context of a NUMA (non-uniform memory access) multi-processor, is that memory pages associated with hardware memory modules that are farther away (defined in terms of access latency) are migrated or replicated by making copies in hardware memory modules closer to a process that is accessing them. The process page mappings are modified transparently to use the local page copy instead of the original remote page. In other words, the Bugnion '938 system attempts to improve access speed by improving memory locality. When it comes to I/O, however, in particular in the context of DMA, the problem is often not whether a certain memory space is sufficiently local, but rather whether it can be accessed at all.
Still other existing systems enable I/O to “high” memory by including special hardware components that provide support for memory remapping. For example, a separate I/O memory management unit (I/O MMU) may be included for I/O operations. The obvious disadvantage of this solution is its requirement for the extra hardware.
A related problem is the dynamic management of the “low” memory, which may be a scarce resource that needs to be allocated among various competing uses. In other words, if several devices or processes must compete for use of a common memory region (here, “low”) designated for high-speed I/O (such as DMA), then some mechanism must be provided to efficiently allocate its use. Such memory management is typically carried out by a component of the operating system.
What is needed is therefore a system that eliminates or at least reduces the need for copying in I/O operations to or from at least one limited memory space, especially in high-speed I/O contexts such as DMA. The system should preferably be usable not only in a conventional computer system, in particular, in its operating system, but also in computer systems that include at least one virtualized computer. Moreover, the system should preferably also be able to manage the limited memory space dynamically, and it should not require specific hardware support. This invention provides such a system and method of operation whose various aspects meet these different goals.
SUMMARY OF THE INVENTION
The invention provides a method and corresponding system implementation for performing an input/output (I/O) operation in a computer between an I/O-initiating subsystem and a device through a memory, where the memory is arranged into portions such as pages that are separately addressable using first identifiers, such as page numbers. It is assumed that, for the I/O operation, the device accesses a device-accessible space of the memory, whereas the subsystem addresses I/O requests using second (or, in the preferred virtualized embodiment, third) identifiers to some other memory space, in particular to a space of the memory that is inaccessible to the device. In other words, the subsystem does not normally address I/O requests to the region of the memory that the device accesses for I/O operations. One example of this would be DMA where the device addresses only a lower address region of the memory but the I/O-initiating subsystem addresses its requests to an upper address region.
According to the invention, a manager, in particular, a memory map within the manager, initially maps the second identifiers to respective first identifiers that identify portions of the memory in the device-inaccessible memory space. For any I/O request that meets a remapping criterion, a remapping module in the manager remaps the corresponding second identifier to one of the first identifiers that identifies a portion of the memory in the device-accessible space of the memory.
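One possible way to picture the memory map and the remapping module is sketched below. The `Manager` class, its method names, and the free list of device-accessible pages are assumptions made for illustration; the sketch also omits copying the page contents to the new location, which a real remapping would require.

```python
# Hypothetical sketch; class, method, and parameter names are assumptions.
class Manager:
    def __init__(self, low_limit, free_low_pages):
        self.low_limit = low_limit              # first identifiers below this are device-accessible
        self.free_low = list(free_low_pages)    # unused device-accessible first identifiers
        self.memory_map = {}                    # second identifier -> first identifier

    def map_initial(self, second_id, first_id):
        self.memory_map[second_id] = first_id

    def remap_if_needed(self, second_id, meets_criterion):
        """Return the first identifier to use, remapping into low memory if warranted."""
        first_id = self.memory_map[second_id]
        if meets_criterion and first_id >= self.low_limit and self.free_low:
            first_id = self.free_low.pop()      # claim a device-accessible page
            self.memory_map[second_id] = first_id
        return first_id
```

Because only the map entry changes, the I/O-initiating subsystem keeps using the same second identifier and need not be aware that the underlying portion of memory has moved.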
In cases where the I/O operation is output of a data set from the subsystem to the device, that is, a “write,” then for any I/O request that meets the remapping criterion, and for as long as the I/O request meets the remapping criterion, the manager creates and maintains a single copy of the data set in a buffer in the device-accessible space of the memory and remaps the I/O request to the single copy. For any I/O request that fails to meet the remapping criterion, a new copy of the data set is preferably created in the buffer upon each instance of the I/O request.
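The difference in copying cost between the two cases can be illustrated with a small counting sketch. The function name and the representation of requests as a list of second identifiers are assumptions; the sketch counts copies only and does not model the data itself.

```python
# Hypothetical sketch; counts buffer copies for a sequence of output requests.
def count_copies(requests, remapped_ids):
    """requests: second identifiers in request order.
    remapped_ids: identifiers that meet the remapping criterion throughout."""
    copies = 0
    have_copy = set()                   # remapped identifiers already copied once
    for second_id in requests:
        if second_id in remapped_ids:
            if second_id not in have_copy:
                have_copy.add(second_id)
                copies += 1             # single copy, then reused
        else:
            copies += 1                 # new copy on every request
    return copies

print(count_copies(["a", "a", "a", "b", "b"], {"a"}))   # "a" copied once, "b" twice
print(count_copies(["a", "a", "a", "b", "b"], set()))   # every request copied
```

Under these assumptions, the saving grows with the number of times a remapped identifier is reused for output.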
In the cases where the I/O operation is input of a data set from the device to the subsystem, that is, a “read,” then, for any I/O request that meets the remapping criterion, the data set from the device is preferably stored in the device-accessible space of the memory at a location identified by the first identifier to which the second identifier has been remapped.
One way according to the invention to decide which second identifiers are to be remapped to the device-accessible space of the memory is to calculate an activity score for at least a subset of the second identifiers used by the subsystem in an I/O request during a current measurement period. The second identifier is then remapped if its activity score exceeds a high-activity threshold value. The activity score may be calculated in different ways, for
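As a rough illustration of threshold-based selection, the sketch below assumes that the activity score is simply the number of I/O requests that used a given second identifier during the current measurement period. That scoring rule and the function name are assumptions; the text above leaves the calculation open.

```python
# Hypothetical sketch; the request-count scoring rule is an assumption.
from collections import Counter

def select_for_remapping(requests, high_activity_threshold):
    """requests: second identifiers used in I/O requests this measurement period.
    Returns the identifiers whose score exceeds the high-activity threshold."""
    scores = Counter(requests)
    return {sid for sid, score in scores.items() if score > high_activity_threshold}

print(select_for_remapping([1, 1, 1, 2, 3, 3], high_activity_threshold=2))
```

Other scoring rules could be substituted without changing the selection step, for instance scores that weight recent measurement periods more heavily.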
Govil Kinshuk
Nelson Michael
Waldspurger Carl A.
Gaffin Jeffrey
Nguyen Tanh
Pearce Jeffrey
VMware, Inc.
Transparent address remapping for high-speed I/O