Title: Inifiniband channel adapter for performing direct DMA...
Patent Number: 6,594,712
Type: Reexamination Certificate
Status: active
Filed: 2000-10-20
Issued: 2003-07-15
Primary Examiner: Huynh, Kim (Department: 2182)
US Classification: Electrical computers and digital data processing systems: input/output – Input/output data processing – Direct memory accessing
Other Classes: 710/26, 710/28, 710/36, 709/212
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to the Infiniband high-speed serial link architecture, and more particularly to a method for performing remote direct memory access data transfers through the architecture.
2. Description of the Related Art
The need for speed in transferring data between computers and their peripheral devices, such as storage devices and network interface devices, and between computers themselves is ever increasing. The growth of the Internet is one significant cause of this need for increased data transfer rates.
The need for increased reliability in these data transfers is also ever growing. These needs have culminated in the development of the Infiniband™ Architecture (IBA), which is a high speed, highly reliable, serial computer interconnect technology. The IBA specifies interconnection speeds of 2.5 Gbps (Gigabits per second), 10 Gbps and 30 Gbps between IB-capable computers and I/O units, referred to collectively as IB end nodes.
One feature of the IBA that facilitates high-speed data transfers is the Remote Direct Memory Access (RDMA) operation. The IBA specifies an RDMA Write and an RDMA Read operation for transferring large amounts of data between IB nodes. The RDMA Write operation is performed by a source IB node transmitting one or more RDMA Write packets including payload data to the destination IB node. The RDMA Read operation is performed by a requesting IB node transmitting an RDMA Read Request packet to a responding IB node and the responding IB node transmitting one or more RDMA Read Response packets including payload data.
One useful feature of RDMA Write/Read packets is that they include a virtual address identifying the location in the system memory of the destination/responding IB node to or from which the data is to be transferred. The IB Channel Adapter in the destination/responding IB node performs the virtual-to-physical translation, relieving that node's operating system of the task. This allows, for example, application programs to specify the virtual addresses of buffers in their own memory directly, without involving the operating system in an address translation or, more importantly, in a copy of the data from a system memory buffer to an application memory buffer.
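By way of illustration, the addressing information such a packet carries can be pictured with the following C sketch, modeled on the RDMA Extended Transport Header defined by the IBA; the struct and field names are chosen here for readability and are not taken verbatim from the specification.

    #include <stdint.h>

    /* Simplified, illustrative view of the RDMA addressing fields carried in an
     * IBA RDMA Write or RDMA Read Request packet (modeled on the RDMA Extended
     * Transport Header). */
    struct rdma_ext_hdr {
        uint64_t virtual_address; /* target buffer address in the remote node's memory */
        uint32_t r_key;           /* remote key that authorizes access to the memory region */
        uint32_t dma_length;      /* number of payload bytes to transfer */
    };

The destination or responding channel adapter uses the virtual address and remote key to locate and validate the target buffer, which is what allows the transfer to proceed without operating system involvement.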
An IB Channel Adapter (CA) is the component in an IB node that generates and consumes IB packets, such as RDMA packets. A Channel Adapter connects a bus within the IB node that is capable of accessing the IB node's memory, such as a PCI bus, processor bus or memory bus, with the IB network. In the case of an IB I/O node, the CA also connects I/O devices such as disk drives or network interface devices, or the I/O controllers attached to those devices, with the IB network. A CA on an IB I/O node is commonly referred to as a Target Channel Adapter (TCA), and a CA on an IB processor node as a Host Channel Adapter (HCA).
A common example of an IB I/O node is a RAID (Redundant Array of Inexpensive Disks) controller or an Ethernet controller. An IB I/O node such as this typically includes a local processor and local memory coupled together with a TCA, and I/O controllers connected to I/O devices. The conventional method of satisfying an RDMA operation in such an IB I/O node is to buffer the data in the local memory when transferring data between the I/O controllers and the IB network.
For example, in performing a disk read operation, the local processor on the IB I/O node would program the I/O controller to fetch data from the disk drive. The I/O controller would transfer the data from the disk into the local memory. Then the processor would program the TCA to transfer the data from the local memory to the IB network.
For a disk write, the TCA would receive the data from the IB network and transfer it into the local memory. The processor would then program the I/O controller to transfer the data from the local memory to the disk drive. This conventional approach is referred to as “double-buffering” the data, since there is one transfer across the local bus into memory and another transfer across the local bus out of memory.
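The conventional double-buffered read path described above can be summarized with the following C sketch; the helper functions are hypothetical stand-ins for programming the I/O controller and the TCA, not an actual driver interface.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical stand-ins for the hardware programming steps described above. */
    static void io_controller_read_disk(void *local_buf, size_t len)
    {
        memset(local_buf, 0xAB, len);                       /* pretend the disk data arrived */
        printf("disk -> local memory: %zu bytes\n", len);   /* first pass over the local bus */
    }

    static void tca_send_rdma_write(const void *buf, size_t len, uint64_t remote_va)
    {
        (void)buf;
        printf("local memory -> IB link (RDMA Write to VA 0x%llx): %zu bytes\n",
               (unsigned long long)remote_va, len);          /* second pass over the local bus */
    }

    /* Conventional double-buffered disk read: the same payload crosses the local
     * bus twice, and the local processor is interrupted in between to set up the
     * second transfer. */
    int main(void)
    {
        static unsigned char bounce_buf[4096];
        io_controller_read_disk(bounce_buf, sizeof bounce_buf);
        /* interrupt + second setup by the local processor happens here */
        tca_send_rdma_write(bounce_buf, sizeof bounce_buf, 0x1000ULL);
        return 0;
    }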
The double-buffering solution has at least two drawbacks. First, the data transfers into and out of memory consume twice as much local memory bandwidth and local bus bandwidth as a direct transfer from the I/O controller to the TCA. This may prove detrimental to achieving the high-speed data transfers promised by the IBA.
To illustrate, assume the local bus is a 64-bit wide 66 MHz PCI bus capable of sustaining a maximum theoretical bandwidth of 4 Gbps. With the double buffering solution, the effective bandwidth of the PCI bus is cut in half to 2 Gbps. Assuming a realistic efficiency on the bus of 80%, the effective bandwidth is now 1.6 Gbps. This is already less than the slowest transfer rate specified by IB, which is 2.5 Gbps.
To illustrate again, assume the local memory controller is a 64-bit wide, 100 MHz SDRAM controller capable of sustaining a maximum theoretical bandwidth of 6 Gbps. Again, assuming the conventional double buffering solution and an 80% efficiency yields an effective bandwidth of 2.4 Gbps. Clearly, this leaves no room in such an I/O node architecture for expansion to the higher IB transfer speeds.
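The arithmetic in the two preceding paragraphs can be reproduced with a few lines of C, using the rounded theoretical maxima quoted above (4 Gbps for the PCI bus, 6 Gbps for the SDRAM controller) and the assumed 80% efficiency.

    #include <stdio.h>

    /* Double buffering moves every byte across the local bus twice, halving the
     * usable bandwidth; an assumed 80% bus efficiency reduces it further. */
    int main(void)
    {
        const double pci_peak_gbps   = 4.0;   /* 64-bit, 66 MHz PCI (rounded)    */
        const double sdram_peak_gbps = 6.0;   /* 64-bit, 100 MHz SDRAM (rounded) */
        const double efficiency      = 0.80;

        printf("PCI effective:   %.1f Gbps\n", pci_peak_gbps   / 2.0 * efficiency); /* 1.6 */
        printf("SDRAM effective: %.1f Gbps\n", sdram_peak_gbps / 2.0 * efficiency); /* 2.4 */
        /* Both figures sit at or below the 2.5 Gbps minimum IB link rate. */
        return 0;
    }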
The second drawback of the double buffering solution is latency. The total time to perform an I/O operation is the sum of the actual data transfer time and the latency period. The latency is the time involved in setting up the data transfer. No data is being transferred during the latency period. The double buffering solution requires more time for the local processor to set up the data transfer. The local processor not only sets up the initial transfer into local memory, but also sets up the transfer out of memory in response to an interrupt signifying completion of the transfer into local memory.
As data transfer rates increase, the data transfer component of the overall I/O operation time decreases. Consequently, the local processor execution latency time becomes a proportionately larger component of the overall I/O operation time, since the processor latency does not typically decrease proportionately to the data transfer time. The negative impact of latency is particularly detrimental for I/O devices with relatively small units of data transfer such as network interface devices transferring IP packets. Thus, the need for reducing or eliminating latency is evident.
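As a rough illustration of this effect, assume a 1500-byte IP packet and a hypothetical fixed setup latency of 10 microseconds per transfer (a number chosen for illustration, not taken from the text); the following C sketch shows how the latency share of an I/O operation grows with link rate.

    #include <stdio.h>

    /* Fixed setup latency dominates as link rates rise: the transfer time shrinks
     * while the processor setup time stays roughly constant. */
    int main(void)
    {
        const double bytes        = 1500.0;               /* one IP packet */
        const double setup_us     = 10.0;                 /* assumed setup latency */
        const double rates_gbps[] = { 2.5, 10.0, 30.0 };  /* IB link rates */

        for (int i = 0; i < 3; i++) {
            double xfer_us = bytes * 8.0 / (rates_gbps[i] * 1e3); /* Gbps -> bits per us */
            double share   = setup_us / (setup_us + xfer_us);
            printf("%4.1f Gbps: transfer %.2f us, latency is %.0f%% of the operation\n",
                   rates_gbps[i], xfer_us, share * 100.0);
        }
        return 0;
    }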
Therefore, what is needed is an IB CA capable of transferring data directly between a local bus, such as a PCI bus, and an IB link without double buffering the data in local memory.
SUMMARY
To address the above-detailed deficiencies, it is an object of the present invention to provide an Infiniband channel adapter that transfers data directly between a local bus and an Infiniband link without double buffering the data in system memory. Accordingly, in attainment of the aforementioned object, it is a feature of the present invention to provide an Infiniband channel adapter that includes a local bus interface for coupling the channel adapter to an I/O controller by a local bus. The local bus interface receives data from the I/O controller if a local bus address of the data is within a predetermined address range of the local bus address space. The channel adapter also includes a bus router, in communication with the local bus interface, that creates an Infiniband RDMA Write packet including the data in response to the local bus interface receiving the data from the I/O controller. The channel adapter then transmits the created packet to a remote Infiniband node that previously requested the data.
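A minimal C sketch of the data path this summary describes might look as follows; the types, names, and address-window check are illustrative assumptions, not the claimed implementation.

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative sketch: the local bus interface claims writes that fall inside
     * a predetermined address window, and the bus router wraps the received data
     * in an RDMA Write packet addressed to the remote IB node that requested it,
     * without staging the data in local memory. */

    struct addr_window { uint64_t base; uint64_t size; };

    struct rdma_write_pkt {
        uint64_t    remote_va;  /* virtual address supplied by the requesting node */
        uint32_t    r_key;      /* remote memory key supplied by the requesting node */
        uint32_t    length;
        const void *payload;    /* data received from the I/O controller */
    };

    /* Local bus interface: accept the burst only if it targets the window. */
    static bool claim_local_bus_write(const struct addr_window *w, uint64_t bus_addr)
    {
        return bus_addr >= w->base && bus_addr < w->base + w->size;
    }

    /* Bus router: build the outbound packet around the incoming data. */
    static struct rdma_write_pkt make_rdma_write(const void *data, uint32_t len,
                                                 uint64_t remote_va, uint32_t r_key)
    {
        struct rdma_write_pkt p = { remote_va, r_key, len, data };
        return p;
    }

    int main(void)
    {
        static const unsigned char data[512] = { 0 };
        struct addr_window win = { 0xF0000000ULL, 0x100000ULL };

        if (claim_local_bus_write(&win, 0xF0000000ULL)) {
            struct rdma_write_pkt pkt =
                make_rdma_write(data, sizeof data, 0x7f0000001000ULL, 0x1234);
            (void)pkt; /* would be handed to the IB transmit logic here */
        }
        return 0;
    }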
An advantage of the present invention is that it avoids the reduction in usable bandwidth of the local bus and of the system memory: rather than double-buffering the data, it transfers the data directly from the I/O controller to the channel adapter for transmission on the Infiniband wire. Another
Inventors: Pettey, Christopher; Rubin, Lawrence H.
Assignee: Banderacom, Inc.
Attorneys/Agents: Davis, E. Alan; Huffman, James W.
Primary Examiner: Huynh, Kim