Performance of a PCI-X to InfiniBand bridge

Electrical computers and digital data processing systems: input/output – Intrasystem connection – Bus interface architecture

Reexamination Certificate

Details

Patent number: 06785760
Classification: C710S306000
Type: Reexamination Certificate
Status: active

ABSTRACT:

TECHNICAL FIELD
The present invention relates in general to methods for improving the performance of hardware units used to bridge between devices using one standard communication protocol and devices using a second communication protocol.
BACKGROUND INFORMATION
Peripheral Component Interconnect (PCI) is a peripheral bus commonly used in personal computers (PCs), Apple Macintoshes, and workstations. It was designed primarily by Intel and first appeared on PCs in late 1993. PCI provides a high-speed data path between the central processing unit (CPU) and peripheral devices (e.g., video, disk, networks, etc.). There are typically three or four PCI slots on the motherboard. In a Pentium PC, there is generally a mix of PCI and Industrial Standard Architecture (ISA) slots or PCI and Extended ISA (EISA) slots. Early on, the PCI bus was known as a “local bus.”
PCI runs at 33 MHz and supports 32- and 64-bit data paths as well as bus mastering. PCI Version 2.1 calls for 66 MHz, which doubles the throughput. The limit of three or four PCI slots per motherboard stems from a budget of 10 electrical loads, which accounts for the bus's inductance and capacitance. The PCI chipset uses three loads, leaving seven for peripherals. Controllers built onto the motherboard use one load each, whereas controllers that plug into an expansion slot use 1.5 loads each. A “PCI bridge” may be used to connect two PCI buses together for more expansion slots.
PCI Extended (PCI-X) is an enhanced PCI bus from IBM, HP and Compaq that is backward compatible with existing PCI cards. It uses a 64-bit bus with a clock speed as high as 133 MHz, providing a large jump in speed from the original PCI bus's 132 MBytes/sec to as much as 1 GByte/sec.
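The slot-count and bandwidth figures quoted above follow from simple arithmetic. The short sketch below, which is not part of the patent text, reproduces them; it assumes one transfer per clock and ignores protocol overhead, and the 1.5-load figure is the per-slot cost given earlier.

/* Quick sanity check (not from the patent) of the figures quoted above:
 * the PCI electrical-load budget and the peak-bandwidth numbers. */
#include <stdio.h>

int main(void)
{
    /* Load budget: 10 loads total, 3 used by the PCI chipset, 1.5 per slot card. */
    double loads_left = 10.0 - 3.0;
    printf("Slots available if every load is a plug-in card: %d\n",
           (int)(loads_left / 1.5));                      /* 7 / 1.5 -> 4 slots */

    /* Peak throughput = bus width (bytes) x clock rate, one transfer per clock. */
    printf("PCI   32-bit @  33 MHz: %4.0f MB/s\n", (32 / 8.0) * 33.0);   /* 132  */
    printf("PCI-X 64-bit @ 133 MHz: %4.0f MB/s\n", (64 / 8.0) * 133.0);  /* 1064 */
    return 0;
}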
InfiniBand (IB) is an input/output architecture that is expected to replace the PCI-X bus in high-end servers. Supporting both copper wire and optical fibers and originally known as “System I/O,” IB is a combination of Intel's Next Generation I/O (NGIO) and Future I/O from IBM, HP and Compaq. Unlike PCI's bus technology, IB is a point-to-point switching architecture, providing a data path of 500 MBps to 6 GBps between each pair of nodes at distances orders of magnitude greater than PCI or PCI-X allow.
One of the interesting applications being explored is to “lengthen” and expand the PCI bus by adding a PCI to IB bridge to the Host system and connecting it to an expansion drawer via an IB link (as described in U.S. Pat. No. 6,003,105). In this model, the PCI to IB bridge monitors areas of the PC's PCI bus, translates the PCI commands to equivalent IB commands, and forwards the transactions to a remote expansion unit (drawer). An IB to PCI bridge in the expansion drawer receives the transactions over the IB link, converts them back to their equivalent PCI commands, and reissues them on the PCI bus in the expansion drawer. Similar results may be achieved by adding a standard IB Host channel adapter (HCA) to the Host system (instead of the PCI to IB bridge) and writing a device driver that uses IB APIs to send IB commands to the IB to PCI bridge located in the expansion drawer; the bridge translates these into equivalent PCI commands and issues them to the PCI device. Another solution may add or modify Host software to monitor calls to the operating system's PCI API and generate the equivalent IB commands, which are again sent to the IB to PCI bridge in the expansion drawer.
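As a rough illustration of the translate-and-forward model just described, the sketch below shows how a bridge's main loop might turn snooped PCI commands into IB messages. The types and helpers (pci_cmd_t, ib_msg_t, snoop_pci_bus, post_to_ib_link) are hypothetical placeholders, not the patent's design or any real IB API.

/* Sketch of the translate-and-forward loop described above, as it might run on
 * a PCI to IB bridge. All types and helper names are hypothetical placeholders. */
#include <stdint.h>

typedef struct { int is_write; uint64_t addr; uint32_t data; } pci_cmd_t;
typedef struct { int opcode; uint64_t remote_addr; uint32_t payload; } ib_msg_t;

enum { IB_RDMA_WRITE = 1, IB_RDMA_READ = 2 };

/* Hypothetical hooks into the bridge hardware. */
extern int  snoop_pci_bus(pci_cmd_t *out);        /* watch the Host's PCI bus */
extern void post_to_ib_link(const ib_msg_t *msg); /* forward to the drawer    */

void bridge_loop(void)
{
    pci_cmd_t cmd;
    while (snoop_pci_bus(&cmd)) {
        /* One-for-one translation: a PCI write becomes an RDMA write to the
         * drawer-side device register; a PCI read becomes an RDMA read.      */
        ib_msg_t msg = {
            .opcode      = cmd.is_write ? IB_RDMA_WRITE : IB_RDMA_READ,
            .remote_addr = cmd.addr,
            .payload     = cmd.data,
        };
        post_to_ib_link(&msg);
    }
}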
This model of translating and forwarding IB commands to the expansion drawer has drawbacks, since IB and PCI semantics do not exactly match and performance may suffer. Performing an IB command may also incur much more latency than its equivalent PCI command if the Host and I/O devices are located great distances apart. For example, storage adapters often use a model similar to the following sequence (a simplified driver-side sketch follows the list):
1. The device driver allocates a command block from its internal pool of blocks, initializes the block, and then does a single 32-bit PCI write of the block's physical address to an adapter register. The driver typically writes the address to the same device register each time.
2. The device hardware usually puts the data from the write operation into a queue and then interrupts the firmware on the adapter. The firmware pulls the address from the queue and then programs a direct memory access (DMA) logic engine to copy the command block referenced by the address from the Host's memory to the adapter's memory. The block of data is usually a fixed size.
3. The adapter analyzes the command and determines whether more data is required (such as for a disk write); if so, the adapter programs its DMA engine to move the rest of the data from the Host at the address(es) provided in the command block now held in the adapter's memory.
4. The adapter executes the command.
5. If there is result data for the Host (e.g., from a disk read), the adapter uses DMA to send the data back to the Host at the address(es) provided in the command block.
6. The adapter then interrupts the Host.
7. Finally, the Host's device driver reads the interrupt status register on the adapter, recognizes that the adapter has issued the interrupt, reads another hardware register to retrieve the first element in the status queue, and completes the original Host I/O request.
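A minimal driver-side sketch of steps 1 and 7 of this model appears below, assuming memory-mapped adapter registers. The structure layout, register offsets, and names (cmd_block_t, CMD_ADDR_REG, and so on) are illustrative assumptions, not taken from the patent or any particular adapter.

/* Driver-side sketch of steps 1 and 7 above. All names and offsets are
 * illustrative assumptions, not the patent's definitions. */
#include <stdint.h>

typedef struct {                 /* fixed-size command block in Host memory */
    uint32_t opcode;             /* e.g. read or write                      */
    uint32_t length;             /* transfer length in bytes                */
    uint64_t data_addr;          /* Host physical address of the data       */
    uint64_t status_addr;        /* where the adapter DMAs the result       */
} cmd_block_t;

/* Hypothetical register offsets in the adapter's MMIO window. */
enum { CMD_ADDR_REG = 0x00, INT_STATUS_REG = 0x04, STATUS_QUEUE_REG = 0x08 };

static volatile uint32_t *mmio;  /* mapped adapter register window          */

/* Step 1: single 32-bit PCI write of the command block's physical address. */
void issue_command(uint32_t cmd_block_phys)
{
    mmio[CMD_ADDR_REG / 4] = cmd_block_phys;
}

/* Step 7: interrupt handler reads the status register, then the status queue. */
void adapter_isr(void)
{
    uint32_t status = mmio[INT_STATUS_REG / 4];
    if (status) {
        uint32_t done = mmio[STATUS_QUEUE_REG / 4];  /* pops first queue entry */
        (void)done;  /* complete the original Host I/O request here */
    }
}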
If a direct translation is done from the preceding PCI commands to IB commands (either by software in the Host or by a PCI to IB bridge), the resulting sequence of IB commands may look like the following (a rough round-trip cost sketch follows the list):
1. (Driver writes command address) Host to expansion drawer: Remote direct memory access (RDMA) write of 32 bits to a fixed address (the device's command register). RDMA is an IB-specific command.
2. (PCI adapter starts fetching command block) Expansion drawer to Host: RDMA read of a fixed-size block of data from the Host address provided in step 1 of this sequence.
3. (PCI adapter starts fetching data if more data is required) Expansion drawer to Host: RDMA read of a variable-size block from the Host address(es) provided in the command block.
4. (PCI adapter starts sending data if there is result data for the Host) Expansion drawer to Host: RDMA write of a variable-size block to the Host address(es) provided in the command block.
5. (PCI adapter raises interrupt) Expansion drawer to Host: SEND a small packet that tells the Host system that a PCI interrupt has been raised. SEND is an IB-specific command; it is used because there is no direct equivalent to a PCI interrupt in the IB specification.
6. (Driver reads the interrupt status register) Host to expansion drawer: RDMA read of 32 bits from a fixed address (the device's interrupt status register).
7. (Driver reads the status queue) Host to expansion drawer: RDMA read of 32 bits from a fixed address (the device's status queue).
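To make the cost of this serialization concrete, the sketch below tallies the seven operations under the assumption that each one must finish before the next begins and costs a full link round trip on a reliable IB connection. The 10-microsecond round-trip figure is purely illustrative and not taken from the patent.

/* Rough latency model (not from the patent) for the seven-step translation
 * above; the round-trip figure is an arbitrary illustrative assumption.    */
#include <stdio.h>

int main(void)
{
    const double rtt_us = 10.0;          /* assumed Host<->drawer round trip */
    const char *step[] = {
        "1. RDMA write of command address",
        "2. RDMA read of command block",
        "3. RDMA read of outbound data",
        "4. RDMA write of result data",
        "5. SEND emulating the interrupt",
        "6. RDMA read of interrupt status",
        "7. RDMA read of status queue",
    };
    double total = 0.0;
    for (int i = 0; i < 7; i++) {
        total += rtt_us;
        printf("%-36s cumulative %5.1f us\n", step[i], total);
    }
    printf("Serialized total: %.1f us, versus a handful of local PCI cycles.\n",
           total);
    return 0;
}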
The problem with this translation is that the seven round trips required may be slow when run over one of the reliable IB protocols. There is, therefore, a need for a method to improve the communication performance between PCI protocol units and IB protocol units when bridge units are available and to improve performance when bridge units are not available.
SUMMARY OF THE INVENTION
Device drivers, using the Peripheral Component Interconnect (PCI) protocol and designed to communicate with PCI I/O devices over the PCI local bus, are incorporated with PCI to InfiniBand (IB) and IB to PCI bridge units. An expansion drawer incorporates an IB to PCI bridge unit to communicate with PCI I/O adapter units in a local bus configuration. The expansion drawer communicates with the PCI to IB bridge unit in the Host system over an IB link. In one embodiment of the present invention, hardware is added to the bridge units to monitor the PCI commands issued on the local bus of the Host system. The hardware learns the PCI command sequences for PCI I/O device transactions. These PCI command sequences are optimized for the PCI transactions and stored for subsequent use. When the device driver issues a request for a PCI transaction, the stored data is searched to determine if optimized sequences have been generated and stored. If optimized sequences are found, they are used to carry out the transaction over the IB link.
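The lookup step described above might resemble the following sketch, in which learned, optimized sequences are kept in a small table keyed by the PCI write that starts a transaction. The structure and names are assumptions made for illustration only; the patent excerpt does not specify how the stored sequences are organized.

/* Illustrative sketch of a table of learned, optimized command sequences,
 * keyed by the triggering PCI write. Structure and names are assumptions. */
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint64_t pci_addr;           /* device register the driver writes        */
    uint32_t pci_len;            /* size of the triggering PCI write         */
    const void *ib_sequence;     /* pre-built, optimized IB work requests    */
    size_t   ib_sequence_len;
} learned_seq_t;

static learned_seq_t table[64];
static size_t table_entries;

/* Return the stored optimized sequence for a PCI write, or NULL to fall back
 * to the default one-for-one translation. */
const learned_seq_t *find_optimized(uint64_t pci_addr, uint32_t pci_len)
{
    for (size_t i = 0; i < table_entries; i++)
        if (table[i].pci_addr == pci_addr && table[i].pci_len == pci_len)
            return &table[i];
    return NULL;
}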
