Reexamination Certificate
1999-12-03
2003-12-02
Alam, Hosain T. (Department: 2155)
Electrical computers and digital processing systems: multicomput
Computer network managing
Computer network monitoring
C709S224000, C709S227000
active
06658469
ABSTRACT:
TECHNICAL FIELD
This invention relates generally to networked communications and, more particularly, relates to network communications between computer applications using different network transport providers.
BACKGROUND OF THE INVENTION
Computer networking allows applications residing on separate computers or devices to communicate with each other by passing data across the network connecting the computers. Traditional network media, such as Ethernet and ATM, are not reliable for application-to-application communication and provide only machine-to-machine datagram delivery service. In order to provide reliable application-to-application communication, transport protocol software running on the host machine must provide the missing functionality.
Typically, the protocol software for network communication is implemented as a combination of a kernel-mode driver and a user-mode library. All application communication passes through these components. As a result, application communication consumes a significant amount of the host processor's resources and incurs additional latency. Both of these effects degrade application communication performance. This degradation significantly limits the overall performance of communication intensive applications, such as distributed databases.
Recently, a new class of communication interconnects called System Area Networks (SANs) has emerged to address the performance requirements of communication intensive distributed applications. SANs provide very high bandwidth communication, multi-gigabytes per second, with very low latency. SANs differ from existing media, such as Gigabit Ethernet and ATM, because they implement reliable transport functionality directly in hardware. Each SAN network interface controller (NIC) exposes individual transport endpoint contexts and demultiplexes incoming packets accordingly. Each endpoint is usually represented by a set of memory-based queues and registers that are shared by the host processor and the NIC. Many SAN NICs permit these endpoint resources to be mapped directly into the address space of a user-mode process. This allows application processes to post messaging requests directly to the hardware. This design consumes very little of the host processor's resources and adds little latency to communication. As a result, SANs can deliver extremely good communication performance to applications.
Most distributed applications are designed to communicate using a specific transport protocol and a specific application programming interface (API). A large number of existing distributed applications are designed to utilize the Transmission Control Protocol/Internet Protocol (TCP/IP) suite and some variant of the Berkeley Sockets API, such as Windows Sockets.
In general, each SAN implementation utilizes a custom transport protocol with unique addressing formats, semantics, and capabilities. Often, the unique capabilities of a SAN are only exposed through a new communication API as well. Since existing applications are usually designed to use one primary transport protocol and API (most often TCP/IP and Sockets), there have been relatively few applications that can take advantage of the performance offered by SANs. In order for existing applications to use a SAN, the TCP/IP protocol software must currently be run on top of it, eliminating the performance benefits of this medium.
In order to provide the performance benefit of SANs without requiring changes to application programs, a new component is inserted between the communication API used by the application, e.g. Windows Sockets, and a SAN transport provider. This new component (hereinafter network transport switch) emulates the behavior of the primary transport provider that the application was designed to utilize, e.g. TCP/IP, while actually utilizing a SAN transport provider to perform data transfer. In situations where the SAN transport provider is not suitable for carrying application communication, e.g. between sub-networks of an internetwork, the network transport switch continues to utilize the primary transport provider. A mechanism is provided within the switch for automatically determining whether to utilize the primary transport provider or the alternative transport provider.
One example of this approach is described in a paper titled “SCI for Local Area Networks” by Stein Jorgen Ryan and Haakon Bryhni, ISBN 82-7368-180-7 (hereinafter SCILAN). Another example is described in a paper titled “High Performance Local Area Communication with Fast Sockets”, by Steven H. Rodrigues, Thomas E. Anderson, and David E. Culler, in Proceedings of Usenix Annual Technical Conference, 1997 (hereinafter Fast Sockets).
The SCILAN architecture provides for utilization of an alternative transport provider for communication between applications residing on computer systems connected to an SCI network. A known IP address range is assigned to the SCI network. If an application uses an address in this range to identify another application with which it would like to communicate, then the alternative transport provider is used. If an address is specified from a different range of the IP address space, then the standard TCP/IP provider is used. Note that in this architecture, the TCP/IP provider must use a separate physical network from the SCI network.
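The address-range rule described for SCILAN can be illustrated with a minimal sketch. The range value, the function name select_provider, and the provider labels are assumptions made for illustration only; they do not come from the SCILAN paper or from this patent.

```python
import ipaddress

# Hypothetical IP range assigned to the SCI network (an assumption for illustration).
SAN_RANGE = ipaddress.ip_network("10.1.0.0/16")


def select_provider(dest_ip: str, san_range=SAN_RANGE) -> str:
    """Pick a transport provider from the destination address alone.

    Returns "alternative" when the destination falls inside the range
    assigned to the SCI network, and "tcpip" for any other address.
    """
    if ipaddress.ip_address(dest_ip) in san_range:
        return "alternative"
    return "tcpip"
```

Under these assumptions, select_provider("10.1.2.3") selects the alternative provider and select_provider("192.168.0.5") selects standard TCP/IP. The decision depends only on the address, which is why the two providers must sit on separate physical networks in this scheme.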
Fast Sockets also provides for utilization of an alternative transport provider for communication between applications residing on computer systems connected to a system area network. When an application tries to establish a connection, Fast Sockets applies a hash function to the destination TCP port address in order to obtain an alternative port address. Fast Sockets then tries to establish a connection to the alternative port address using TCP/IP. If this connection attempt succeeds, Fast Sockets uses the connection to negotiate a separate connection over the alternative transport provider. If the first connection attempt fails, Fast Sockets establishes a connection to the original port address supplied by the application using TCP/IP. When an application issues a request to listen for connections on a specific TCP port address, Fast Sockets applies the hash function to the address supplied by the application and then listens on both the requested port and the generated alternative port. This approach requires that two connection attempts be made regardless of whether TCP/IP is ultimately used to carry the application's data. This approach also overloads the TCP port address space and will fail if the alternative port address generated during a connection attempt is already in use by another application.
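The Fast Sockets connection logic described above might be modeled roughly as follows. This is a simulation under stated assumptions, not the Fast Sockets implementation: the hash in alternative_port is a made-up placeholder, ports are modeled as plain sets of integers rather than real sockets, and the names listen and fast_connect are hypothetical.

```python
def alternative_port(port: int) -> int:
    """Hypothetical hash mapping a TCP port to an alternative port in 1024..65535."""
    return 1024 + (port * 2654435761 % 2**32) % (65536 - 1024)


def listen(port: int, ports_in_use: set) -> set:
    """Listen on both the requested port and the generated alternative port.

    Raises OSError when the alternative port is already taken by another
    application, which is the port-space overloading problem noted above.
    """
    alt = alternative_port(port)
    if alt in ports_in_use:
        raise OSError(f"alternative port {alt} already in use")
    return {port, alt}


def fast_connect(port: int, listening_ports: set) -> tuple:
    """Return (provider, connection_attempts) for a connection request."""
    if alternative_port(port) in listening_ports:
        # Attempt 1: TCP/IP to the alternative port succeeds.
        # Attempt 2: a separate connection is negotiated over the alternative provider.
        return ("alternative", 2)
    if port in listening_ports:
        # Attempt 1 to the alternative port failed; attempt 2 uses the original port.
        return ("tcpip", 2)
    raise ConnectionRefusedError(f"nothing listening on port {port}")
```

In this model both outcomes report two attempts, which mirrors the cost the text attributes to the approach: the extra connection setup is paid whether or not TCP/IP ends up carrying the data.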
In order to emulate the data transfer behavior of the primary transport provider when utilizing an alternative transport provider, a network transport switch must implement a protocol that controls the transfer of data from source memory buffers supplied by a first application into destination memory buffers supplied by a second application. This aspect of data transfer is known as flow control.
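As a rough illustration of what flow control between source and destination buffers involves, the sketch below uses a simple credit scheme: the sender may transfer one message for each destination buffer the receiver has posted. This is one common technique chosen for illustration, not the protocol of this invention, and the Receiver and Sender classes and their methods are hypothetical.

```python
from collections import deque


class Receiver:
    """Holds destination buffers posted by the receiving application."""

    def __init__(self):
        self.posted = deque()   # destination buffers waiting for data
        self.received = []      # data delivered to the application so far

    def post_buffer(self, size: int) -> int:
        """Post one destination buffer and return the credit it grants."""
        self.posted.append(bytearray(size))
        return 1

    def deliver(self, data: bytes) -> None:
        """Copy incoming data into the oldest posted buffer."""
        buf = self.posted.popleft()
        if len(data) > len(buf):
            raise ValueError("message larger than destination buffer")
        buf[:len(data)] = data
        self.received.append(bytes(buf[:len(data)]))


class Sender:
    """Queues source buffers and transfers them only while credits remain."""

    def __init__(self, receiver: Receiver):
        self.receiver = receiver
        self.credits = 0
        self.pending = deque()  # source buffers not yet transferred

    def add_credits(self, n: int) -> None:
        self.credits += n
        self._drain()

    def send(self, data: bytes) -> None:
        self.pending.append(data)
        self._drain()

    def _drain(self) -> None:
        # Transfer queued data only when a destination buffer is known to exist.
        while self.credits > 0 and self.pending:
            self.credits -= 1
            self.receiver.deliver(self.pending.popleft())
```

In this sketch a call to send() with no credits simply queues the data; the transfer happens once the receiver posts a buffer and the resulting credit reaches the sender through add_credits().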
The TCP/IP protocol provides for data transfer in the form of an unstructured stream of bytes. It is the responsibility of the applications using the TCP/IP protocol to encode the data stream to mark the boundaries of messages, records, or other structures. The Berkeley Sockets and Windows Sockets communication APIs offer applications a great deal of flexibility for receiving data. Applications may request to receive data directly into a specified memory buffer, request to receive a copy of a prefix of the data directly into a specified buffer without removing the original data from the byte stream (peek), or request to be notified when data is available to be received and only then request to receive the data or peek at it. Since TCP/IP provides an unstructured byte stream, an application may request to receive data from the stream into a specified memory buffer in any size portion, e.g. a single byte or thousands of bytes. The flexibility of these communication APIs and the unstructured nature of the TCP/IP data stream make it di
Eydelman Vadim
Forin Alessandro
Massa Michael T.
Morre Timothy M.
Zuberi Khawar M.
Alam Hosain T.
Microsoft Corporation
Tran Philip B.