Electrical computers and digital processing systems: multicomput – Computer-to-computer data routing
Reexamination Certificate
2001-08-01
2004-11-30
Lane, Jack (Department: 2188)
C709S232000, C709S213000, C709S216000
active
06826622
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to methods of communicating data between computers in a computer system having a plurality of computers or data processing devices connected through a communication network. More particularly, the invention relates to a method of sending and receiving data between the memories of computers on a network in which the hardware is capable of transferring data between the memories of these computers.
2. Description of the Related Art
The TCP/IP protocol is used in the overwhelming majority of communications between computers, in particular for communication over the Internet or intranets. Since TCP/IP processing is executed by the operating system rather than by the application, an application that communicates using TCP/IP does so through an API (Application Programming Interface: the set of functions which an application calls in order to use a certain function of a computer or an operating system) called the “Sockets API” (refer to the book by W. Richard Stevens, “UNIX Network Programming”, Prentice Hall, U.S.A., 1990, ISBN 0-13-949876-1).
An example of the software structure of a host which performs communication using the TCP/IP protocol is shown in FIG. 1. The host 10 performs communication using the network 18. The kernel 120 of the operating system of the host 10 executes protocol processing 121 of TCP/IP and controls the communication hardware 11 in order to perform communication. The program 101 of the application 100 uses the Sockets API 90 to call the library 110. The library executes the system call 111 and calls the kernel 120. The kernel 120 sends and receives data 102 of the application 100 through the socket buffer 122.
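The call path described above can be sketched with Python's standard socket module, which wraps the Sockets API. This is only an illustration: the function name round_trip is hypothetical, and socket.socketpair() is assumed as a stand-in for two hosts so that the sketch runs without a network (on most systems it creates local sockets rather than a TCP/IP connection, although the send and receive calls are the same).

```python
import socket


def round_trip(payload: bytes) -> bytes:
    """Send payload through one socket and read it back from its peer."""
    # Assumption: a connected local pair stands in for two hosts on network 18.
    sender, receiver = socket.socketpair()
    try:
        # sendall() enters the kernel through a system call; the kernel copies
        # the application's bytes into its own socket buffer.
        sender.sendall(payload)

        # Each recv() is another system call; the kernel copies bytes out of
        # its socket buffer into a new object owned by the application.
        received = b""
        while len(received) < len(payload):
            chunk = receiver.recv(4096)
            if not chunk:
                break
            received += chunk
        return received
    finally:
        sender.close()
        receiver.close()
```

Every call to sendall() or recv() in this sketch crosses into the kernel and copies the data once, which is the overhead the following paragraph discusses.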
Since protocol processing 121 in TCP/IP communication involves a large amount of processing, and the system call 111 and the copy between the data 102 and the socket buffer 122 add overhead, these operations limit the communication performance in some cases. For this reason, computer systems requiring high communication performance, such as supercomputers or workstation clusters, employ networks which can transfer data between applications while bypassing the kernel, without performing protocol processing, system calls or data copies. In the present specification, this communication method will hereafter be referred to as “high-speed communication” for short. An example of high-speed communication is the VIA (refer to the specification by Compaq Computer Corp., Intel Corp., Microsoft Corp., “Virtual Interface Architecture Specification, Draft Revision 1.0”, Dec. 4, 1997, http://www.Viarch.org). Since the functionality of high-speed communication is different from that of TCP/IP, their respective APIs are also different.
An example of the software structure of a host employing high-speed communication is shown in FIG. 2. The program 104 of the application 103 calls the high-speed communication library 130 by using the high-speed communication API 91 to send and receive data 105. By executing the communication processing 131 of the high-speed communication library 130, the high-speed communication hardware 12 is activated, bypassing the kernel 120, to send and receive the data 105 through the high-speed communication network 19. When sending and receiving data by high-speed communication, two operations are required: checking whether or not the application 103 has permission to access the data 105 which it wants to send or receive, and converting the virtual addresses specified by the application 103 into the physical addresses used by the high-speed communication hardware 12. For this reason the application 103, before sending and receiving data, calls the high-speed communication library 130 to register the data 105 to be sent and received (the registered data is shown in the form of a rectangle having rounded corners). The kernel performs the registration processing 123 in response to the call 132 of the high-speed communication library. In this processing, the kernel verifies whether the application 103 has access permission and, if it does, performs the address conversion and registers the result in the memory registration table 13. The high-speed communication hardware 12 performs both the verification of the access permissions and the address conversion by using this memory registration table 13.
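The role of the memory registration table 13 can be illustrated with a small model. The sketch below is not the patented implementation: the class and field names are hypothetical, each registered region is assumed to map to one contiguous physical range, and the kernel's actual permission check is reduced to recording which application owns the region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    owner: str        # application that registered the region
    virt_start: int   # first virtual address of the region
    length: int       # region length in bytes
    phys_start: int   # physical address the region starts at (assumed contiguous)


class MemoryRegistrationTable:
    """Toy model of table 13: written by the kernel, read by the hardware."""

    def __init__(self):
        self._entries = []

    def register(self, owner, virt_start, length, phys_start):
        # Kernel side (registration processing 123). A real kernel would
        # verify access rights before recording the entry.
        entry = Registration(owner, virt_start, length, phys_start)
        self._entries.append(entry)
        return entry

    def translate(self, owner, virt_addr, length):
        # Hardware side: permission check and address conversion in one lookup.
        for e in self._entries:
            inside = (e.virt_start <= virt_addr
                      and virt_addr + length <= e.virt_start + e.length)
            if e.owner == owner and inside:
                return e.phys_start + (virt_addr - e.virt_start)
        raise PermissionError("region not registered for this application")
```

A transfer that names an unregistered range, or a range registered by another application, fails the lookup, which is how the hardware can enforce access permissions without calling the kernel on every send or receive.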
Since the high-speed communication API 91 is different from the Sockets API 90, an application 100 employing the Sockets API 90 must be rewritten to use the high-speed communication API 91 before it can use high-speed communication. Since this rewriting is difficult, many applications will remain unchanged, still using the Sockets API, and thus will not be able to take advantage of the high performance of high-speed communication. In order to solve this problem, a communication method called “Fast Sockets”, shown in FIG. 3, is employed. The Fast Sockets library 140 receives the call made from the application 100 through the Sockets API 90 and executes the emulation processing 141 to communicate using high-speed communication. It is therefore possible to take advantage of the high performance of high-speed communication while keeping application compatibility. Examples of Fast Sockets include the method disclosed in JP-A-11-328134, the method by the University of California, Berkeley (refer to the paper by S. H. Rodrigues, T. E. Anderson, D. E. Culler, “High-Performance Local Area Communication With Fast Sockets”, Proceedings of the USENIX'97, 1997, pp. 257 to 274), the method by Shah et al. (refer to the paper by H. V. Shah, C. Pu, R. S. Madukkarumukumana, “High Performance Sockets and RPC over Virtual Interface (VI) Architecture”, Proceedings of CANPC'99, 1999), and Winsock Direct made by Microsoft Corp. (refer to the article “Winsock Direct Specifications” in the Microsoft Windows Driver Development Kit (DDK)).
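The idea of the emulation processing 141 can be sketched as a thin adapter that offers socket-style send and recv calls on top of a message-oriented interface. Both classes below are hypothetical: HighSpeedChannel is an in-memory stand-in assumed for the high-speed communication API 91, not any real library, and FastSocket shows only the shape of the mapping rather than the method of any of the cited systems.

```python
from collections import deque


class HighSpeedChannel:
    """Hypothetical stand-in for the high-speed communication API 91."""

    def __init__(self):
        self._messages = deque()

    def post_send(self, message: bytes) -> None:
        self._messages.append(bytes(message))

    def poll_receive(self) -> bytes:
        # Returns one whole message, or b"" when nothing has arrived.
        return self._messages.popleft() if self._messages else b""


class FastSocket:
    """Offers socket-style send/recv on top of the message channel."""

    def __init__(self, channel: HighSpeedChannel):
        self._channel = channel
        self._pending = b""   # bytes of a message not yet handed to the caller

    def send(self, data: bytes) -> int:
        self._channel.post_send(data)
        return len(data)

    def recv(self, bufsize: int) -> bytes:
        # A socket delivers a byte stream, so a message larger than bufsize
        # is handed out in pieces. Unlike a real blocking recv(), this sketch
        # returns b"" when no data is available.
        if not self._pending:
            self._pending = self._channel.poll_receive()
        out = self._pending[:bufsize]
        self._pending = self._pending[bufsize:]
        return out
```

Because the application sees only send and recv, it does not need to be rewritten; the difference between the two APIs is absorbed inside the adapter.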
When data 102 of the application 100 is registered (800) to perform communication, a processing overhead (132, 123) of the buffer registration 800 occurs. When the data is long, this overhead (132, 123) is shorter than the communication time, so high communication performance is obtained. On the other hand, when the data is short, this overhead is longer than the communication time, so the communication performance is reduced. In order to solve this problem, the Fast Sockets library 140, on its initialization, allocates a pre-allocated buffer 142 and registers it (801). When communicating short data 102, this data is not registered, but is copied to the pre-allocated buffer 142 to perform the communication. In this case the copy adds overhead, but since the data is short this overhead is small compared to the registration processing, so high performance can be obtained. While the pre-allocated buffer 142 is usually separated into buffers for sending and buffers for receiving data, these buffers are collectively shown in the form of one buffer 142 in FIG. 3 and the following figures of the software structure.
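The choice between copying short data and registering long data can be sketched as follows. The class name Sender, the 4096-byte cut-over value and the returned labels are assumptions made for illustration; the text above does not specify a threshold, and no actual transfer is performed here.

```python
COPY_THRESHOLD = 4096  # assumed cut-over length in bytes, not from the source


class Sender:
    def __init__(self, prealloc_size: int = COPY_THRESHOLD):
        # Allocated and registered once, at initialization (801).
        self.prealloc = bytearray(prealloc_size)
        # Counts per-message registrations (800), each of which costs
        # the overhead (132, 123).
        self.registrations = 0

    def send(self, data: bytes) -> str:
        if len(data) <= len(self.prealloc):
            # Short data: pay for a copy into the already registered buffer
            # and skip the registration.
            self.prealloc[:len(data)] = data
            return "copy"
        # Long data: register the application's own buffer so it can be
        # sent in place, without a copy.
        self.registrations += 1
        return "register"
```

In this sketch a 100-byte message takes the copy path and leaves the registration count at zero, while a 10,000-byte message takes the registration path.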
TCP/IP communication and Fast Sockets have been described above. While applications generally use TCP/IP communication (and, as a result, the Sockets API), scientific computing applications use APIs such as MPI (Message Passing Interface: refer to the standard by the Message Passing Interface Forum, “MPI: A Message-Passing Interface Standard”, 1995). Since MPI is independent of the computer architecture, when implementing MPI over high-speed communication, the calls made to the MPI API are mapped onto the calls of the high-speed communication API 91. An example of a product implementing this mapping is MPI-Pro made by MPI Software Technology Inc. (refer to the paper by R. Dimitrov and A. Skjellum, “Efficient MPI for Virtual Interface (VI) Arch
A. Marquez, Esq. Juan Carlos
Fisher Esq. Stanley P.
Hitachi, Ltd.
Lane Jack
Reed Smith LLP