Efficient non-contiguous I/O vector and strided data...

Electrical computers and digital processing systems: multicomput – Computer-to-computer protocol implementing – Computer-to-computer data framing

Reexamination Certificate

active

06389478

ABSTRACT:

CROSS-REFERENCE TO RELATED APPLICATIONS
Not applicable
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention disclosed broadly relates to the field of high speed computers, and more particularly relates to the transfer of noncontiguous data blocks during one-sided communications between two or more computational nodes in distributed parallel computing machines.
2. Description of the Related Art
The introduction of highly parallel distributed multiprocessor systems such as the IBM RISC System/6000 Scalable POWERparallel (SP) systems provides high reliability and availability. These systems in their simplest form can be viewed as a plurality of uniprocessor and multiprocessor computer systems coupled together to function as one coordinated system through a local area network (LAN).
Data transfer between nodes of highly parallel distributed multiprocessor systems is necessary to enable truly scalable and efficient computing. Data transfer between nodes is broadly divided into two groups, contiguous and noncontiguous data transfer. Contiguous data is data that is stored in adjacent locations in a computer memory. In contrast, noncontiguous data is data that is not stored in a collection of adjacent locations in a computer memory device. It is well known that the transfer of noncontiguous data requires more pipeline and supporting processor overhead than the transfer of contiguous data. The transfer of noncontiguous data blocks is also referred to as a transfer of I/O vectors.
Typically, there are two types of I/O vectors: (i) general I/O vectors, where each data block (or vector) can be a different length, and (ii) strided I/O vectors, where each data block (or vector) is a uniform length. Referring now to FIG. 1, shown is the general I/O vector transfer. Shown are four data blocks 100 in I/O vector 110. It is important to note that the starting addresses of the data blocks may not be symmetrically spaced as shown. Each of the four data blocks has a starting address a0, a1, a2, a3 and a length l0, l1, l2, l3. The transfer of an I/O vector 110 with four data blocks 100 from an origin task 106 to a target task 108 is represented.
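The general I/O vector described above can be illustrated with a minimal sketch. The names `IOVector` and `transfer_general`, and the choice to model each block as an (offset, length) pair into a flat byte buffer, are assumptions made for illustration only; they are not taken from the patent or from any IBM library. The equal-length check reflects the statement that each transferred vector has the same length as its corresponding vector on the target.

```python
from typing import List, Tuple

# Hypothetical representation (not from the patent text): a general I/O
# vector is a list of (start_offset, length) pairs into a flat byte buffer.
IOVector = List[Tuple[int, int]]


def transfer_general(src: bytes, src_vec: IOVector,
                     dst: bytearray, dst_vec: IOVector) -> None:
    """Copy each source block into the corresponding target block."""
    if len(src_vec) != len(dst_vec):
        raise ValueError("source and target must describe the same number of blocks")
    for (s_off, s_len), (d_off, d_len) in zip(src_vec, dst_vec):
        if s_len != d_len:
            raise ValueError("corresponding blocks must have equal length")
        if (s_off < 0 or d_off < 0
                or s_off + s_len > len(src) or d_off + d_len > len(dst)):
            raise ValueError("block lies outside its buffer")
        # Equal-length slice assignment leaves the target buffer size unchanged.
        dst[d_off:d_off + d_len] = src[s_off:s_off + s_len]
```

Note that the block offsets need not be evenly spaced and the block lengths may differ, which is what distinguishes this case from the strided case.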
Turning now to FIG. 2, there is shown a block diagram of a strided I/O vector transfer. Three data blocks 200 (or vectors) are shown. Notice that the length or block size 204 of each data block 200 is uniform. Moreover, the stride size 202, or the distance in bytes between the beginning of one block (or vector) and the beginning of the next block (or vector), is uniform. The transfer of an I/O vector 210 with data blocks 200 from a source or origin task 206 to a target task 208 with the same block size and stride size is represented. In the general vector transfer, a number, N, of vectors on the source are transferred to a corresponding number of vectors on the target, in this example 3, where the length 204 of each vector transferred is the same as the length of the corresponding vector on the target task 208. During a strided I/O vector transfer the following parameters are specified: the block size, the stride size, the number of vectors or blocks, and the starting addresses of the first block on the source and the target.
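As a rough illustration of how the strided parameters listed above determine the whole transfer, the following sketch expands them into explicit block positions. The function names `strided_blocks` and `transfer_strided` and the flat byte-buffer model are assumptions for illustration, not an API described in the patent.

```python
from typing import List, Tuple


def strided_blocks(start: int, block_size: int, stride: int,
                   count: int) -> List[Tuple[int, int]]:
    """Expand strided parameters into explicit (offset, length) pairs."""
    return [(start + i * stride, block_size) for i in range(count)]


def transfer_strided(src: bytes, src_start: int,
                     dst: bytearray, dst_start: int,
                     block_size: int, stride: int, count: int) -> None:
    """Copy `count` uniform blocks; source and target share block size and stride."""
    src_blocks = strided_blocks(src_start, block_size, stride, count)
    dst_blocks = strided_blocks(dst_start, block_size, stride, count)
    for (s_off, n), (d_off, _) in zip(src_blocks, dst_blocks):
        if s_off < 0 or d_off < 0 or s_off + n > len(src) or d_off + n > len(dst):
            raise ValueError("block lies outside its buffer")
        dst[d_off:d_off + n] = src[s_off:s_off + n]
```

Because five numbers describe every block, a strided transfer needs far less descriptive information than a general I/O vector with one address and one length per block.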
The teaching of a centralized multiprocessor system, such as the system disclosed in U.S. Pat. No. 5,640,534 issued on Jun. 18, 1997, assigned to Cray Research, with named inventors Douglas R. Beard et al., for a “Method and Apparatus for Chaining Vector Instructions,” does not address the problem of vector transfer on highly parallel distributed multiprocessor systems, such as the IBM SP. More specifically, the teachings of the centralized multiprocessor systems do not address the problem, on highly parallel distributed multiprocessor systems, of the transfer of vector data during one-sided communications between two or more computational nodes (where each node itself can comprise two or more processors). A one-sided communication is a communication in which the receiver is not expecting or waiting to receive the vector data. This data transfer is not efficient, and a need exists for optimized noncontiguous data transfer on distributed multiprocessor machines like the IBM SP. These systems allow users to write application programs that run on one or more processing nodes to transfer vector data in a one-sided communications style. These application programs make use of a library of APIs (Application Programming Interfaces). An API is a functional interface that allows an application program written in a high level programming language such as C/C++ or Fortran to use these specified data transfer functions of I/O vectors without understanding the underlying detail. Therefore a need exists for a method and a system to provide I/O vector data transfer during one-sided communications in a highly parallel distributed multiprocessor system.
If noncontiguous I/O vector data transfer capability is not available on a distributed multiprocessor machine, an application requiring noncontiguous I/O vector data transfer incurs one of two overheads: (i) pipelining and (ii) copying. To transfer noncontiguous data, the user of the application program must issue a series of API data transfers. However, the use of successive API data transfers results in LAN pipelining overhead. Alternatively, the application program can be designed to copy all the noncontiguous vector data into a contiguous data buffer before initiating a data transfer. This approach results in copy overheads. Those skilled in the art would know that for efficient noncontiguous data transfer both the pipeline costs and the copy costs must be avoided. An efficient trade-off is required between reducing the number of data packets that are transferred over the network and reducing the copy overhead. Accordingly, a need exists to overcome these problems by providing an efficient transfer of noncontiguous data during one-sided communications.
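The two workarounds described above can be contrasted with a small sketch. Both function names are hypothetical, and the network send itself is deliberately left out: the sketch only shows what an application would hand to a contiguous-only transfer call in each case, using (offset, length) pairs as an assumed block description.

```python
from typing import List, Tuple


def per_block_messages(src: bytes, vec: List[Tuple[int, int]]) -> List[memoryview]:
    """Workaround (i): one transfer per block.

    No data is copied (memoryview slices share the source buffer), but the
    application issues len(vec) separate transfers.
    """
    view = memoryview(src)
    return [view[off:off + n] for off, n in vec]


def packed_message(src: bytes, vec: List[Tuple[int, int]]) -> bytes:
    """Workaround (ii): copy every block into one contiguous buffer.

    A single transfer suffices, but every byte is copied once before sending
    (and would have to be unpacked again on the target).
    """
    view = memoryview(src)
    return b"".join(view[off:off + n] for off, n in vec)
```

With three blocks, the first approach yields three separate transfers and the second yields one transfer plus a copy of every payload byte, which is the trade-off the paragraph above describes.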
Still another problem with noncontiguous data transfer during one-sided communications in a highly parallel distributed multiprocessor system is that efficient packaging of noncontiguous data into fixed packet sizes must be addressed. The packaging of noncontiguous data reduces the number of data packets that must be sent across the network. Typically, minimum state information of the I/O vector data should be maintained during the node-to-node transfer over the LAN. A spillover state is created during the packing of data into packets when data that does not fit into a predefined packet size is placed into the spillover state. The creation and maintenance of a spillover state when packing data into packets is inefficient and should be avoided. Therefore a need exists for a method and apparatus to provide efficient noncontiguous data transfer in one-sided communications while maintaining minimum state information without producing a spillover state. The spillover state becomes especially difficult to handle if the packet with spillover data is to be re-transmitted.
Still another problem with noncontiguous data transfer during one-sided communications in a highly parallel distributed multiprocessor system is that a request to transfer data from a target node to a source node, in a get operation, must include a description of the source data layout to the target. The description of the source data layout is the list of addresses and lengths of data for each vector and the number of vectors in the transmission. This need to send a description of the layout of source data to a target process includes control information that needs to be sent to the target and back to the source. Accordingly, a need exists to tran
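To make the packing problem concrete, here is one possible way to fill fixed-size packets so that each packet is self-describing and no spillover state has to be remembered between packets. This is an illustrative assumption, not the mechanism claimed by the patent, which the text above does not describe. The names `packetize` and `apply_packet` are hypothetical, the sketch uses the same offsets on the source and the target for simplicity, and per-fragment header bytes are not counted against the packet size.

```python
from typing import List, Tuple

Fragment = Tuple[int, bytes]   # (target offset, payload bytes)
Packet = List[Fragment]


def packetize(src: bytes, vec: List[Tuple[int, int]],
              payload_size: int) -> List[Packet]:
    """Split blocks at packet boundaries; every fragment carries its own offset."""
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    packets: List[Packet] = []
    current: Packet = []
    room = payload_size
    for off, length in vec:
        while length > 0:
            take = min(length, room)
            current.append((off, src[off:off + take]))
            off += take
            length -= take
            room -= take
            if room == 0:          # packet is full: close it and start a new one
                packets.append(current)
                current, room = [], payload_size
    if current:
        packets.append(current)
    return packets


def apply_packet(dst: bytearray, packet: Packet) -> None:
    """Write one packet's fragments; needs no knowledge of any other packet."""
    for off, data in packet:
        dst[off:off + len(data)] = data
```

Under these assumptions a packet can arrive out of order or be retransmitted and still be applied on its own, because nothing about a partially sent block is held outside the packet.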
