System and method of adaptive message pipelining

Electrical computers and digital data processing systems: input/ – Input/output data processing – Input/output data buffering

Reexamination Certificate

Details

C710S030000, C710S120000, C709S250000

active

06308228

ABSTRACT:

TECHNICAL FIELD
The present invention relates to a technique for reducing network latency of messages while delivering high throughput.
BACKGROUND AND RELATED ART
Low latency of messages in a network is typically linked to high end-to-end performance for applications. However, certain computer applications, such as network memory or file access, require both low latency for large messages and high bandwidth or throughput under load in order to perform optimally. Reconciling these conflicting demands requires careful attention to data movement across data buses, network interfaces, and network links.
One method of achieving high data throughput is to send larger data packets and thus reduce per-packet overheads. On the other hand, a key technique for achieving low latency is to fragment data packets or messages and pipeline the fragments through the network, overlapping transfers on the network links and I/O buses. Since it is not possible to do both at once, messaging systems must select which strategy to use.
It is therefore desirable to automatically adapt a fragmentation policy along the continuum between low latency and high bandwidth based on the characteristics of system hardware, application behavior, and network traffic.
A number of prior art systems have used fragmentation/reassembly to reduce network latency on networks whose architecture utilizes a fixed delimited transmission unit such as ATM (asynchronous transfer mode). Fixed fragmentation is a common latency reduction scheme used in the following systems: APIC, Fast Messages (FM), Active Messages (AM), and Basic Interface for Parallelism (BIP). PM (to be discussed) appears to utilize a form of variable fragmentation only on packet transmission. Variable and hierarchical fragmentation were theoretically explored in Wang et al. In contrast, a technique termed cut-through delivery was developed in the Trapeze Myrinet messaging system. Cut-through delivery is a non-static variable fragmentation scheme, meaning that fragment sizes can vary at each stage in a pipeline. The following prior-art discussion describes messaging software developed for a Myrinet gigabit network unless otherwise noted.
The design of the APIC (ATM Port Interconnect Controller) network interface card (NIC) specifies implementation of full AAL-5 segmentation and reassembly (SAR) on-chip. The APIC NIC uses fixed-size fragmentation at the cell granularity (48 bytes of data), so it does not store and forward entire frames. Moreover, APIC does not adapt to host/network architectures or to changing conditions on the host or network. (See generally, Dittia et al., The APIC Approach to High Performance Network Interface Design: Protected DMA and Other Techniques, Proceedings of INFOCOM '97, April 1997)
Fast Messages (FM) utilizes fixed-size fragmentation when moving data between the host, NIC, and network link in order to lower the latency of large messages. Though FM uses a streaming interface that allows a programmer to manually pipeline transfers in variably sized fragments to and from host API buffers, API data moves to and from the network interface card in fixed-size fragments. Thus, it is the programmer's task to pipeline packets by making multiple calls to the application programming interface (API). FM lacks the ability to adapt automatically and transparently to changing host and network characteristics. (See generally, Lauria et al., Efficient Layering for High Speed Communication: Fast Messages 2.x, IEEE, July 1998)
Active Messages (AM) uses a fixed-size fragmentation scheme to reduce latency of medium to large packets. Active Messages, however, is non-adaptive and uses store-and-forward for non-bulk packets as a means of increasing throughput.
Basic Interface for Parallelism (BIP) performs static fixed-size fragmentation on the adapter. BIP, however, adjusts the fragment size depending on the size of the entire packet. When a packet is sent, the fragment size is determined by a table look-up indexed by the packet's length. BIP, while statically adaptive to packet size, does not adjust dynamically to changing host and network characteristics. (See generally, Prylli et al., Modeling of a High Speed Network Throughput Performance: The Experience of BIP over Myrinet, September 1997)
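The table look-up described for BIP can be illustrated with a minimal Python sketch. The table contents, the byte thresholds, and the names FRAGMENT_TABLE, fragment_size, and split_packet are hypothetical and chosen only for illustration; they are not taken from BIP. The point of the sketch is that the fragment size depends only on the packet length, never on runtime conditions.

```python
import bisect

# Hypothetical table: (largest packet length in bytes, fragment size in bytes).
# Illustrative values only; this is not BIP's actual table.
FRAGMENT_TABLE = [
    (1024, 1024),
    (8192, 2048),
    (65536, 4096),
]
_LIMITS = [limit for limit, _ in FRAGMENT_TABLE]


def fragment_size(packet_len: int) -> int:
    """Pick the fragment size once, from the packet length alone."""
    if packet_len <= 0:
        raise ValueError("packet_len must be positive")
    # First table row whose length limit is >= packet_len.
    idx = bisect.bisect_left(_LIMITS, packet_len)
    # Packets longer than the last limit reuse the last row.
    idx = min(idx, len(FRAGMENT_TABLE) - 1)
    return FRAGMENT_TABLE[idx][1]


def split_packet(packet_len: int) -> list[int]:
    """Split a packet into fixed-size fragments plus an optional short tail."""
    size = fragment_size(packet_len)
    full, tail = divmod(packet_len, size)
    return [size] * full + ([tail] if tail else [])
```

Because the look-up happens once per packet, every fragment of a given packet has the same size (apart from the tail) regardless of host or network load, which is the static behavior the text contrasts with dynamic adaptation.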
The Real World Computing Partnership has developed a messaging package, also for Myrinet, called PM, which implements fragmentation on the adapter for sending in a technique they term immediate sending. Double buffering is used for receiving. It is unclear from their current documents exactly what form of fragmentation constitutes immediate sending, but it appears to be a form of variable fragmentation. Moreover, their technique is limited, since PM claims it is not possible to perform immediate sending on the reception of a packet. (See generally, Tezuka et al., PM: An Operating System Coordinated High Performance Communication Library, Real World Computing Partnership)
In a theoretical approach, Wang et al. examine variable-sized and hierarchical fragmentation pipelining strategies. Hierarchical fragmentation is one scheme in which a fragmentation schedule may change in different pipeline stages; it is not a static pipelining method. The theory rests on different assumptions than the present invention, adaptive message pipelining (AMP). Wang et al. assume that the g_i (fixed transfer overhead) and G_i (time per unit of data) values are fixed and previously known, so that both static and non-static pipeline schedules can be computed beforehand; the schedules therefore are not adaptable to changing conditions. Nor do Wang et al. consider throughput as a goal in any of their studied pipelining strategies. (See generally, Modeling and Optimizing Communication Pipelines, Proceedings of ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS, June 1998)
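One plausible reading of the cost model attributed to Wang et al. is that stage i needs g_i + G_i * x time to move a fragment of x units. The following Python sketch is an illustration under that assumption, not code from any cited work. It computes when the last fragment leaves the final stage for a static schedule, that is, the same fragment sizes at every stage. The function name pipeline_latency and the sample values are hypothetical.

```python
def pipeline_latency(fragments, g, G):
    """Time at which the last fragment leaves the last pipeline stage.

    fragments: fragment sizes, used unchanged at every stage (static schedule).
    g[i]: assumed fixed overhead per transfer at stage i.
    G[i]: assumed time per unit of data at stage i.
    """
    if len(g) != len(G):
        raise ValueError("g and G must have one entry per stage")
    # stage_free[i] is the time at which stage i finishes its current fragment.
    stage_free = [0.0] * len(g)
    for size in fragments:
        ready = 0.0  # all data is available to the first stage at time 0
        for i in range(len(g)):
            # A fragment starts once it has arrived and the stage is idle.
            start = max(ready, stage_free[i])
            ready = start + g[i] + G[i] * size
            stage_free[i] = ready
    return stage_free[-1] if stage_free else 0.0


# Three identical stages, 8-unit message (illustrative numbers only).
g, G = [2, 2, 2], [1, 1, 1]
store_and_forward = pipeline_latency([8], g, G)
four_fragments = pipeline_latency([2] * 4, g, G)
eight_fragments = pipeline_latency([1] * 8, g, G)
```

With these sample values the model gives 30 time units for a single 8-unit transfer, 24 for four 2-unit fragments, and 30 again for eight 1-unit fragments. The best fragment size therefore depends on the overheads, and a schedule computed beforehand cannot follow changes in them.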
Cut-through delivery, disclosed in a previously published paper, is a variable-sized fragmentation scheme in which the schedules are not static for any stage of a pipeline. Cut-through delivery alone, however, is unable to adjust to inter-packet pipelining and therefore cannot extract the maximum bandwidth from the underlying system hardware. (Yocum et al., Cut-Through Delivery in Trapeze: An Exercise in Low-Latency Messaging, Proc. of Sixth IEEE International Symposium on High Performance Distributed Computing, August 1997)
The approach of the present invention differs from the prior art in several ways. First, pipelining is implemented on the network interface card (NIC), transparently to the hosts and host network software. Second, selection of transfer sizes at each stage is automatic, dynamic, and adaptive to congestion conditions encountered within the pipeline. The schedules, therefore, are variable and non-static. Third, the user of the API does not need to know anything about the hardware or network characteristics or load in order to achieve both low latency and high bandwidth.
SUMMARY OF THE INVENTION
The present invention describes adaptive message pipelining (AMP), a scheme that reduces network latency of messages larger than minpulse, a pre-defined threshold amount of data, while delivering the maximum throughput of high-speed networks and I/O buses under load. Adaptive message pipelining for the present invention is supported in part by the previously discussed local policy called cut-through delivery, implemented within a network interface card (NIC). AMP, as opposed to ordinary cut-through delivery, automatically adjusts to congestion, yielding peak throughput for streams of messages without the need for a separate bulk data transfer mechanism.
The present invention's reference implementation was coded as a simple policy in firmware running on a Myrinet network adapter. This local policy, which will be referred to as AMP, combines low network latency with high throughput through careful pipelining of the movement of data between network links and host memory.
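The text available here does not spell out the exact transfer rule, so the following Python sketch rests on a stated assumption: a stage starts an outgoing transfer whenever its sink is idle and at least minpulse units are buffered (or the tail of the message has fully arrived), and each transfer moves everything buffered at that moment. The function name cut_through_schedule, the cost model g + G * x for the sink, and the sample numbers are hypothetical and are not taken from the reference implementation.

```python
def cut_through_schedule(arrivals, minpulse, g, G):
    """Return the (start_time, size) transfers issued by one pipeline stage.

    arrivals: (time, units) chunks reaching the stage's buffer, in time order.
    minpulse: assumed minimum buffered amount that triggers a transfer.
    g, G: assumed fixed overhead and per-unit time of the outgoing sink.
    """
    if minpulse < 1:
        raise ValueError("minpulse must be at least 1")
    transfers = []
    total = sum(units for _, units in arrivals)
    sent = 0
    buffered = 0
    sink_free = 0.0  # time at which the sink finishes its current transfer
    i = 0            # index of the next arrival not yet counted
    while sent < total:
        # Everything that arrived while the sink was busy is now buffered.
        while i < len(arrivals) and arrivals[i][0] <= sink_free:
            buffered += arrivals[i][1]
            i += 1
        start = sink_free
        # Below the threshold: wait for more data unless none remains.
        while buffered < minpulse and i < len(arrivals):
            start = arrivals[i][0]
            buffered += arrivals[i][1]
            i += 1
        # Move all buffered data in one transfer.
        transfers.append((start, buffered))
        sink_free = start + g + G * buffered
        sent += buffered
        buffered = 0
    return transfers


# One unit arrives per time step; illustrative numbers only.
arrivals = [(t, 1) for t in range(1, 9)]
fast_sink = [n for _, n in cut_through_schedule(arrivals, 2, 0.5, 0.25)]
slow_sink = [n for _, n in cut_through_schedule(arrivals, 2, 0.5, 1.5)]
```

With the fast sink the sketch issues four transfers of 2 units each, staying at the minpulse threshold. With the slow sink the same arrivals produce transfers of 2, 3, and 3 units, because data accumulates while the sink is busy. Under the stated assumption, this is how fragment sizes could grow under congestion without any explicit schedule.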
Adaptive message pipelining improves network performance by managing the data transfers invo
