Multicasting using a wormhole routing switching element

Multiplex communications – Pathfinding or routing – Switching a message which includes an address header

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C370S428000, C370S432000

Reexamination Certificate

active

06542502

ABSTRACT:

FIELD OF THE INVENTION
The invention relates to the routing of digital electronic data through network switches.
BACKGROUND OF THE INVENTION
The Problem: Wormhole Routing Multicast Methods Deadlock
In wormhole routing, flow control is performed on units that are smaller than packets: flow-control digits, or “flits”.
FIG. 1
shows a packet broken down into flits: flit
1
, flit
2
, flit
3
and last flit.
The header (first flit) of the packet advances immediately through each switching element (switch) unless it is blocked because of contention for an output port, and succeeding flits of the packet advance in pipelined fashion behind the header. This immediate forwarding minimizes the latency per switch. When the packet header is blocked, all flits of the packet are buffered in place until the output port is free. Thus, a single blocked packet may be blocked in place across many switches.
One prior approach for multidestination packet based multicast on unidirectional MINs uses strict wormhole routing. There are two general approaches to replicating a wormhole packet at a switch: synchronous and asynchronous.
Under synchronous replication, a multidestination packet's flit is forwarded (and simultaneously replicated) only if all the required output ports at that stage are free. Note that the required output ports at a stage may belong to more than one switch. If one or more of the required ports at the stage are busy, the packet is blocked in place without replication; replication of the flit to the output ports is done only when all the required output ports become free. The required output ports that are free are reserved although no flit is transmitted to them until the other output ports become free. Thus the various copies of a packet's flit travel together from one stage to the next.
In contrast, under asynchronous replication, a multidestination packet's flit is forwarded to all the required output ports that are free when the flit arrives at a stage in the network. Thus, copies of a packet's flit may travel from one stage of the network to the next at different times. However, the flit cannot be discarded until all the required downstream output ports have received their respective copies of the flit.
If we consider a system adopting strict (pure) wormhole routing consisting of switches that have input buffers of size
1
flit, asynchronous replication does not prove very beneficial since the packet's next flit will be blocked until the input buffer at the switch becomes free (and the input buffer becomes free only when the required but busy output ports become free). So the only benefit that asynchronous replication offers us in such a system over synchronous replication, is that a single flit can be forwarded on the output ports that have been successfully reserved by the packet. If the input buffer is of size f flits, using strict wormhole routing and asynchronous replication, up to f flits may be transmitted to the output ports that the packet has reserved before the packet blocks because of the required but busy output ports. Prior work has shown that hardware tree-based synchronous replication in networks adopting strict wormhole routing leads to deadlock, but suggested solutions to this have been extremely restrictive and inappropriate for variations of wormhole routing that provide more intermediate buffering.
The essential reason that wormhole methods deadlock is that the progress made by each output port at a replicating switch is dependent upon the progress of every other output port participating in the replication. If one output port is blocked or is currently sending another packet, then the flits to be sent by that output port must remain in the input port buffer, blocking subsequent flits from entering the input port. Therefore, free output ports are blocked by busy output ports. Two multicasts can easily block each other. Multicast A could block multicast B in one switch, while simultaneously multicast B is blocking multicast A another switch.
If the entire packet could be buffered at the input port, it would be possible for unblocked output ports to receive and transmit all flits of the packet, and this would decouple the dependence between output ports for this packet. Virtual cut-through (VCT) flow-control provides this guarantee. VCT allows the same low-latency pipelining as wormhole routing, but for VCT a switch only accepts a new packet when that switch can guarantee buffer space for the entire packet.
SP
2
Review Buffered Wormhole Routing
The buffered wormhole routing used in IBM's SP
2
is a variation of wormhole routing wherein every switch in the network is equipped with a central buffer, as illustrated in FIG.
2
.
When packets are blocked at a switch due to a busy output port, the switch attempts to store the packet in this central buffer, thus freeing the links held by the trailing packet flits. There may be enough space in the central buffer to store the entire packet. However, there is no guarantee that a packet arriving at a switch will find enough space in the central buffer to be completely stored. If the central buffer does not have adequate space to store the entire blocked packet, as many as possible of the packet flits are stored in the central buffer and the remainder of the packet is blocked in place. Note that in the absence of contention, packets may propagate through the network just as in a purely wormhole routed network, and the central buffers will remain empty. Alternately, a switch could be configured to force each packet though the central buffer, even when a packet encounters no contention.
Because there is no assurance that the central buffer can store an entire multidestination packet, the central buffer as implemented in SP
2
cannot guarantee to prevent multicast deadlock. However, an SP
2
-like shared central buffer is an extremely attractive resource for packet replication. We will describe improvements to the basic central buffer free-space logic that are similar to virtual cut-through operation. Specifically, these improvements guarantee that any packet admitted to the central buffer can (eventually) be entirely stored. This guarantee effectively decouples the interdependence of the replicated output packets at a switch, eliminating the cause of multicast wormhole routing deadlock.
In the SP
2
buffered wormhole implementation of the invention, the central buffer is constructed so as to effectively form a separate FIFO queue of packets for each output port. Each input port can write flits into the buffer, and each output port can read flits. Central buffer space is dynamically allocated to requesting input ports in a fair manner.
A number of flits are buffered into a chunk before being written into the central buffer, and chunks are read from the central buffer before being disassembled into flits again at the reading output port. This reduces the number of central buffer RAM read and write ports required. As an example, in the 8-ported SP
2
routing elements, up to 1 flit is received or transmitted at each input port or output port every cycle. An SP
2
chunk is 8 flits, and thus the central buffer only requires 1 RAM write port and 1 RAM read port to match the input and output bandwidth of the switch. The central buffer logic maintains a list of free chunk locations. A central buffer write allocates a free chunk, and a read returns a free chunk.
There must be a mechanism—we will call it the next-packet list—to order the packets within each packet queue. Each packet is divided into chunks, and thus there is also a mechanism—the next-chunk list—to order the chunks within a packet. First we describe the next-packet lists.
To record the next-packet linking information, a pointer field is available for each chunk of data: the next-packet (NP[ ]) field. In addition, each output port o maintains first-packet (firstP[o]) and last-packet (lastP[o]) pointers into its packet queue. For this description, all pointers are assumed to be nil when invalid. In the fol

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Multicasting using a wormhole routing switching element does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Multicasting using a wormhole routing switching element, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Multicasting using a wormhole routing switching element will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3057302

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.