Multiplex communications – Pathfinding or routing – Switching a message which includes an address header
Reexamination Certificate
1998-06-24
2002-06-18
Chin, Wellington (Department: 2664)
Multiplex communications
Pathfinding or routing
Switching a message which includes an address header
C370S476000
Reexamination Certificate
active
06408002
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to the movement of data and control information between nodes in a massively parallel processing (MPP) system, wherein each node represents an independent, concurrently operating computer system. More specifically, the present invention relates to the detection, isolation and resolution of error conditions that may occur at the interconnection between the various nodes in the MPP system, so as to reduce the likelihood that the MPP system, or a portion thereof, will lock-up and be precluded from transmitting message packets between the various nodes.
BACKGROUND
MPP systems are, in general, large-scale computer systems that comprise numerous, often hundreds, of individual, concurrently computing entities. The computing entities communicate with one another through a network of corresponding nodes linked together by communication channels. The network is often referred to as a fabric. As one skilled in the art will recognize, the network or fabric can be configured in any one of a number of different topologies.
One rather typical topology, in accordance with prior art, is the rectangular mesh. An example of a 2×4 rectangular mesh
100
is illustrated in FIG.
1
. As shown in
FIG. 1
, the 2×4 rectangular mesh
100
is essentially a two-dimensional network of nodes
105
which are connected by communication channels
110
. Although it is not depicted in
FIG. 1
, each of the nodes
105
is connected to at least one computing entity. In addition, each of the nodes
105
may have as many as three neighboring nodes.
In accordance with the prior art topology of
FIG. 1
, data and/or control information is transported from one computing entity to another through the various nodes
105
and communication channels
110
, in accordance with a routing protocol. For example, the computing entity at node (
0
,
0
) may require data that is stored in the computing entity at node (
3
,
1
). In order to obtain that data, the computing entity at node (
0
,
0
) sends a message packet to the computing entity at node (
3
,
1
) requesting the desired data. The computing entity at node (
3
,
1
) responds by transmitting a message packet back to the computing entity at node (
0
,
0
) wherein the message packet contains the requested data. In this example, each message packet traverses three intermediate nodes in order to travel from its source node to its destination node.
Another well-known topology is the TORUS. An example of a two-dimensional 2×4 TORUS
200
is shown in FIG.
2
. Like the 2×4 rectangular mesh
100
, the various nodes
205
are interconnected by communication channels
210
, wherein each of the nodes
205
connect to at least one computing entity. However, in contrast with the 2×4 rectangular mesh
100
, the outside edges of the TORUS wrap around, as illustrated in FIG.
2
. For example, the left outside edge
215
of the node (
0
,
0
) wraps around to connect with the right outside edge
220
of the node (
3
,
0
), while the bottom outside edge
225
of the node (
0
,
0
) wraps around to connect with the top outside edge
230
of the node (
0
,
1
). Therefore, each node in the 2×4 TORUS
200
, in contrast with each node in the 2×4 rectangle mesh
100
, has four neighboring nodes.
The advantage of the TORUS topology over the rectangular mesh topology, as one skilled in the art will understand, is that when transmitting a message packet from a source node to a destination node, the message packet, on average, travels through fewer intermediate nodes, thereby reducing message packet latency. This, in turn, results in higher overall through-put in the fabric. For purposes of illustration, if the computing entity at node (
3
,
1
) of the 2×4 TORUS
200
transmits a message packet to the computing entity at node (
0
,
0
), the message packet need only traverse one intermediate node. It should be readily apparent that the difference between the average number of intermediate nodes traversed in a rectangular mesh topology versus a TORUS topology becomes more exaggerated as the number of nodes increases.
Although
FIG. 2
illustrates a two-dimensional TORUS topology, MPP systems are commonly configured as a three-dimensional TORUS. A three-dimensional mesh TORUS topology
300
is illustrated in FIG.
3
.
It should also be readily apparent, that in traversing a network, or fabric, from a source node to a destination node, a message packet may take any one of a number of different routes. However, each message packet has a header portion which includes, among other things, an address field. The address field contains information which governs a specific route for the message packet. For example, if the reference number associated with each of the nodes
205
in the 2×4 TORUS
200
in
FIG. 2
represents a Cartesian coordinate X and Y, a message packet traveling from the node (
0
,
0
) to the node (
3
,
1
) might be routed as follows: −
1
X to the node (
3
,
0
), then +1Y to the node (
3
,
1
). Alternatively, the message packet might be routed as follows: +2X to the node (
2
,
0
), then +1Y to the node (
2
,
1
), then +1X to the node (
3
,
1
).
To manage and control the flow of message packets within a network or fabric, and to avoid undesirable routing conditions such as “deadlock”, MPP systems employ routers or routing elements. Routing elements employed in conjunction with TORUS topologies can be referred to as TORUS routing elements or TROUTS. Generally, there is a routing element or TROUT associated with each node in the fabric, such that each message packet actually traverses the network or fabric from routing element to routing element until the message packet reaches its destination node. Once the message packet arrives at its destination node, the routing element at the destination node removes any overhead and/or control fields from the message packet and transfers the remaining portion of the message packet to the computing entity that corresponds with the destination node. Typically, the message packet is transferred through a computing entity interface device.
Routing elements employed in conjunction with MPP systems are generally well-known in the art. For example, U.S. Pat. No. 5,105,424 describes a system where message packets are routed along pathways from one computing entity to another, wherein each computing entity has a corresponding routing automation. Each routing automation has an input for receiving message packets and a plurality of outputs which are selectively chosen based on routing instructions embedded in the header of each message packet. Each routing automation also includes logic means for reading the routing instructions and for updating the routing information to reflect the passage of the message packet through each automation.
U.S. Pat. No. 4,933,933 describes a TORUS routing chip which employs two virtual channels between each routing element. The virtual channels are implemented by transmitting more than one message packet on the same physical connection using a time division, multiple access (TDMA) scheme.
The routing elements described in the above-identified and other publications basically provide message packet routing schemes. They do not, however, provide any notable error handling capabilities, despite the fact that error conditions are often fatal, thereby rendering an entire portion of the network or fabric, if not the entire MPP system, inoperative.
“Deadlock” is an example of a message packet routing condition that is generally fatal. Deadlock occurs when a single message packet wraps around the fabric onto itself, thereby blocking its own progress. Deadlock can also occur when two or more message packets block each other. Virtual channels are typically used for preventing deadlock. In prior designs, virtual channels are implemented using a standard time division multiple access (TDMA) scheme. In a standard TDMA scheme, each virtual channel is assigned a corresponding time slot, such that data and control words
Moll Jeffery L.
Myers Mark S.
Quattromani Marc Alan
Burns Doane , Swecker, Mathis LLP
Chin Wellington
Fujitsu Siemens Computers LLC
Pham Brenda
LandOfFree
Torus routing element error handling and self-clearing with... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Torus routing element error handling and self-clearing with..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Torus routing element error handling and self-clearing with... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2939325