Distributed multi-fabric interconnect

Electrical computers and digital data processing systems: input/output – Intrasystem connection – Bus interface architecture

Reexamination Certificate


Details

Type: Reexamination Certificate
Status: active
Patent number: 06473827

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to computer systems, and in particular, to a distributed multi-fabric interconnect for massively parallel processing computer systems.
2. Description of Related Art
An interconnection network is the key element in a Massively Parallel Processing (MPP) system that distinguishes the system from other types of computers. An interconnection network, or just interconnect, refers to the collection of hardware and software that form the subsystem through which the processors communicate with each other.
An interconnect is comprised of Processor/Network (P/N) interfaces and one or more switching fabrics. A switching fabric comprises a collection of switching elements, or switches, and links. Each switching element contains a minimum of three I/O ports: two or more inputs and one or more outputs, or one or more inputs and two or more outputs. Each switching element also contains a means for dynamically establishing arbitrary connections between inputs and outputs under the control of a routing mechanism. Each link establishes a permanent connection between the output of one switching element (or P/N interface) and the input of another. The pattern of connections formed by the links and switches defines the topology of the fabric.
Practical implementations favor modularity. Hence, typical switching elements have equal numbers of inputs and outputs, fabrics exhibit regular geometric (mathematically definable) topologies, and multiple fabrics in an interconnect are usually identical. For reasons of performance, switches typically have a crossbar construction in which all outputs can be simultaneously connected to different inputs.
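To make the preceding terminology concrete, the following minimal sketch (not part of the patent) models a switching element with crossbar-style dynamic connections, the permanent links between elements, and a fabric as a set of switches plus its link pattern. All class and field names are illustrative.

```python
# Minimal sketch (illustrative only, not from the patent): switching elements,
# links, and a fabric. A crossbar switch can connect every output to a
# different input at the same time.
from dataclasses import dataclass, field


@dataclass
class Switch:
    """A switching element with equal numbers of inputs and outputs."""
    name: str
    ports: int                                   # inputs == outputs
    connections: dict = field(default_factory=dict)  # output port -> input port

    def connect(self, input_port: int, output_port: int) -> None:
        """Dynamically establish a connection, as a routing mechanism would."""
        if output_port in self.connections:
            raise ValueError(f"output {output_port} already in use")
        self.connections[output_port] = input_port


@dataclass(frozen=True)
class Link:
    """A permanent connection from one element's output to another's input."""
    src: str
    src_output: int
    dst: str
    dst_input: int


@dataclass
class Fabric:
    """A fabric is the set of switches plus the link pattern (its topology)."""
    switches: list
    links: list


# Crossbar behavior: all four outputs simultaneously driven by different inputs.
xbar = Switch("s0", ports=4)
for p in range(4):
    xbar.connect(input_port=p, output_port=(p + 1) % 4)
```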
The performance of the interconnect is limited either by the speed of the links between the switches or by the speed of the switches themselves. Current semiconductor technology limits the speed of the links and the physical distance between the switching elements. The speed of the switches is limited by semiconductor technology and by the complexity of the design.
One means to overcome these speed limitations is to increase the number of fabrics in the interconnect. This multiplies bandwidth and has the benefit of providing multiple paths between every pair of end points. Ordinarily, this approach would expand the physical size of a given implementation, increase the number of cables, and increase the cost. It would also require more I/O ports in each processor, which may not be available. Perhaps most importantly, the interface software may not be designed to utilize multiple fabrics, and depending on the implementation, the software may or may not be readily modified to accommodate such a change.
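As a rough illustration of this trade-off, the sketch below (all figures invented, not taken from the patent) shows how adding fabrics multiplies per-node bandwidth and path count while also multiplying cables and the I/O ports each node must supply.

```python
# Rough illustration (all numbers invented): replicating the fabric multiplies
# bandwidth and independent paths, but also cables and per-node I/O ports.
nodes = 64
fabrics = 4
link_bw_mb_s = 100                      # hypothetical per-link bandwidth

bandwidth_per_node = fabrics * link_bw_mb_s   # 400 MB/s aggregate per node
paths_per_node_pair = fabrics                 # at least one path per fabric
cables_to_nodes = nodes * fabrics             # 256 cables if nothing is bundled
io_ports_per_node = fabrics                   # one network port per fabric
```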
The scalability of the MPP system is also an important characteristic. Not only must connectivity scale, but performance must scale linearly as well. The MPP system size demanded by customers can vary from two to 1024 or more processing nodes, where each node may contain one or more processors. It is essential that the interconnect be able to grow in size incrementally. It is undesirable but common for MPP interconnects to double in size to accommodate the addition of one processing node as the total number of ports required crosses powers of two (e.g., an interconnect with 128 ports is required to support 65 processing nodes, which is at least twice as much hardware as 64 nodes require, depending on the topology used).
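The doubling effect described above can be illustrated with a short sketch, assuming a topology whose port count must grow to the next power of two (as the text notes, the exact growth depends on the topology used):

```python
# Sketch of the scaling penalty: one extra node past a power of two can
# double the interconnect. Assumes a power-of-two port count.
import math


def ports_required(nodes: int) -> int:
    """Smallest power of two that is >= the number of processing nodes."""
    return 1 << math.ceil(math.log2(nodes))


print(ports_required(64))   # 64 ports
print(ports_required(65))   # 128 ports -- one added node doubles the hardware
```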
Another problem with MPP systems results from the commoditization of processor hardware. Computer system manufacturers no longer design all the elements of the systems they produce. In particular, MPP systems are typically built from large collections of processor/memory subsystems made by other manufacturers. Access to the processor is limited to the provided I/O bus, and it is generally no longer possible to gain access via the processor/memory bus. The I/O bus typically operates at a fraction of the speed of the processor/memory bus; however, multiple I/O busses are often provided. This situation favors interconnects that exploit parallelism over a single, very high bandwidth interconnect.
There are two basic approaches that have been used in prior designs of MPP systems. The first is centralized, in which all switching fabric hardware is housed in one physical location. Cables must be run from the P/N interface in each processing node to each fabric in the interconnect. In cases where there is more than one fabric, usually for providing fault tolerance, each fabric is centralized with respect to the processing nodes and independent of the others. Providing more fabrics using this arrangement multiplies all the hardware, cables, and cost.
The other approach is distributed, in which portions of the switching fabric are physically distributed among the processing nodes. An example of this is the Y-Net interconnect used in the Teradata™ DBC 1012 and NCR™ 3600 systems. This is also a popular arrangement for mesh and hypercube interconnects.
If the fabric is replicated for fault tolerance, each of the individual submodules and cables is duplicated. Since the packaging typically allocates a fixed amount of space for the portion of the fabric that coexists with each processing node, replicating fabrics to increase performance requires a redesign of the system packaging. In the case of typical mesh and hypercube interconnects, one switch is an integral part of the processor electronics and is often co-located on the same board. Replicating the fabric is then completely impractical, as it would require redesigning both the boards and the packaging.
Thus, there is a need in the art for designs that improve performance through fabric replication in a cost-effective manner. There is also a need in the art for designs that reduce the cable count in MPP systems and ease the installation effort. Finally, there is a need in the art for designs that distribute the implementation of the interconnect, so that the switching hardware can consume otherwise unused space, power, and cooling resources by being co-located with the processor hardware.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses an interconnect network having a plurality of identical fabrics whose switching elements are partitioned so that many links can be combined into single cables. In this partition, one or more of the switching elements from the first stage of each of the fabrics are physically packaged onto the same board, called a concentrator, and these concentrators are physically distributed among the processing nodes connected to the interconnect network. The concentrator allows all the links from a processing node to the concentrator, each of which must connect to a different fabric, to be combined into a single cable. Furthermore, the concentrator allows all the links from a single switching element in the first stage to be combined into a single cable connected to the subsequent or expansion (second and higher) stages of the fabric. The subsequent or expansion stages of each fabric can be implemented in a centralized location, independently of the other fabrics.
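As a back-of-the-envelope illustration of the cable reduction (all parameters below are assumed for the example and are not fixed by the patent), bundling each node's per-fabric links into one cable, and each first-stage switching element's expansion links into another, cuts the cable count substantially:

```python
# Sketch of the cable-count effect of concentrators. Every number here is an
# assumption made for illustration; the patent does not fix these values.
nodes = 64                 # processing nodes
fabrics = 4                # identical fabrics in the interconnect
nodes_per_conc = 8         # nodes served by each concentrator (assumed)
expansion_links = 4        # links from each first-stage switch to later stages (assumed)

concentrators = nodes // nodes_per_conc          # 8
first_stage_switches = concentrators * fabrics   # one per fabric per concentrator (assumed)

# Every link runs as its own cable:
unbundled = nodes * fabrics + first_stage_switches * expansion_links   # 256 + 128 = 384

# With concentrators: one cable per node (bundling its per-fabric links) and
# one cable per first-stage switching element toward the expansion stages.
bundled = nodes + first_stage_switches                                 # 64 + 32 = 96

print(unbundled, bundled)
```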


REFERENCES:
patent: 4543630 (1985-09-01), Neches
patent: 4925311 (1990-05-01), Neches et al.
patent: 4939509 (1990-07-01), Bartholomew et al.
patent: 5065394 (1991-11-01), Zhang
patent: 5303383 (1994-04-01), Neches et al.
patent: 5313649 (1994-05-01), Hsu et al.
patent: 5321813 (1994-06-01), McMillen et al.
patent: 5559970 (1996-09-01), Sharma
patent: 5875314 (1999-02-01), Edholm
patent: 6105122 (2000-08-01), Muller et al.
patent: 6138185 (2000-10-01), Nelson et al.
patent: 6223242 (2001-04-01), Sheafor et al.
patent: 6304568 (2001-10-01), Kim
patent: 6343067 (2002-01-01), Drottar et al.
patent: 2002/0007464 (2002-01-01), Fung
Cariño Jr., Felipe et al., “Industrial Database Supercomputer Exegesis: The DBC/1012, The NCR 3700, The Ynet, and The Bynet,” Terad
