System and method for fault tolerance in multi-node system

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S004110, C370S217000

active

06918063

ABSTRACT:
A method and system for promoting fault tolerance in a multi-node computing system. Deadlock-free message routing is provided in the presence of node and/or link faults using only two rounds, and thus only two virtual channels are required to ensure deadlock freedom. A lamb set of nodes is introduced for use in message routing: each node in the lamb set serves only as a point along message routes and does not send or receive messages.
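The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea, under assumptions that are not taken from the patent: a 2D mesh topology, dimension-ordered (X then Y) routing within each round, and node faults only. The function names (`xy_path`, `one_round_ok`, `two_round_route`, `lamb_set_is_valid`) are hypothetical. It shows how a message might be delivered in two rounds through an intermediate node, with each round imagined as using its own virtual channel, and how a candidate lamb set could be checked so that all remaining nodes can still reach one another.

```python
from itertools import product


def xy_path(src, dst):
    """Dimension-ordered path on a 2D mesh: move along X first, then Y.

    Returns the list of nodes visited, endpoints included.
    """
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def one_round_ok(src, dst, faulty):
    """True if the single-round X-then-Y path avoids every faulty node."""
    return all(node not in faulty for node in xy_path(src, dst))


def two_round_route(src, dst, good_nodes, faulty):
    """Look for an intermediate node that splits the route into two rounds.

    Round 1 (imagined on virtual channel 0) runs src -> mid.
    Round 2 (imagined on virtual channel 1) runs mid -> dst.
    Returns (round1_path, round2_path), or None if no such node exists.
    """
    for mid in good_nodes:
        if one_round_ok(src, mid, faulty) and one_round_ok(mid, dst, faulty):
            return xy_path(src, mid), xy_path(mid, dst)
    return None


def lamb_set_is_valid(width, height, faulty, lambs):
    """Check a candidate lamb set (assumed criterion, not from the patent).

    Lamb nodes are healthy but never act as source or destination; they may
    still appear along routes. The set is accepted here if every ordered pair
    of the remaining nodes has a fault-free two-round route.
    """
    good = [n for n in product(range(width), range(height)) if n not in faulty]
    survivors = [n for n in good if n not in lambs]
    return all(
        two_round_route(s, d, good, faulty) is not None
        for s in survivors
        for d in survivors
        if s != d
    )


if __name__ == "__main__":
    faulty = {(1, 0)}
    good = [n for n in product(range(3), range(2)) if n not in faulty]
    # The direct path from (0, 0) to (2, 0) crosses the faulty node,
    # but a two-round route through an intermediate node avoids it.
    print(one_round_ok((0, 0), (2, 0), faulty))
    print(two_round_route((0, 0), (2, 0), good, faulty))
```

In this sketch a 3x1 mesh with its middle node faulty has no valid empty lamb set, because the two end nodes cannot reach each other; designating one end node as a lamb leaves a single survivor, which trivially satisfies the check. A real system would presumably aim to keep the lamb set as small as possible, which this brute-force check does not attempt.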

REFERENCES:
patent: 5371744 (1994-12-01), Campbell et al.
patent: 5435003 (1995-07-01), Chng et al.
patent: 5513313 (1996-04-01), Bruck et al.
patent: 5581689 (1996-12-01), Slominski et al.
patent: 5765015 (1998-06-01), Wilkinson et al.
patent: 5887127 (1999-03-01), Saito et al.
patent: 5963546 (1999-10-01), Shoji
patent: 6038688 (2000-03-01), Yoon
patent: 6104871 (2000-08-01), Badovinatz et al.
patent: 6130875 (2000-10-01), Doshi et al.
patent: 6202079 (2001-03-01), Banks
patent: 6230252 (2001-05-01), Passint et al.
patent: 6600719 (2003-07-01), Chaudhuri
patent: 6680915 (2004-01-01), Park et al.
patent: 6711407 (2004-03-01), Cornils
patent: 6760777 (2004-07-01), Agarwal et al.
patent: 6856627 (2005-02-01), Saleh et al.
patent: 6857026 (2005-02-01), Cain
patent: 6862263 (2005-03-01), Simmons
patent: 2002/0133620 (2002-09-01), Krause
PUBLICATION: “Message Routing in an Injured Hypercube”. Chen et al. Hypercube Concurrent Computers and Applications. Proceedings of the third conference on Hypercube concurrent computers and applications. vol. 7, pp. 312-317. Jan. 19-20, 1988.
PUBLICATION: “Origin-Based Fault-Tolerant Routing in the Mesh”. Libeskind-Hadas et al. High-Performance Computer Architecture, 1995. Proceedings, first IEEE Symposium. pp. 102-111. Jan. 22-25, 1995.
PUBLICATION: “Folded Petersen Cube Networks: New Competitors for the Hypercubes”. Ohring et al. Parallel and Distributed Processing. Proceedings of the Fifth IEEE Symposium. pp. 582-589. Dec. 1-4, 1993.
PUBLICATION: “Computing in the RAIN: A Reliable Array of Independent Nodes”. Bohossian et al. Parallel and Distributed Systems, IEEE Transactions. vol. 12, pp. 99-114. Feb., 2001.
