Link failure detection in a parallel computer
Reexamination Certificate
2007-08-02
2010-11-09
McCarthy, Christopher S (Department: 2113)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C370S406000, C370S241000
active
07831866
ABSTRACT:
Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
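The steps in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: it assumes a two-dimensional mesh, models each pair of links between adjacent nodes as two directed links, and uses coordinate parity as the grouping rule. The names `assign_groups`, `neighbors`, `run_test_phase`, `failed_links`, and `sender_group` are hypothetical and do not come from the patent.

```python
from itertools import product


def assign_groups(width, height):
    """Assumed grouping rule: parity of x + y, so mesh neighbors always differ."""
    return {(x, y): (x + y) % 2 for x, y in product(range(width), range(height))}


def neighbors(node, width, height):
    """Yield the adjacent nodes of `node` in a width x height mesh (no wraparound)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def run_test_phase(width, height, failed_links, sender_group=0):
    """Simulate one test phase.

    `failed_links` is a set of directed (source, destination) pairs that drop
    messages. Returns, for every node in the receiving group, the sorted list
    of adjacent senders whose test message did not arrive.
    """
    groups = assign_groups(width, height)
    delivered = {node: set() for node in groups}

    # Each node in the sending group sends a test message to every neighbor.
    for node, group in groups.items():
        if group != sender_group:
            continue
        for nbr in neighbors(node, width, height):
            if (node, nbr) not in failed_links:
                delivered[nbr].add(node)

    # Each node in the receiving group checks which expected messages arrived.
    report = {}
    for node, group in groups.items():
        if group == sender_group:
            continue
        expected = set(neighbors(node, width, height))
        report[node] = sorted(expected - delivered[node])
    return report


if __name__ == "__main__":
    # Hypothetical 3 x 3 mesh with one broken link from (0, 0) to (1, 0).
    broken = {((0, 0), (1, 0))}
    for node, missing in run_test_phase(3, 3, broken).items():
        if missing:
            print(f"node {node} did not receive a test message from {missing}")
```

In this example only node (1, 0) reports a missing message, which points to the directed link from (0, 0) to (1, 0). The `sender_group` parameter is an addition for illustration: calling the function again with `sender_group=1` would exercise the links in the opposite direction, which the abstract itself does not describe.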
REFERENCES:
patent: 4245344 (1981-01-01), Richter
patent: 4634110 (1987-01-01), Julich et al.
patent: 4860201 (1989-08-01), Stolfo et al.
patent: 5333268 (1994-07-01), Douglas et al.
patent: 5918005 (1999-06-01), Moreno et al.
patent: 5941992 (1999-08-01), Croslin et al.
patent: 5953347 (1999-09-01), Wong et al.
patent: 5953530 (1999-09-01), Rishi et al.
patent: 6047122 (2000-04-01), Spiller
patent: 6449667 (2002-09-01), Ganmukhi et al.
patent: 6813240 (2004-11-01), Shah
patent: 6848062 (2005-01-01), Desai et al.
patent: 6880100 (2005-04-01), Mora et al.
patent: 6892329 (2005-05-01), Bruckman
patent: 6912196 (2005-06-01), Mahalingaiah
patent: 7007189 (2006-02-01), Lee et al.
patent: 7028225 (2006-04-01), Maso et al.
patent: 7080156 (2006-07-01), Lee et al.
patent: 7149920 (2006-12-01), Blumrich et al.
patent: 7200118 (2007-04-01), Bender et al.
patent: 7210088 (2007-04-01), Chen et al.
patent: 7289428 (2007-10-01), Chow et al.
patent: 7382734 (2008-06-01), Wakumoto et al.
patent: 7451340 (2008-11-01), Doshi et al.
patent: 7461236 (2008-12-01), Wentzlaff
patent: 7506197 (2009-03-01), Archer et al.
patent: 7529963 (2009-05-01), Archer et al.
patent: 7571345 (2009-08-01), Archer et al.
patent: 7600095 (2009-10-01), Archer et al.
patent: 7646721 (2010-01-01), Archer et al.
patent: 7669075 (2010-02-01), Archer et al.
patent: 2002/0152432 (2002-10-01), Fleming
patent: 2002/0188930 (2002-12-01), Moser et al.
patent: 2003/0061265 (2003-03-01), Maso et al.
patent: 2004/0078493 (2004-04-01), Blumrich et al.
patent: 2004/0103218 (2004-05-01), Blumrich et al.
patent: 2004/0181707 (2004-09-01), Fujibayashi
patent: 2004/0205237 (2004-10-01), Doshi et al.
patent: 2004/0223463 (2004-11-01), MacKiewich et al.
patent: 2005/0120273 (2005-06-01), Hudson et al.
patent: 2005/0131865 (2005-06-01), Jones et al.
patent: 2005/0246569 (2005-11-01), Ballew et al.
patent: 2005/0259587 (2005-11-01), Wakumoto et al.
patent: 2006/0179269 (2006-08-01), Archer et al.
patent: 2007/0174558 (2007-07-01), Jia et al.
patent: 2007/0245122 (2007-10-01), Archer et al.
patent: 2008/0253386 (2008-10-01), Barum
patent: 2008/0263320 (2008-10-01), Archer et al.
patent: 2008/0263329 (2008-10-01), Archer et al.
patent: 2008/0263387 (2008-10-01), Darrington et al.
patent: 2008/0270998 (2008-10-01), Zambrana
patent: 2008/0313506 (2008-12-01), Archer et al.
patent: 2009/0016332 (2009-01-01), Aoki et al.
N R Adiga et al., “An Overview of the BlueGene/L Supercomputer,” Supercomputing, ACM/IEEE 2002 Conference, Nov. 16-22, 2002, Piscataway, NJ.
Office Action Dated Nov. 12, 2008 in U.S. Appl. No. 11/279,573.
Office Action Dated Apr. 15, 2009 in U.S. Appl. No. 11/279,573.
Office Action Dated Nov. 12, 2008 in U.S. Appl. No. 11/279,579.
Office Action Dated Apr. 15, 2009 in U.S. Appl. No. 11/279,579.
Office Action Dated Jan. 9, 2009 in U.S. Appl. No. 11/279,586.
Office Action Dated Nov. 18, 2008 in U.S. Appl. No. 11/279,592.
Office Action Dated Apr. 29, 2009 in U.S. Appl. No. 11/279,592.
U.S. Appl. No. 11/279,573, filed Oct. 18, 2007, Archer, et al.
U.S. Appl. No. 11/360,346, filed Oct. 4, 2007, Gooding, et al.
U.S. Appl. No. 11/279,579, filed Nov. 8, 2007, Archer, et al.
U.S. Appl. No. 11/279,586, filed Oct. 18, 2007, Archer, et al.
U.S. Appl. No. 11/279,592, filed Oct. 18, 2007, Archer.
U.S. Appl. No. 11/737,229, filed Oct. 23, 2008, Archer, et al.
U.S. Appl. No. 11/832,940, filed Feb. 5, 2009, Archer, et al.
Final Office Action Dated Oct. 28, 2009 in U.S. Appl. No. 11/279,573.
Final Office Action Dated Sep. 4, 2009 in U.S. Appl. No. 11/279,579.
Notice of Allowance Dated Nov. 2, 2009 in U.S. Appl. No. 11/279,592.
Office Action Dated Mar. 8, 2010 in U.S. Appl. No. 11/360,346.
Notice of Allowance Dated Jun. 30, 2009 in U.S. Appl. No. 11/279,586.
Office Action Dated Mar. 29, 2010 in U.S. Appl. No. 11/832,940.
Stallman, Richard M. GDB Manual—The GNU Source-Level Debugger. [online] (Oct. 1989). Free Software Foundation, Inc., pp. 1-78. Retrieved From the Internet <http://www.cs.cmu.edu/afs/cs/usr/bovik/OldFiles/vax—u13/omega/usr/mach/doc/gdb.ps>.
Archer Charles J.
Blocksome Michael A.
Megerian Mark G.
Smith Brian E.
Biggers & Ohanian LLP
International Business Machines Corporation
McCarthy Christopher S
Nock James R.