Electrical computers and digital processing systems: multicomput – Distributed data processing
Reexamination Certificate
2011-01-25
Nguyen, Thu Ha T (Department: 2453)
C709S208000, C370S230000, C370S231000, C370S235000, C712S001000, C718S102000, C718S108000, C719S319000
active
07877436
ABSTRACT:
A method and a data processing system for completing checkpoint processing of a distributed job whose local tasks communicate with remote tasks via a host fabric interface (HFI) and an assigned HFI window. Each HFI window has a send count and a receive count, which track the GSM messages sent from and received at that window. When a master task initiates a checkpoint, each local task forwards its send count and receive count to the master task. The master task sums the respective counts and compares the two totals. When the send count total equals the receive count total, the tasks are permitted to continue processing. When the totals differ, the master task notifies each task of the job to roll back to a previous checkpoint or to kill the job execution.
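The comparison step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `WindowCounts` type, the `checkpoint_decision` function, and the returned strings are hypothetical names introduced here, and the HFI hardware, GSM messaging, and the rollback itself are not modeled.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class WindowCounts:
    """Counts one local task reports for its HFI window (hypothetical type)."""
    task_id: int
    sent: int      # messages sent from the window
    received: int  # messages received at the window


def checkpoint_decision(reports: Iterable[WindowCounts]) -> str:
    """Master-task check: sum both counts across tasks and compare the totals.

    Returns "continue" when the totals match, otherwise "rollback_or_kill".
    Both return values are assumed labels, not terms from the patent.
    """
    reports = list(reports)
    total_sent = sum(r.sent for r in reports)
    total_received = sum(r.received for r in reports)
    # A mismatch means some message was sent but never counted as received.
    return "continue" if total_sent == total_received else "rollback_or_kill"


# Example: task 0 sent 5 messages, but the other tasks only saw 4 of them.
balanced = [WindowCounts(0, 5, 2), WindowCounts(1, 2, 5)]
dropped = [WindowCounts(0, 5, 2), WindowCounts(1, 2, 4)]
```

Only the job-wide totals are compared, so a single task's send and receive counts need not match each other. In the `balanced` example each task's counts differ individually, yet the totals agree.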
REFERENCES:
patent: 6128672 (2000-10-01), Lindsley
patent: 6665758 (2003-12-01), Frazier et al.
patent: 6775719 (2004-08-01), Leitner et al.
patent: 7631128 (2009-12-01), Sgrosso et al.
patent: 2008/0069125 (2008-03-01), Reed et al.
Arimilli Lakshminarayana B.
Blackmore Robert S.
Kim Chulho
Rajamony Ramakrishnan
Xue Hanhong
Dillon & Yudell LLP
International Business Machines Corporation
Nguyen Thu Ha T
Mechanism to provide reliability through packet drop detection