Large-scale cluster monitoring system, and method of...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S004110, C714S004120, C714S004210, C714S013000, C714S047100

active

08006124

ABSTRACT:
Provided are a large-scale cluster monitoring system and a method for automatically building and restoring it. The system can automatically build a large-scale monitoring system and can automatically build a monitoring environment when a node fails. The large-scale cluster monitoring system includes a CM server, a DB server, GM nodes, NA nodes, and a DB agent. The CM server manages the nodes in a large-scale cluster system. The DB server stores monitoring information, which is the state information of the nodes in each group. Each GM node collects the monitoring information for the nodes in its group and stores it in the DB server. Each NA node accesses the CM server to obtain GM node information, collects the state information of the nodes in its group, and transfers that information to the corresponding GM node. The DB agent monitors the per-group monitoring information stored in the DB server to detect a possible node failure.
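
The data flow between the five components named in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: every class, method, and field name below is hypothetical, and the failure rule used by the DB agent (a node is suspect when its last stored record is older than a timeout) is an assumption, since the abstract only says the agent detects a possible node failure. The automatic build and restore steps are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class DBServer:
    """Stores the latest monitoring record per node, keyed by group."""
    records: dict = field(default_factory=dict)  # (group, node) -> (timestamp, state)

    def store(self, group, node, timestamp, state):
        self.records[(group, node)] = (timestamp, state)


@dataclass
class GMNode:
    """Group master: receives state for nodes in its group and stores it."""
    group: str
    db: DBServer

    def receive(self, node, timestamp, state):
        self.db.store(self.group, node, timestamp, state)


@dataclass
class CMServer:
    """Cluster manager: knows each node's group and each group's GM node."""
    node_group: dict = field(default_factory=dict)    # node -> group
    group_master: dict = field(default_factory=dict)  # group -> GMNode

    def lookup_gm(self, node):
        return self.group_master[self.node_group[node]]


@dataclass
class NANode:
    """Node agent: asks the CM server for its GM node, then reports to it."""
    name: str
    cm: CMServer

    def report(self, timestamp, state):
        self.cm.lookup_gm(self.name).receive(self.name, timestamp, state)


class DBAgent:
    """Flags nodes whose last record is older than `timeout` (assumed rule)."""

    def __init__(self, db, timeout):
        self.db, self.timeout = db, timeout

    def suspects(self, now):
        return sorted(node for (_, node), (ts, _) in self.db.records.items()
                      if now - ts > self.timeout)


def demo():
    db = DBServer()
    cm = CMServer()
    cm.group_master["g1"] = GMNode("g1", db)
    for name in ("n1", "n2"):
        cm.node_group[name] = "g1"
    agents = {name: NANode(name, cm) for name in ("n1", "n2")}
    agents["n1"].report(timestamp=0, state={"cpu": 0.2})
    agents["n2"].report(timestamp=0, state={"cpu": 0.5})
    agents["n1"].report(timestamp=10, state={"cpu": 0.3})  # n2 goes silent
    return DBAgent(db, timeout=5).suspects(now=10)
```

In this sketch, `demo()` returns only the node that stopped reporting. Keeping the DB agent separate from the GM nodes mirrors the abstract's split between collecting state and analyzing the stored records.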

REFERENCES:
patent: 6088727 (2000-07-01), Hosokawa et al.
patent: 6594786 (2003-07-01), Connelly et al.
patent: 6718486 (2004-04-01), Roselli et al.
patent: 6983317 (2006-01-01), Bishop et al.
patent: 7287180 (2007-10-01), Chen et al.
patent: 7447940 (2008-11-01), Peddada
patent: 7480816 (2009-01-01), Mortazavi et al.
patent: 2007/0206611 (2007-09-01), Shokri et al.
patent: 2008/0201470 (2008-08-01), Sayama
patent: 2003-0051930 (2003-06-01), None
patent: 1020050066133 (2005-06-01), None
Xue et al. “AOCMS: An Adaptive and Scalable Monitoring System For Large-Scale Clusters.” Proc. of the 2006 IEEE Asia-Pacific Conf on Services Computing. Dec. 2006.
Park et al. “The Cluster Monitoring and Controlling Method with Scalable Communication Framework.” Proc of the Eighth Intl Conf on High-Performance Computing in Asia-Pacific Region. 2005.
Matthew L. Massie et al., “The ganglia distributed monitoring system: design, implementation, and experience”, Parallel Computing 30 (2004), pp. 817-840.
