Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2001-02-27
2004-06-15
Baderman, Scott (Department: 2184)
C714S038110
active
06751753
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method, system, and program for monitoring system components.
2. Description of the Related Art
Prior art devices provide a monitoring program to monitor the operation of a system. For instance, the Sun Microsystems, Inc. (“SUN”) StorEdge Enclosure Manager provides management and monitoring of a SUN A5x00 storage subsystem. The StorEdge Enclosure Manager provides alarm notification and remote reporting (via email, files, and system logging) upon detection of abnormal activities or conditions within a designated storage enclosure. An alarm provides a notification that signifies that a problem may need to be resolved depending on a detected severity. The StorEdge Enclosure Manager monitors system status information in intervals as part of a “polling” operation. In monitoring specific hardware components, a set of “rules” is provided that defines the conditions under which a notification or alarm is issued. The alarm or notification may indicate that the status is “ok”, critical in that one or more critical conditions have been detected, unrecoverable in that one or more unrecoverable conditions have occurred, or unknown.
In the StorEdge Enclosure Manager, a file monitoring class lexically analyzes strings of messages written to an administrative file to which system status information is written. If there is a match between state information in the administrative file and a rule, then the Enclosure Manager may write data to a log file and/or generate an alarm. Some of the system components and resources that may be monitored include the disks, a Gigabit Interface Converter (GBIC) module that converts electrical signals to optical signals, the power supply, system temperature, fan status, loop status of the connection between host and storage system, backplane status, etc. With the prior art Enclosure Manager, the user may specify an e-mail or pager address for remote reporting of alarms, the time interval for polling of resources, etc. The SUN Component Manager provides similar monitoring services for a storage subsystem, and is described in the SUN publication “Sun StorEdge Component Manager 2.0 User's Guide” (Copyright SUN, January 2000).
The rule systems of prior art system monitoring tools, such as those discussed above, have rules that specify a particular action when a threshold value is reached. Such systems may generate excessive notifications if system resource values are thrashing, i.e., constantly changing and thereby constantly triggering alarms as the changing value passes the threshold value. For instance, the temperature of one or more system components may be monitored and an alarm generated when different threshold temperature values are reached. With such systems, alarm notifications may be continually generated if the temperature continues to fluctuate across the threshold values that trigger the alarm.
For the above reasons, it would be desirable to provide a monitoring system that can provide a greater degree of flexibility in monitoring system states to avoid situations where alarms may be excessively generated as measured system parameters continuously fluctuate.
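The thrashing problem described above can be illustrated with a minimal sketch. It is not taken from the Enclosure Manager; the function name `threshold_alarms`, the 70.0 degree threshold, and the sample readings are all assumptions chosen for illustration. The rule issues a notification every time the measured value crosses the threshold in either direction.

```python
# Hypothetical sketch of a single-threshold rule; not the prior art product's code.
THRESHOLD_C = 70.0  # assumed alarm threshold in degrees Celsius


def threshold_alarms(readings, threshold=THRESHOLD_C):
    """Return one notification for every crossing of the threshold."""
    alarms = []
    above = False  # whether the previous reading was above the threshold
    for i, temp in enumerate(readings):
        now_above = temp > threshold
        if now_above != above:
            kind = "critical" if now_above else "ok"
            alarms.append(f"reading {i}: {temp:.1f} C -> {kind}")
            above = now_above
    return alarms


# A temperature hovering around the threshold (made-up sample data).
readings = [69.5, 70.2, 69.8, 70.4, 69.9, 70.1]
alarms = threshold_alarms(readings)
for line in alarms:
    print(line)
```

With these six sample readings the rule issues five notifications, even though the temperature never moves more than half a degree from the threshold.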
SUMMARY OF THE PREFERRED EMBODIMENTS
Provided is a method, system, program, and data structure for deriving state information concerning a monitored system component. A status object is provided including information on a current state of the monitored system component. There are a plurality of states associated with the monitored system component, wherein each state is capable of having a state action and at least one transition condition associated with a transition state. A measured system parameter is received and a determination is made as to whether the received measured system parameter satisfies one transition condition associated with the current state indicated in the status object. If the received system parameter satisfies one transition condition, then the state action associated with the transition state associated with the satisfied transition condition is performed. The current state is set to the transition state in the status object.
In further implementations, if the transition state associated with the satisfied transition condition is the current state, then a counter is incremented.
Still further, if the transition state is the current state, then a determination is made as to whether a frequency event associated with the transition condition is satisfied. The state action associated with the current state is performed if the associated frequency event was satisfied.
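The behavior summarized in the three paragraphs above can be sketched as follows. This is one possible reading, not the patented implementation: the names `Transition`, `State`, `Status`, and `process` are hypothetical, the frequency event is modeled as "every N-th repeat of the same state", and resetting the counter on a change of state is an added assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Transition:
    condition: Callable[[float], bool]  # transition condition on the measured parameter
    target: str                         # name of the associated transition state
    frequency: Optional[int] = None     # assumed frequency event: act on every N-th repeat


@dataclass
class State:
    name: str
    action: Optional[Callable[[float], None]] = None  # state action, if any
    transitions: List[Transition] = field(default_factory=list)


@dataclass
class Status:
    current: str      # current state of the monitored component
    counter: int = 0  # repeats of the current state


def process(status: Status, states: Dict[str, State], value: float) -> None:
    """Apply one measured parameter to the status object."""
    for t in states[status.current].transitions:
        if not t.condition(value):
            continue
        target = states[t.target]
        if t.target == status.current:
            # Transition state is the current state: count it, and perform
            # the state action only when the frequency event is satisfied.
            status.counter += 1
            if t.frequency and status.counter % t.frequency == 0 and target.action:
                target.action(value)
        else:
            # Perform the transition state's action, then make it current.
            if target.action:
                target.action(value)
            status.current = t.target
            status.counter = 0  # assumption: counter restarts in a new state
        return  # only the first satisfied condition is used


log = []


def make_logger(name):
    return lambda value: log.append((name, value))


states = {
    "ok": State("ok", make_logger("ok"),
                [Transition(lambda v: v > 70, "hot")]),
    "hot": State("hot", make_logger("hot"),
                 [Transition(lambda v: v > 70, "hot", frequency=3),
                  Transition(lambda v: v <= 65, "ok")]),
}
status = Status(current="ok")
for reading in [60, 72, 71, 73, 74, 69, 75, 64]:
    process(status, states, reading)
print(log)
```

In this run the "hot" action fires once on entry and once more on the third repeat, and the reading of 69 triggers nothing because it satisfies neither condition of the "hot" state.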
Further provided is a method, system, program, and data structure for implementing a state machine to monitor a system component. A state class and status object class are provided. A status object is instantiated from the status object class, wherein the status object includes a current state variable indicating a current state of the state machine. Multiple states of the state machine are instantiated from the state machine class, wherein each state is capable of having a state action and notification performed when transitioning to the state from another state. At least one evaluation function is generated for each state, wherein each evaluation function determines whether an operation on a measured system component satisfies a condition. A transition state is associated with each evaluation function. The status object is updated to indicate the transition state as the current state if the associated evaluation function determines that the condition is satisfied.
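A class-based arrangement of this kind might look like the sketch below. The class and method names (`State`, `StatusObject`, `add_evaluation`, `update`) and the fan speed limits are hypothetical, chosen only to show how evaluation functions, transition states, and a current state variable could fit together.

```python
class State:
    """One state of the state machine, with an optional action and notification."""

    def __init__(self, name, action=None, notify=None):
        self.name = name
        self.action = action    # performed when the state is entered from another state
        self.notify = notify    # notification issued on the same transition
        self.evaluations = []   # list of (evaluation function, transition State)

    def add_evaluation(self, evaluate, transition_state):
        # Associate a transition state with an evaluation function.
        self.evaluations.append((evaluate, transition_state))


class StatusObject:
    """Holds the current state variable of the state machine."""

    def __init__(self, initial_state):
        self.current_state = initial_state

    def update(self, measurement):
        # Run the current state's evaluation functions in order and follow
        # the first one whose condition is satisfied.
        for evaluate, transition_state in self.current_state.evaluations:
            if evaluate(measurement):
                if transition_state is not self.current_state:
                    if transition_state.action:
                        transition_state.action(measurement)
                    if transition_state.notify:
                        transition_state.notify(transition_state.name, measurement)
                    self.current_state = transition_state
                break
        return self.current_state


notices = []


def record(name, value):
    notices.append((name, value))


# Assumed example: monitoring a fan by its speed in RPM.
normal = State("normal", notify=record)
degraded = State("degraded", notify=record)
failed = State("failed", notify=record)

normal.add_evaluation(lambda rpm: rpm < 500, failed)
normal.add_evaluation(lambda rpm: rpm < 2000, degraded)
degraded.add_evaluation(lambda rpm: rpm < 500, failed)
degraded.add_evaluation(lambda rpm: rpm >= 2500, normal)
failed.add_evaluation(lambda rpm: rpm >= 2500, normal)

status = StatusObject(normal)
for rpm in [3000, 1800, 2200, 400, 2600]:
    status.update(rpm)
print(notices)
```

Because the "degraded" state returns to "normal" only at 2500 RPM or more, the reading of 2200 produces no notification, which is the kind of flexibility a single threshold rule lacks.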
REFERENCES:
patent: 6131185 (2000-10-01), Coskun et al.
patent: 6405327 (2002-06-01), Sipple et al.
patent: 6434715 (2002-08-01), Andersen
patent: 6457152 (2002-09-01), Paley et al.
patent: 6594786 (2003-07-01), Connelly et al.
patent: 6601193 (2003-07-01), Liebau
patent: 6604210 (2003-08-01), Alexander et al.
patent: 6633838 (2003-10-01), Arimilli et al.
patent: 6662313 (2003-12-01), Swanson et al.
patent: 6675359 (2004-01-01), Gilford et al.
Sun Microsystems, Inc. “Sun StorEdge Component Manager 2.0 User's Guide” Jan. 2000, Revision A, Part No. 806-1579-10, pp. iii-110.
Nguyen Tin L.
Selim Dina H.
Konrad Raynes & Victor LLP
Sun Microsystems Inc.