Reexamination Certificate
1998-01-26
2001-01-30
Hoff, Marc S. (Department: 2857)
Data processing: measuring, calibrating, or testing
Measurement system
Performance or efficiency evaluation
C702S185000, C714S037000
active
06182022
ABSTRACT:
FIELD OF THE INVENTION
The present invention pertains to fault analysis, and, more particularly, to a method and system for automatically constructing a baseline, deriving a threshold, and reconfiguring a fault detection system with the derived threshold.
BACKGROUND OF THE INVENTION
The design, maintenance, operation, and/or repair of a system, whether it is a computer network, an electronic subassembly undergoing fabrication on a manufacturing line, an airport or traffic control system, or any other type of system, is assisted by use of a fault detection system. Present-day fault detection systems typically monitor various parameters of the monitored system, determine whether they conform to desired operating thresholds, and notify the appropriate entity when the monitored parameters move outside the limits defined by those thresholds. These types of fault detection systems are useful for alerting a system administrator, design engineer, or manufacturing line operator to faults occurring in the monitored system. In the past, however, diagnosis of the monitored system's problems has been left to the experience of the appropriate engineering resources, who must determine which areas of the system to fix, and in which order.
Accordingly, a need exists for a method and system for automatically identifying the attributes of a monitored system which cause or exhibit system problems. In addition, in an environment where only a limited number of engineering resources are available, or in which limited time or funds are available, a need also exists for a method and system that automatically prioritizes the allocation of engineering resources to those areas of the monitored system where the expenditure of the resources provides the most benefit.
Present-day fault detection systems which are designed to detect when a system attribute is out of a normal operating range typically operate by comparing realtime attribute measurement values, or “metrics”, with a statically configured threshold value. The threshold is determined based upon theoretical equations or experience, and is manually set by a system engineer. Present-day fault detection systems range from providing only one or a few globally applicable thresholds, up to many individual thresholds tailored to each respective attribute. A system configured with only one or a few globally applicable thresholds is easier to maintain and requires less manual intervention by a system engineer, but does so at the cost of flexibility and the ability to tailor a threshold according to the normal operating range of each individual attribute. More sophisticated fault detection systems allow more control over the ability to pinpoint faults by employing more thresholds, each tailored to a single attribute or only a few attributes. These systems, however, are very costly in terms of the engineering time required to interpret the observed data used to determine each individual threshold, and in terms of the manual intervention required to set each individual threshold. Accordingly, a need exists for a system and method for automatically constructing a normal operating range, or “baseline”, for each individual attribute, deriving a threshold for each attribute, and reconfiguring the fault detection system with the derived thresholds.
SUMMARY OF THE INVENTION
The present invention is a system and method for automatically constructing a baseline for an attribute of a monitored system, calculating a threshold based on the constructed baseline, and feeding the threshold back into the monitored system. In addition, the invention also provides a system and method for automatically identifying the attributes of a monitored system which are detected to be substantially outside their normal operating range, and for prioritizing the allocation of engineering resources to those areas of the monitored system where the expenditure of the resources provides the most benefit.
In accordance with the invention, a metric corresponding to an attribute of interest of a monitored system is extracted and compared with a current normal threshold associated with the attribute. An event notification is generated if the extracted metric is not within a limit defined by the current normal threshold. A baseline is calculated based on a relevant subset of extracted metrics, from which a new current normal threshold is calculated. The current normal threshold is then replaced with the new current normal threshold. In preferred embodiments, an alarm is generated if one or more event notifications meet the conditions of a rule, such as a duration rule, which requires the collected metrics to be beyond the current normal threshold for a pre-determined amount of time, or a frequency rule, which requires a pre-determined number of metrics to be beyond the current normal threshold during a pre-determined amount of time. In addition, if a newly calculated current normal threshold is not within a service level limit, which defines a boundary of the acceptable level of operation of the attribute, the threshold is preferably limited to that service level limit. A service level exception is generated whenever the newly calculated current normal threshold is limited in this way, to indicate that the current normal threshold itself is out of control. Reports are generated which summarize the performance of the monitored attributes, indicate which monitored attributes are out-of-control, and prioritize the order in which out-of-control attributes receive available engineering resources.
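The baseline, threshold, and service-level steps described above can be illustrated with a minimal Python sketch. The abstract does not state how the baseline or threshold is computed, so the formula used here (baseline as the mean of a window of metrics, threshold as the mean plus `k` sample standard deviations) is an assumption, as are the function names `derive_threshold` and `clamp_to_service_level` and the choice of an upper-bound threshold.

```python
from statistics import mean, stdev


def derive_threshold(metrics, k=3.0):
    """Return (baseline, threshold) for a window of collected metrics.

    Assumption: baseline = mean, threshold = mean + k * sample stdev.
    The source does not specify the statistic actually used.
    """
    baseline = mean(metrics)
    spread = stdev(metrics) if len(metrics) > 1 else 0.0
    return baseline, baseline + k * spread


def clamp_to_service_level(threshold, service_level_limit):
    """Limit an upper threshold to the service level limit.

    Returns (threshold_to_install, service_level_exception). The flag is
    True when the calculated threshold had to be limited, signalling that
    the threshold itself is out of control.
    """
    if threshold > service_level_limit:
        return service_level_limit, True
    return threshold, False


def is_event(metric, current_threshold):
    """An event notification is due when the metric exceeds the threshold."""
    return metric > current_threshold


# Hypothetical window of response-time metrics for one attribute.
window = [10, 12, 11, 13, 9]
baseline, new_threshold = derive_threshold(window)
installed, exception = clamp_to_service_level(new_threshold, 15.0)
```

In this sketch the calculated threshold is above the hypothetical service level limit of 15.0, so the limit is installed instead and the exception flag is set.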
A fault detection system in accordance with the invention includes a data collector which extracts metrics corresponding to various attributes of interest from a monitored system. The fault detection system includes a threshold comparator which is configured with a current normal threshold for each monitored attribute and which compares each extracted metric to its corresponding current normal threshold. If the extracted metric is not within a limit defined by its current normal threshold, an event notification is generated. A statistical analyzer, coupled to the data collector, calculates a baseline based on a relevant subset of previously extracted metrics. A threshold processor is coupled to the statistical analyzer and calculates a new current normal threshold based on the calculated baseline. A threshold implementor then replaces the current normal threshold associated with a given attribute with its newly calculated current normal threshold. An event processor receives event notifications and generates alarms when the event notifications satisfy one or more rules or conditions on which to alarm. A sanity checker limits newly calculated current normal thresholds to a service level limit, which defines a boundary of an acceptable level of operation of an attribute, and a service level exception generator generates a service level exception if the newly calculated current normal threshold does not come within the service level limit. A report generator identifies those monitored attributes which are adversely affecting performance of the monitored system.
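As an illustration of how the event processor might apply the duration and frequency rules, the following Python sketch tracks threshold breaches for a single attribute. It is a simplified assumption rather than the patented implementation: the class name `EventProcessor`, its parameters, the use of numeric timestamps in seconds, and the upper-bound comparison are all hypothetical.

```python
from collections import deque


class EventProcessor:
    """Raise alarms from threshold breaches using two assumed rules.

    duration rule:  metrics stay beyond the threshold for at least
                    duration_s seconds without interruption.
    frequency rule: at least freq_count breaches fall within the last
                    freq_window_s seconds.
    """

    def __init__(self, duration_s, freq_count, freq_window_s):
        self.duration_s = duration_s
        self.freq_count = freq_count
        self.freq_window_s = freq_window_s
        self._breach_start = None  # start time of the current breach run
        self._events = deque()     # timestamps of recent breaches

    def observe(self, timestamp, metric, threshold):
        """Process one sample and return the names of the rules satisfied."""
        if metric > threshold:
            if self._breach_start is None:
                self._breach_start = timestamp
            self._events.append(timestamp)
        else:
            self._breach_start = None

        # Discard breaches that have aged out of the frequency window.
        while self._events and timestamp - self._events[0] > self.freq_window_s:
            self._events.popleft()

        alarms = []
        if (self._breach_start is not None
                and timestamp - self._breach_start >= self.duration_s):
            alarms.append("duration")
        if len(self._events) >= self.freq_count:
            alarms.append("frequency")
        return alarms


# Hypothetical samples: (timestamp in seconds, metric), threshold fixed at 10.
processor = EventProcessor(duration_s=20, freq_count=3, freq_window_s=60)
samples = [(0, 5), (10, 12), (20, 13), (30, 14), (40, 5), (100, 5)]
results = [processor.observe(t, m, 10) for t, m in samples]
```

Note that in this sketch the frequency rule can remain satisfied on a sample that is itself within the threshold, because earlier breaches are still inside the window; a real system might choose to alarm only on the breaching sample.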
REFERENCES:
patent: 4378494 (1983-03-01), Miller
patent: 5339257 (1994-08-01), Layden et al.
patent: 5828786 (1998-10-01), Rao et al.
patent: 5941996 (1999-08-01), Smith et al.
Clubb Jayson A.
Mayle Gary E.
Reves Joseph P.
Wilson Loren F.
Barbee Manuel L.
Hewlett-Packard Company
Hoff Marc S.