Electrical computers and digital processing systems: multicomput – Computer network managing – Computer network monitoring
Reexamination Certificate
1999-03-04
2003-05-06
Lim, Krisna (Department: 2153)
C709S223000
active
06560647
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the collection, analysis, and management of system resource data in distributed or enterprise computer systems, and particularly to a system and method for reducing file sizes in an intelligent way.
2. Description of the Related Art
The data processing resources of business organizations are increasingly taking the form of a distributed computing environment in which data and processing are dispersed over a network comprising many interconnected, heterogeneous, geographically remote computers. Such a computing environment is commonly referred to as an enterprise computing environment, or simply an enterprise. Managers of the enterprise often employ software packages known as enterprise management systems to monitor, analyze, and manage the resources of the enterprise. Enterprise management systems may provide for the collection of measurements, or metrics, concerning the resources of individual systems. For example, an enterprise management system might include a software agent on an individual computer system for the monitoring of particular resources such as CPU usage or disk access. The enterprise management agent might periodically collect metric data and write to a “data spill” containing historical metric data, i.e., metric data previously collected over a period of time. U.S. Pat. No. 5,655,081 discloses one example of an enterprise management system.
Historical data spills can be useful in a number of circumstances. First, even where an enterprise management system permits real-time monitoring of metric data, the enterprise is not always monitored for twenty-four hours a day, seven days a week. Thus, historical data spills provide a way to review metric data that was not monitored in real time. Second, regardless of whether metrics are monitored in real time, an enterprise manager may desire to review the history of one or more metrics which preceded a problem in another, related metric. Third, historical data spills can be used for analysis of the enterprise. For example, an analysis of the most frequent clients of a particular file server in the enterprise would utilize historical metric data. For these reasons, enterprise managers desire to keep track of as much historical metric data as possible. However, storage space and other resources are finite and not without cost. Therefore, the enterprise manager faces a trade-off between using costly storage resources on the one hand and throwing away meaningful metric data on the other hand. The object, then, is to reduce the amount of data stored while throwing out as little meaningful data as possible.
The prior art has produced a variety of compression techniques for reducing file size. Some compression methods are “lossless”: they compress data by looking for patterns and redundancies, losing no information in the process. File-level and disk-level compression techniques for computer systems are lossless methods. Unfortunately, lossless methods typically achieve low compression rates, and so their usefulness is limited, especially for large, relatively patternless spills of metric data. Other compression methods are “lossy”: they typically achieve higher compression rates than lossless methods, but they lose information in the process. For example, techniques for compressing video and image data commonly eliminate pixel-to-pixel variances in color that are barely noticeable to the human eye. In other words, those methods determine the least necessary data by comparing pixels to one another, and then the methods discard that data. However, techniques for compressing metric data cannot so rely on the deficiencies of human perception. Often, compression techniques of the prior art compress metric data by decimating it: in other words, by simply throwing away every Nth element of a data spill, or by keeping every Nth element of a data spill. Decimation methods thus use a “brute force” approach with the result that the meaningful and the meaningless alike are discarded. The methods of the prior art employ a “one size fits all” methodology: they treat all bits and bytes the same, no matter what meaning those bits and bytes may hold. The methods do not look beyond the mere logical ones and zeroes to appreciate the significance of the data. Therefore, both the lossless and the lossy compression methods of the prior art are inadequate to solve the enterprise manager's dilemma.
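The weakness of decimation described above can be illustrated with a minimal sketch. The function name `decimate` and the sample CPU values are hypothetical and are not taken from the patent; the sketch only shows that keeping every Nth element ignores what the values mean.

```python
def decimate(samples, n):
    """Keep every n-th sample (indices 0, n, 2n, ...) and discard the rest."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return samples[::n]


# Hypothetical CPU-usage samples (percent); index 3 holds a short spike.
cpu_percent = [5, 6, 5, 97, 6, 5, 6, 5]

# Keep every 4th sample. The spike is dropped because of where it happens
# to fall, not because of what it means.
kept = decimate(cpu_percent, 4)
spike_survived = 97 in kept
```

In this sketch the only unusual value in the spill is lost while two unremarkable values are retained, which is the "brute force" behavior the passage criticizes.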
For the foregoing reasons, there is a need for a system and method for reducing file sizes in an intelligent way.
SUMMARY OF THE INVENTION
The present invention is directed to a system and method that solve the need for intelligent summarization of data. Preferably, the present invention provides improved management of collected metric data through summarization of data according to the semantics or meaning of the underlying data types, and also through summarization of data at a plurality of levels of varying granularity. In a preferred embodiment, the system and method are used in a distributed computing environment, i.e., an enterprise. The enterprise comprises a plurality of computer systems, or nodes, which are interconnected through a network. At least one of the computer systems is a monitor computer system from which a user may monitor the nodes of the enterprise. At least one of the computer systems is an agent computer system. An agent computer system includes agent software that permits the collection of data relating to one or more metrics, i.e., measurements of system resources on the agent computer system.
In a preferred embodiment, a Universal Data Repository (UDR) receives a set of data points from one or more agent computer systems. The set of data points is a series of metrics, i.e., measurements of one or more system resources, which have been gathered by data collectors on the agent computer systems over a period of time. The UDR preferably summarizes the set of data points into a more compact yet meaningful form. In summarization according to one embodiment, the UDR determines a data type of the set of data points, applies a summarization rule according to the data type, and then creates a summarized data structure which corresponds to the set of data points. The UDR may summarize multiple sets of data points in succession.
In one embodiment, the summarization rule varies according to the semantics, i.e., the meaning, of the data type. For example, if the data type of the collected metric data is a counter, i.e., a measurement that can only go up, then the summarized data structure will comprise the starting value, ending value, and total number of data points. On the other hand, if the data type of the collected metric data is a gauge, i.e., a measurement that can go up or down, then the summarized data structure will comprise the average of all the data points and the total number of data points. If the data type of the collected metric data is a clock, i.e., a measurement of elapsed time, then the summarized data structure will comprise the starting value, the ending value, and the frequency of the clock. If the data type of the metric data is a string, i.e., a series of characters which can be manipulated as a group, then the summarized data structure will comprise the first string. By applying different summarization rules keyed to different data types, the system and method preserve costly storage resources by taking the most meaningful information and putting it into smaller packages.
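The type-keyed rules described above might be sketched as follows. This is an illustrative sketch only, not the patented implementation: the names `MetricType` and `summarize`, the dictionary layout of the summarized data structure, and the `clock_frequency` parameter are assumptions introduced for the example.

```python
from enum import Enum


class MetricType(Enum):
    """Hypothetical labels for the four data types named in the text."""
    COUNTER = "counter"  # a measurement that can only go up
    GAUGE = "gauge"      # a measurement that can go up or down
    CLOCK = "clock"      # a measurement of elapsed time
    STRING = "string"    # a series of characters


def summarize(data_type, points, clock_frequency=None):
    """Apply the summarization rule that matches the data type.

    `points` is a non-empty sequence of raw data points in collection order.
    `clock_frequency` is an assumed parameter; the text says only that the
    clock summary includes the frequency of the clock.
    """
    if not points:
        raise ValueError("cannot summarize an empty set of data points")
    if data_type is MetricType.COUNTER:
        return {"start": points[0], "end": points[-1], "count": len(points)}
    if data_type is MetricType.GAUGE:
        return {"average": sum(points) / len(points), "count": len(points)}
    if data_type is MetricType.CLOCK:
        return {"start": points[0], "end": points[-1],
                "frequency": clock_frequency}
    if data_type is MetricType.STRING:
        return {"first": points[0]}
    raise ValueError(f"unsupported data type: {data_type!r}")


# Hypothetical raw data points for three metrics.
counter_summary = summarize(MetricType.COUNTER, [100, 140, 190])
gauge_summary = summarize(MetricType.GAUGE, [10, 20, 30])
string_summary = summarize(MetricType.STRING, ["up", "down"])
```

Under these assumptions, each summary stores at most three values regardless of how many raw data points it replaces, while retaining the quantity that is meaningful for its type, such as the net change of a counter or the average level of a gauge.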
To decrease file size even further, in one embodiment the system and method also provide for multiple levels of summarization: as new metric data is received, previously received data is summarized into coarser data structures, wherein the degree of coarseness corresponds to the age of the data. After the metric data has been collected by an agent, the UDR summarizes raw data points into summarized data structures. Each summarized data structure corresponds to two or more of the raw data points. At later times, as new raw data is collected, the UDR summa
Agrawal Subhash
Hafez Amr
Rocco Joseph
BMC Software Inc.
Lim Krisna
Wong, Cabello, Lutsch, Rutherford & Brucculeri L.L.P.