Systems and methods for exploratory analysis of data for...

Electrical computers and digital processing systems: interprogra – Event handling or event notification

Reexamination Certificate


Details

C719S310000, C719S311000, C345S619000, C345S215000

active

06836894

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to management of distributed systems and more specifically to visualizing and analyzing events with disparate formats and with patterns on different time scales.
BACKGROUND OF THE INVENTION
As networked systems and applications become increasingly critical to the success of a business, effectively managing them becomes extremely important. In order to monitor networked systems and applications, a system manager (or a user) needs to monitor their critical activities. One commonly used approach is to set up the monitored system or application to generate an event message when an important activity happens. This message, which can be generated by a device, an application or a system, is usually saved in a log file. For example, a router keeps a router log to track the status of each port and slot in the router. A networked system records important network activities, such as “cold start,” “router port up,” “link up,” etc., in a system log file. Other examples include DHCP (Dynamic Host Configuration Protocol) servers, Lotus Notes Servers, and Web Servers.
Because logs record critical activities, they are very important for managing systems, devices and applications. However, extracting information from a log file is often difficult. First, log files are often quite large, commonly containing thousands of events per day. Indeed, a web server can serve millions of requests every day, and for each, there may be a log file entry.
A second difficulty with processing log files is that entries have different formats. For example, an HTTP (Hypertext Transfer Protocol) server log contains byte counts requested by a user. A router log reports events for each port and slot. As a result, an unstructured textual format is used by most log files. Although the textual format is flexible in length, format and meaning, it hinders the use of many analysis tools which can only analyze structured numerical data. To deal with this difficulty, a parsing mechanism is typically introduced to translate a textual message into structured data for analysis. However, there remains the problem of determining the parsing rules. Because a textual message contains a variety of information, the process of defining parsing rules has an interactive and iterative nature. That is, a user may want to see additional information as he analyzes the data. Conventionally, this iterative process is done manually in an ad-hoc fashion.
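The translation of a textual message into structured data can be sketched as follows. This is a minimal illustration in Python and not the parsing mechanism of the invention: the log line layouts, the rule names ("router", "http"), and the field names are all assumptions made for the example.

```python
import re

# Hypothetical parsing rules: each rule name maps to a regular expression
# with named groups. The line layouts are assumed for illustration only.
PARSING_RULES = {
    "router": re.compile(
        r"^(?P<time>\S+) (?P<host>\S+) port (?P<port>\d+) "
        r"slot (?P<slot>\d+) (?P<status>up|down)$"
    ),
    "http": re.compile(
        r"^(?P<time>\S+) (?P<host>\S+) GET (?P<path>\S+) (?P<bytes>\d+)$"
    ),
}


def parse_line(line, rules=PARSING_RULES):
    """Return a dict of structured fields, or None if no rule matches."""
    text = line.strip()
    for name, pattern in rules.items():
        match = pattern.match(text)
        if match:
            record = match.groupdict()
            record["rule"] = name  # remember which rule produced the record
            return record
    return None


router_event = parse_line("10:00:01 rtr1 port 3 slot 2 down")
http_event = parse_line("10:00:02 web1 GET /index.html 5120")
unparsed = parse_line("cold start")
```

In this sketch, refining the analysis amounts to adding or editing an entry in the rule table and parsing the log again, which mirrors the iterative nature of rule definition noted above. Lines that match no rule return None, so information that was never covered by a rule stays out of the structured data.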
Third, log entries have varied information content. Some entries may include information about thresholds (e.g., if the event reported is a threshold violation). Other entries may report information specific to events, such as the port on which a disconnect occurred. This situation creates the following quandary. To provide uniformity in the analysis of log data, one needs to have common information for all events. Yet, to extract the full scope of information from event logs, variability in information content must be allowed.
A consequence of the foregoing observations is that event data needs to be viewed in many ways so as to account for its diversity of formats and content.
A first viewing approach, and the most commonly used approach, is to view the raw log file via a text editor. In this way, a user reads event messages line by line, and can read each message in detail. Clearly, this approach places emphasis on each individual event message. Although this is an important step for understanding event messages and diagnosing a problem, this approach can hardly be used to analyze the relationship among events. For example, an event pattern may be that a group of events happen periodically within a one hour period. In addition, this approach is not very efficient and effective for analyzing a large volume of events which may easily consume several megabytes per day.
A second class of viewing approaches is to aggregate events and analyze summary information. Summarization is one popular technique in this class. That is, event counts are calculated and reported by defined categories, such as event counts by hostname, by event type and by time. Clearly, through summarization and categorization, a large volume of original textual data is reduced to a small set of summary numbers for each defined category. This greatly improves efficiency and eases the scalability issue of the first approach. However, summarization loses details of the original event messages, because it depends on the defined categories. Information which is not categorized is therefore invisible. In addition, information about event patterns (e.g., event A happens periodically) and relationships among events (e.g., a group of events tends to happen together) is lost because of aggregation.
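Summarization by category can be illustrated with a short Python sketch. The event records and the field names ("host", "type") below are assumed sample data, not taken from any real log.

```python
from collections import Counter


def summarize(events, category):
    """Count events per value of one category field, e.g. 'host' or 'type'."""
    return Counter(event[category] for event in events if category in event)


# Assumed sample events that have already been parsed into structured records.
events = [
    {"time": "10:00", "host": "rtr1", "type": "link up"},
    {"time": "10:05", "host": "rtr1", "type": "port down"},
    {"time": "10:07", "host": "web1", "type": "link up"},
]

by_host = summarize(events, "host")
by_type = summarize(events, "type")
```

The resulting counts no longer carry the time of any individual event or the order in which events occurred, which is the loss of pattern and relationship information described above.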
A third viewing approach is to use graphical displays, referred to as event plots. One example of a two-dimensional event plot is a plot in which each event message is represented by a point whose horizontal coordinate corresponds to the time of the event and whose vertical coordinate corresponds to the host ID of the event. In this way, thousands of events can be displayed on one screen, and the relationships among events become visually apparent. However, this approach cannot reveal detailed information which was not parsed from a message.
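The mapping from events to points of such a plot can be sketched in Python as follows. The record fields ("time" in seconds, "host") and the scheme of numbering hosts in order of first appearance are assumptions made for the example.

```python
def event_plot_points(events):
    """Map events to (x, y) points: x is the event time, y is a per-host index."""
    host_ids = {}  # host name -> vertical position, in order of first appearance
    points = []
    for event in events:
        y = host_ids.setdefault(event["host"], len(host_ids))
        points.append((event["time"], y))
    return points, host_ids


# Assumed sample events; times are seconds from the start of the log.
sample = [
    {"time": 0, "host": "rtr1"},
    {"time": 60, "host": "web1"},
    {"time": 120, "host": "rtr1"},
]
points, host_ids = event_plot_points(sample)
```

The list of points could then be handed to any scatter-plot routine. Only the fields that were parsed (here, time and host) reach the plot, which is why details left in the raw message are not visible in this view.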
Thus, there are three conventional, but mutually exclusive, different ways to analyze event logs. Each of them has its own advantages. Directly reading the textual messages provides the most detailed information of event messages. The aggregated event analysis provides a nice scaling property and shows summarization. The event plot can reveal event patterns and relationship among events.
Most available products for analyzing a log file specialize in one type of log file. For example, there are many products on the market aiming to analyze HTTP log files (see Web Trend: http://www.webtrends.com; Hit List: http://www.marketwave.com; ARIA: http://www.ANDROMEDIA.com; and Web Tracker: http://www.cqminc.com). All of these special log analyzers only support summarization analysis. None of them can be used to visualize event messages and/or see original messages.
At the other end of the spectrum, there are many general graphical tools, such as Diamond, Data explorer, SAS, PowerPlay, etc. These tools aim to support either graphical analysis of numerical data, such as Diamond, Data explorer, SAS, etc., or aggregated level summarization such as PowerPlay and other OLAP (On-Line Analytical Processing) products. However, none of them provides both types of analysis. In addition, these tools usually only take structured data as inputs and cannot handle textual data directly.
Therefore, it would be highly desirable to provide systems and methods which integrate two or more of these different analysis approaches, thus providing a user with the capability and flexibility to perform multiple types of analysis on raw data for event management purposes.
SUMMARY OF THE INVENTION
The present invention provides systems and methods for providing exploratory analysis of data for event management. In one aspect, the invention provides for a methodology and related system referred to hereinafter as an “event browser” that provides an integrated environment for analysis of large volumes of semi-structured or non-structured data, such as event logs.
In an illustrative embodiment of the invention, the event browser advantageously provides: (1) scalable analysis of large volumes of unstructured data with diverse content and data formats; (2) an architecture to support multiple types of views and analyses of such data; (3) mechanisms to support the iterative refinement of the information in the raw data that is included in the visualization and analysis environment; (4) several specific viewers for analysis of event data.
An event browser of the invention is implemented in a form which includes certain functional components. These components, as will be explained, may be implemented as one or more software modules on one o
