Content-based retrieval of series data

Image analysis – Pattern recognition – Template matching

Reexamination Certificate

Details

C382S209000, C382S218000

active

06754388

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to data series. More particularly, the present invention relates to retrieval of data contained in large sequences of data.
BACKGROUND
In many industries, large stores of data are used to track variables over relatively long expanses of time or space. For example, several environments, such as chemical plants, refineries, and building control, use records known as process histories to archive the activity of a large number of variables over time. Process histories typically track hundreds of variables and are essentially high-dimensional time series. The data contained in process histories is useful for a variety of purposes, including, for example, process model building, optimization, control system diagnosis, and incident (abnormal event) analysis.
Large data sequences are also used in other fields to archive the activity of variables over time or space. In the medical field, valuable insights can be gained by monitoring certain biological readings, such as pulse, blood pressure, and the like. Other fields include, for example, economics, meteorology, and telemetry.
In these and other fields, events are characterized by data patterns within one or more of the variables, such as a sharp increase in temperature accompanied by a sharp increase in pressure. Thus, it is desirable to extract these data patterns from the data sequence as a whole. Data sequences have conventionally been analyzed using such techniques as database query languages. Such techniques allow a user to query a data sequence for data associated with process variables of particular interest, but fail to adequately incorporate time-based features as query criteria. Further, many data patterns are difficult to describe using conventional database query languages. Moreover, the lack of an intuitive interface impairs efficiency for many users.
In order to facilitate querying data sequences, so-called graphical query languages have been developed that offer a graphical user interface (GUI) to enter standard query language commands. Even using these graphical query languages, however, it is difficult to specify temporal feature sets or patterns that characterize events of interest.
Another obstacle to efficient analysis of data sequences is their volume. Because data sequences track many variables over relatively long periods of time, they are typically both wide and deep. As a result, the size of some data sequences is on the order of gigabytes. Further, most of the recorded data tends to be irrelevant. Due to these challenges, existing techniques for extracting data patterns from data sequences are both time consuming and tedious.
SUMMARY OF THE INVENTION
According to one aspect of the present invention, a graphical user interface (GUI) is used to quickly and easily find data patterns within a data sequence that match a target data pattern representing an event of interest. The user first uses the GUI to specify the target data pattern. Search criteria, such as a match threshold and amplitude and duration constraints, are then specified. A pattern recognition technique is then applied to the data sequence to find data patterns within the data sequence that satisfy the search criteria. Thus, the user avoids the need to sift through large amounts of data not relevant to the current query.
According to one embodiment, the present invention is directed to a method for finding, within a data sequence, matching data patterns that satisfy a similarity criterion with respect to a target data pattern. A graphical representation of at least a portion of the data sequence is displayed using a GUI. The GUI is then used to define the target data pattern within the data sequence and the similarity criterion. A pattern recognition algorithm is then applied to the data sequence to find the matching data patterns that satisfy the similarity criterion with respect to the target data pattern.
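As a rough illustration of the search step in this embodiment, the following sketch slides the target pattern across the data sequence and keeps every window whose similarity meets a threshold. The text does not name a similarity measure, so the use of Pearson correlation here is an assumption, and the names `correlation` and `find_matches` are hypothetical. The `start` and `end` indices stand in for the selection a user would make in the GUI.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / norm


def find_matches(sequence, start, end, threshold):
    """Return (offset, score) for each window that scores >= threshold.

    The target pattern is sequence[start:end], i.e. a region of the data
    sequence itself, standing in for a selection made in the GUI.
    Correlation as the similarity criterion is an assumption of this sketch.
    """
    target = sequence[start:end]
    width = len(target)
    matches = []
    for offset in range(len(sequence) - width + 1):
        score = correlation(target, sequence[offset:offset + width])
        if score >= threshold:
            matches.append((offset, score))
    return matches


# A small spike at offset 1 is selected as the target; a taller spike of the
# same shape appears later in the sequence.
series = [0, 0, 1, 3, 1, 0, 0, 0, 2, 6, 2, 0, 0]
hits = find_matches(series, start=1, end=6, threshold=0.9)
```

Because correlation ignores scale, the taller spike matches as well as the selected one. Amplitude and duration constraints would have to be applied separately to exclude it.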
In another embodiment, a target data pattern within the data sequence and at least one search constraint are defined using a GUI. A pattern recognition algorithm is applied to the data sequence to find matching data patterns that satisfy the search constraint with respect to the target data pattern. These matching data patterns are then presented to the user.
Still another embodiment is directed to a method for finding, within a data sequence, matching data patterns that satisfy a similarity criterion with respect to a target data pattern. A graphical representation of at least a portion of the data sequence is displayed using a graphical user interface. The target data pattern within the data sequence and the similarity criterion are then defined using the graphical user interface. Next, a plurality of temporally warped versions of at least a portion of the target data pattern are prepared. At least one of these temporally warped versions is compared to at least a portion of the data sequence to determine a plurality of candidate data patterns within the data sequence that satisfy a match threshold with respect to the compared at least one temporally warped version. Candidate data patterns that violate amplitude limits are rejected.
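The warping-and-rejection steps of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: temporal warping is approximated by linear resampling of the target to a few candidate durations, similarity is again Pearson correlation, and amplitude is taken as the peak-to-peak range of a window. The names `resample`, `correlation`, `warped_search`, `min_range`, and `max_range` are hypothetical.

```python
from math import sqrt


def resample(pattern, length):
    """Linearly interpolate `pattern` (at least 2 points) onto `length` points."""
    if length == 1:
        return [pattern[0]]
    step = (len(pattern) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = min(int(pos), len(pattern) - 2)
        frac = pos - lo
        out.append(pattern[lo] * (1 - frac) + pattern[lo + 1] * frac)
    return out


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / norm


def warped_search(sequence, target, widths, threshold, min_range, max_range):
    """Return (offset, width, score) for candidates that pass both checks.

    Each entry of `widths` yields one temporally warped copy of the target
    (an assumed, simplified form of warping). A window is a candidate when its
    correlation with the warped copy meets `threshold`; candidates whose
    peak-to-peak amplitude falls outside [min_range, max_range] are rejected.
    """
    results = []
    for width in widths:
        warped = resample(target, width)
        for offset in range(len(sequence) - width + 1):
            window = sequence[offset:offset + width]
            score = correlation(warped, window)
            if score < threshold:
                continue  # fails the match threshold
            amplitude = max(window) - min(window)
            if min_range <= amplitude <= max_range:
                results.append((offset, width, score))
    return results


target = [0, 1, 2, 1, 0]
series = ([0, 0, 0] + [0, 5, 10, 5, 0] + [0, 0, 0]
          + [0, 0.5, 1, 1.5, 2, 1.5, 1, 0.5, 0] + [0, 0])
kept = warped_search(series, target, widths=[5, 9], threshold=0.99,
                     min_range=1.5, max_range=3)
unfiltered = warped_search(series, target, widths=[5, 9], threshold=0.99,
                           min_range=0, max_range=100)
```

In this example the tall spike at offset 3 satisfies the match threshold but is rejected by the amplitude limits, while the slower, stretched copy at offset 11 is found only through the warped version of length 9.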
Other embodiments are directed to computer-readable media and computer arrangements for performing these methods.


REFERENCES:
patent: 4975865 (1990-12-01), Carrette et al.
patent: 5353355 (1994-10-01), Takagi et al.
patent: 5787425 (1998-07-01), Bigus
patent: 5799300 (1998-08-01), Agrawal et al.
patent: 5799301 (1998-08-01), Castelli et al.
patent: 5799310 (1998-08-01), Anderson et al.
patent: 5809499 (1998-09-01), Wong et al.
patent: 5832182 (1998-11-01), Zhang et al.
patent: 5832183 (1998-11-01), Shinohara et al.
patent: 5832456 (1998-11-01), Fox et al.
patent: 5930789 (1999-07-01), Agrawal et al.
patent: 5940825 (1999-08-01), Castelli et al.
patent: 6182069 (2001-01-01), Niblack et al.
patent: 6226388 (2001-05-01), Qian et al.
patent: 6275229 (2001-08-01), Weiner et al.
patent: 6308172 (2001-10-01), Agrawal et al.
patent: 0742525 (1996-11-01), None
Zaïane, et al. “Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs”, IEEE, pp. 1-11, 1998.*
Faloutsos, et al. “A Fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets”, IEEE, pp. 163-174, 1995.*
Faloutsos, et al. “Fast subsequence matching in time-series databases”, IEEE, pp. 419-429, 1994.*
Patent Abstracts of Japan, vol. 1998, No. 14, Dec. 31, 1998 & JP 10 240716 A (NEC Corp). Sep. 11, 1998 Abstract.
