Apparatus for data decomposition and method and storage...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C379S010030, C379S030000, C382S181000, C382S190000, C382S209000, C709S243000

active

06341283

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a technology for extracting a collection of related data from randomly accumulated data.
2. Description of the Related Art
Extensive and sundry information is today accumulated on computers. Extensive information alone, however, is incomprehensible to human beings. For this reason, data mining and multi-variable analysis receive much attention. The principal purpose of these types of technologies is two-fold:
(1) to extract structures from information and use these structures in, illustratively, estimates; and
(2) to compress data to a size amenable to human comprehension, and to make that data visual.
As data accumulation technologies have become less expensive, the growth of blindly accumulated data has brought about qualitative changes in the data. In the past, data were sampled for a particular purpose, so it was possible to estimate in advance the small number of causal relationships existing in them. In the new types of data, however, a plurality of unanticipated causal relationships is scattered throughout.
Where an archetypal multi-variable analysis method, such as factor analysis, is simply applied to data in which multiple causal relationships are scattered, it is difficult to obtain valid results. It is thus necessary to divide the problem in advance: human beings use their knowledge of the area in which the data were accumulated to forecast the types of relationships subsisting in the data. This type of task is quite costly, and it is therefore desirable to delegate it to computers to the extent possible.
To date, research on technologies for extracting or selecting features has been undertaken in the following fields: multi-variable analysis, pattern recognition, neural networks, and case-based reasoning. The term “feature”, as used herein, is defined as follows. By way of illustration, assume that measurements are taken with respect to a plurality of persons. A height measurement or a weight measurement or, alternatively, age or gender information, for one of these persons is a quantity that indicates a feature of that person. Although characterized as a “quantity”, gender, for example, takes only two values, namely, “male” and “female”. It may therefore seem awkward to use the term “quantity” here. However, because gender is one factor used to characterize a person, the term is used even in this instance. Furthermore, one record, which is the result of measuring the feature quantities of one person, corresponds to the act of taking measurements with respect to that person. Accordingly, one such record in, illustratively, a database, is referred to as one event. Similarly, where a system operation is measured in a time series, the measurements performed each hour can be called events. In this case, the quantity characterizing the system operation acquired in one measurement is a feature quantity.
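The terminology above can be made concrete with a small sketch. It is illustrative only: the record values, the field names (height_cm, weight_kg, age, gender), and the helper select are assumptions introduced here, not part of the patent text. Each record is one event, each field is one feature quantity, and a combination of events together with a combination of feature quantities picks out partial data.

```python
# Hypothetical records: each dict is one event (one measured person),
# and each key is one feature quantity. Values are made up for illustration.
events = [
    {"height_cm": 172.0, "weight_kg": 68.5, "age": 34, "gender": "male"},
    {"height_cm": 158.0, "weight_kg": 51.0, "age": 29, "gender": "female"},
    {"height_cm": 181.5, "weight_kg": 80.2, "age": 45, "gender": "male"},
]

# The full set of feature quantities recorded for every event.
feature_names = sorted(events[0].keys())


def select(events, event_ids, features):
    """Return the partial data for one combination of events and features."""
    return [{f: events[i][f] for f in features} for i in event_ids]


# Partial data: events 0 and 2, restricted to two feature quantities.
partial = select(events, [0, 2], ["height_cm", "gender"])
```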
To date, a variety of technologies have been proposed for extracting relevance among data from extensive data. However, most of these are technologies that extract only feature quantities having relevance among data, or that extract events possessed of a specific relevance.
As discussed above, however, recent years have seen a trend toward the blind accumulation of data. It is not always the case that the accumulated data are relevant to all the varieties of the feature quantities obtained. Even where the accumulation is confined to specific varieties of feature quantities, there is no guarantee that the accumulated data will bear relevance to these feature quantities. Accordingly, as regards data that are accumulated blindly, it is necessary to extract a combination of specific events having relevance with respect to a combination of specific feature quantities.
SUMMARY OF THE INVENTION
An objective of the present invention is to provide a technology that extracts mutually relevant data from among data in which a prescribed variety of feature quantities is correlated to a plurality of events, by combining feature quantities and events.
The data decomposition apparatus contemplated by the present invention is a data decomposition apparatus that extracts partial data from whole data, by selecting, with respect to each record and from data which has cataloged a plurality of attributes possessed by each record, the combinations of each event corresponding to each record and the combinations of feature quantities, these quantities comprising the attributes. The data decomposition apparatus according to the present invention further comprises: (a) means for figuring, with respect to combinations of specific feature quantities and combinations of specific events, an evaluation value that becomes the standard against which the relevance among data is evaluated; and (b) means for extracting a plurality of partial data for which the evaluation value is the maximum value with respect to changes in both the feature quantity combinations and the event combinations.
The data decomposition method contemplated by the present invention is a data decomposition method that extracts partial data from whole data, by selecting, with respect to each record and from data cataloging a plurality of attributes possessed by each record, the combinations of each event corresponding to each record and the combinations of feature quantities, these quantities comprising the attributes. The data decomposition method according to the present invention further comprises the steps of: (a) figuring, with respect to combinations of specific feature quantities and combinations of specific events, an evaluation value that becomes the standard against which the relevance among data is evaluated; and (b) extracting a plurality of partial data for which the evaluation value is the maximum value with respect to changes in both the feature quantity combinations and the event combinations.
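Steps (a) and (b) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: this text does not specify how the evaluation value is computed or how the maximum is searched for. The sketch substitutes a hypothetical evaluation value (mean absolute Pearson correlation between the selected feature quantities over the selected events, scaled by the size of the selection) and a simple hill climb that toggles one event or one feature quantity at a time. The names pearson, evaluation, local_maximum, and data are all introduced here for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def evaluation(data, rows, cols):
    """Step (a), with an ASSUMED evaluation value (not taken from the patent):
    mean |correlation| over all pairs of selected features, computed on the
    selected events only, multiplied by the size of the selection."""
    if len(rows) < 3 or len(cols) < 2:
        return 0.0
    rs = sorted(rows)
    pairs = list(combinations(sorted(cols), 2))
    total = sum(
        abs(pearson([data[r][a] for r in rs], [data[r][b] for r in rs]))
        for a, b in pairs
    )
    return (total / len(pairs)) * len(rows) * len(cols)


def local_maximum(data, rows, cols):
    """Step (b), sketched as a hill climb: toggle one event or one feature at a
    time and keep the change whenever the evaluation value strictly improves.
    Stops when no single toggle improves the value (a local maximum)."""
    rows, cols = set(rows), set(cols)
    best = evaluation(data, rows, cols)
    candidates = [("row", r) for r in range(len(data))]
    candidates += [("col", c) for c in range(len(data[0]))]
    improved = True
    while improved:
        improved = False
        for kind, idx in candidates:
            new_rows, new_cols = set(rows), set(cols)
            target = new_rows if kind == "row" else new_cols
            target.symmetric_difference_update({idx})  # add if absent, drop if present
            score = evaluation(data, new_rows, new_cols)
            if score > best + 1e-12:
                rows, cols, best = new_rows, new_cols, score
                improved = True
    return rows, cols, best


# Made-up whole data: 6 events (rows) by 3 feature quantities (columns).
data = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 2.0],
    [5.0, 1.0, 3.0],
    [6.0, 7.0, 9.0],
]
start_rows, start_cols = {0, 1, 2}, {0, 1}
rows, cols, score = local_maximum(data, start_rows, start_cols)
```

Running local_maximum from several different starting selections and keeping the distinct results is one way to obtain a plurality of partial data. The size scaling in evaluation is a design choice of this sketch: without some reward for size, very small selections would trivially score well.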
In the present invention, mutually relevant records (events), and the feature quantities associated with each such event, are selected and extracted from the blindly accumulated data, that is, from the set of all events and the set of all feature quantities. Accordingly, it is possible to extract mutually relevant data easily, without human intervention for the purpose of sifting through the data.
This technology can be used effectively as a preprocessing step for processes that find interrelations within a fixed range of the gathered data, such as, illustratively, multi-variable analysis, data mining, and pattern recognition.


