Use of the UNPIVOT relational operator in the efficient gatherin

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/3, 707/4, 707/6, 707/102, G06F 17/30

Patent

active

060443669

ABSTRACT:
The invention concerns a method and apparatus for generating a tabulation of counts of occurrences of value combinations of a set of attributes over a relation consisting of a set of database records. The gathered counts of attribute occurrences, or correlation counts (also referred to as sufficient statistics), are most preferably used in building a classification or density estimation model from the database records that can be used to predict some attribute values based on other attribute values. A new SQL operator, designated the `UNPIVOT` operator, scans the database records and reorganizes the data of each record to form UNPIVOTED data records that include the combinations of attribute name, attribute value, and the values of one or more selected class attributes. The UNPIVOTED table can be used to produce the desired sufficient statistics in one scan of the data using standard database engines. Materializing the UNPIVOTED table would add a large scan cost overhead; by combining the UNPIVOT operator with the SQL `select` and `group by` operators, however, the UNPIVOTED table can be counted without being materialized. The result is a guaranteed one-pass algorithm that does not incur the added scan cost factor. The savings in scan cost can extend to several orders of magnitude compared to other methodologies for getting the counts supported by current database engines. The sufficient statistics so gathered can be used to drive a variety of data mining algorithms.
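
The counting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation or any database engine's operator: the function names `unpivot` and `sufficient_statistics`, the sample attribute names, and the sample rows are all hypothetical, chosen only for illustration. A generator stands in for the unmaterialized UNPIVOTED table, and a counter plays the role of `select ... group by` with a count.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Iterator, Sequence, Tuple

Record = Dict[str, Hashable]
UnpivotedRow = Tuple[str, Hashable, Hashable]


def unpivot(records: Iterable[Record],
            attributes: Sequence[str],
            class_attr: str) -> Iterator[UnpivotedRow]:
    """Lazily yield (attribute name, attribute value, class value) rows.

    Hypothetical helper: each input record produces one row per attribute,
    and no intermediate table is ever stored.
    """
    for record in records:
        for name in attributes:
            yield name, record[name], record[class_attr]


def sufficient_statistics(records: Iterable[Record],
                          attributes: Sequence[str],
                          class_attr: str) -> Counter:
    """Count each (name, value, class) combination in one pass over records."""
    return Counter(unpivot(records, attributes, class_attr))


# Illustrative data only; not taken from the patent.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]

# Passing an iterator shows that the records are read exactly once.
counts = sufficient_statistics(iter(rows), ["outlook", "windy"], "play")

for (name, value, cls), n in sorted(counts.items()):
    print(name, value, cls, n)
```

Because the unpivoted rows are consumed as they are produced, memory use depends on the number of distinct (name, value, class) combinations rather than on the number of records, which mirrors the abstract's point about avoiding materialization.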

REFERENCES:
patent: 5201047 (1993-04-01), Maki et al.
patent: 5511190 (1996-04-01), Sharma et al.
patent: 5710915 (1998-01-01), McElhiney
patent: 5713020 (1998-01-01), Reiter et al.
patent: 5748905 (1998-05-01), Hauser et al.
patent: 5819282 (1998-10-01), Hooper et al.
patent: 5878426 (1999-03-01), Plasek et al.
patent: 5966139 (1999-10-01), Anupam et al.
Automating the Analysis and Cataloging of Sky Surveys, Advances in Knowledge Discovery and Data Mining, Chapter 19, pp. 471-493, Fayyad, Usama M., Djorgovski, S. George and Weir, Nicholas, AAAI Press/The MIT Press, Menlo Park California; Cambridge, MA; London, England (1996).
SLIQ: A Fast Scalable Classifier for Data Mining, EDBT Conference, Mehta, Manish, Agrawal, Rakesh and Rissanen, Jorma (1996).
From Digitized Images to Online Catalogs. AI Magazine, pp. 51-66, Fayyad, Usama M., Djorgovski, S.G. and Weir, Nicholas, (Summer 1996).
Data Mining and Knowledge Discovery: Making Sense Out of Data, Fayyad, Usama M. Submitted to IEEE Expert, for publication in the special issue on Data Mining (Aug., 1996).
Panning for Data Gold, New Scientist. No. 2031, Weekly E1-80, pp. 31-33 (May 25, 1996).
