Database system and method of organizing an n-dimensional...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

[ 0.00 ] – not rated yet Voters 0 Comments 0

Details Database system and method of organizing an n-dimensional... Database system and method of organizing an n-dimensional...

: 2002-01-10
: 2003-01-21
: Coby, Frantz (Department: 2172)
: Data processing: database and file management or data structures
: Database design
: Data structure types

: C707S793000, C707S793000, C707S793000
: Reexamination Certificate
: active
: 06510435
: ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a database system and a method of organizing a data set existing in an n-dimensional cube with n>1.
BACKGROUND OF THE INVENTION
The so-called B tree (or also B* tree or prefix B tree) is known as the data structure to organize extensive one-dimensional volumes of data in mass memories such as magnetic disk memories. The data structure of the B tree over that of a simple search tree has the advantage that lower search times are required for data access. The resulting search time to locate certain data involves at least log
2
(n) steps with a simple search tree with n nodes. With a search tree with 1,000,000 nodes, log
2
(1,000,000)≈20 disk access operations must therefore be expected. Assuming a mean access figure of 0.1 sec., the search of one node will require 2 secs. This value is too large in practice. With the data structure of the B tree, the number of disk access operations is reduced by transferring not one single node, but a whole segment of the magnetic disk allocated to a node to the main memory and searching within this segment. If, for example, the B tree is divided into areas of seven nodes each and if such an area is transferred into the main memory with each disk access operation, the number of disk access operations for the search of a node is reduced from a maximum of 6 to a maximum of 2. With 1,000,000 nodes, only log
8
(1,000,000)=7 access operations are therefore required. In practice, the search tree is normally divided into partial areas with a size of 2
8
−1 to 2
10
−1 nodes. With an area size of 255 nodes, log
6
(1 m)>2.5 disk access operations are required for the search of a node in a tree with 1,000,000 nodes so that the search for a given value takes only around 0.3 secs. The search time within a partial area with 255 nodes located in the main memory can be neglected in comparison with the disk access operation. The B tree is a vertically balanced tree in which all leaves are located at the same level.
The so-called dd trees are known from K. Mehlhorn:
Multidimensional searching and computational geometry
, Springer, Heidelberg 1984, to organize a multidimensional data set. With the dd trees, three types of queries can be performed in principle, namely point queries, area queries and queries where some intervals are given as (−infinite, +infinite). However, the data structure of a dd tree only allows fast access for point queries, as then only one path in the tree needs to be searched. With the other queries, it is possible that the whole tree has to be searched. Moreover, dd trees are static, i.e. the whole object volume to be organized must already be known before the dd tree can be set up. However, in most applications in practice, the object volume is dynamic, i.e. it must be possible for objects to be inserted into or deleted from the tree in any order and at any time without the whole tree having to be set up again from the start. Furthermore, dd trees are only suitable for main memory applications, but not for peripheral memories which are needed to store very large volumes of data.
In “
The Grid File
” by Nievergelt et al, ACM TODS, Vol 9, No. 1, March 1984, so-called grid files are described to organize multidimensional data where queries for points and areas are performed on the basis of an index structure, the so-called grid.
Although this data organization allows a fast search for point and area queries, it is a static procedure so that the total index structure has to be completely reorganized regularly when data objects are inserted or deleted dynamically. This method is thus not suitable for many applications, in particular not for online applications.
So-called R trees are known as the data structure to organize multidimensional data from A. Guttmann:
A dynamic index structure for spatial searching, Proceedings ACM SIGMOD, Intl. Conference on Management of Data
, 1984, pages 47-57. These trees, which are used mainly for so-called geo-databases, are vertically balanced like B trees and also allow the dynamic insertion and deletion of objects. However, no fast access times are guaranteed for the response to queries, because under certain circumstances any number of paths in the corresponding tree, in extreme cases even the whole tree, have to be searched to answer a query. As a result, these R trees are not suitable for most online applications.
From Y. Nakamura et al:
Data structures for multi-layer n-dimensional data using hierarchical structure
, 10
th International Conference on pattern recognition, Volume
2, Jun. 16, 1990, IEEE Computer Society Press, New Jersey, USA, pages 97-102, a splitting method is known for a multidimensional rectangular space. In the known method, a given multidimensional rectangular space is split into two sub-spaces as soon as the number of data points in the space exceed the capacity of one data page. The splitting of the starting space is performed by cutting out a partial rectangular. The spatial structures newly created by this splitting, namely a cut-out rectangular and the rest of the starting space are structured as layers in a BD tree with the tree structure being created depending on the sequence of the cutting out of the individual partial spaces in the event of multiple cut-out partial spaces. The BD tree structure created in such a way represents a binary tree in which it is determined at each branch node which rectangle will be cut out as the new BD partial space. This successive cutting out has the consequence that the BD tree grows downwards so that in the insertion, deletion and searching of data points, i.e. data objects, in the total space a path has to be passed through from the tree root to a leaf (branch end). Here, it is necessary to check at every intermediate node whether a point being searched for is located in the associated cut-out partial space or in the complementary rest space. The search effort can thus grow proportionally with the size of the data set, which leads to a poor efficiency behavior with large and very large data sets.
The most widespread method in practice today to organize a multidimensional data set is based on the original one-dimensional B trees, with one B tree in each case being used for each dimension of the starting data set so that area queries in an n-dimensional data set are supported by n B trees. In an area query, all objects are thus obtained from the peripheral memory for each dimension whose values are located in the interval specified in the query for this dimension. These data objects form the hit number in the corresponding dimension. To determine the desired answer number, a mean number of the hit numbers of all dimensions must be computed, which will normally first require the sorting of these numbers. When a data object is inserted or deleted, n B trees must also be searched and modified correspondingly.
OBJECTS AND SUMMARY OF THE INVENTION
On this basis, the object of the invention is to provide a database system and a method of organizing an n-dimensional data set which, thanks to improved access times, is, in particular, suited for use in online applications and which allows a dynamic insertion and deletion of data objects.
In accordance with the invention a database system and a method to organize an n-dimensional data set is proposed to solve this object. The database system in accordance with the invention comprises a computing apparatus, a main memory and a memory device, which is in particular a peripheral memory device. The basic idea of the invention is to place a multidimensional data set to be organized in a multidimensional cube and to perform a repeated iterative division of the multidimensional cube in all dimensions into sub-cubes to index and store this data set by means of the computing apparatus. The division is repeated so often here until successive sub-cubes can be combined into regions which each contain a set of data objects which can be stored on one of the memory pages of given storage capacity of the in particular p

Affiliated with

Bayer Rudolf

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Also associated with

Coby Frantz

Examiner

[ 0.00 ] – not rated yet Voters 0 Comments 0

Shlesinger & Arkwright & Garvey LLP

Law Firm

[ 0.00 ] – not rated yet Voters 0 Comments 0

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Database system and method of organizing an n-dimensional... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Database system and method of organizing an n-dimensional..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Database system and method of organizing an n-dimensional... will most certainly appreciate the feedback.

Rate now

Comments { 0 }

Profile ID: LFUS-PAI-O-3046981

All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.

Canada

Charities
Companies
MP Candidates
Patents
Employee Salary Disclosure

World

Places of the World
Scientific Papers

United States

Banks
Companies
Counties
Patents
Employee Salary Disclosure