Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-07-15
2001-10-23
Shah, Sanjiv (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
Reexamination Certificate
active
06308169
ABSTRACT:
FIELD OF INVENTION
The present invention relates to a method of storing elements in a database and of finding such stored elements. A reference to a storage space intended for the storage of a data element in the database is calculated by means of a mathematical function, such as a hash function, wherein the function input data is comprised of an external key belonging to the element, and wherein the result obtained with the mathematical function points to an internal position in the database for said element. The invention is also based on dividing the result of the function into different parts.
BACKGROUND OF THE INVENTION
It has long been known to store data in a database with the aid of different data structures. These structures provide different advantages with respect, for instance, to used storage space and the possibility of carrying out rapid searches.
One known structure is based on a so-called hash function which enables a random distribution of elements in a table in a database to be calculated.
An element can include an internal address in a database. The element is coupled to an external key, such as a telephone number, for instance. When entering a new telephone number in a table, a calculation is made as to in which element the internal address of the telephone number shall be placed with the aid of a hash function.
The purpose of the hash function is to distribute the addresses randomly in the table so that these addresses will be distributed uniformly therein. The same hash function is then used to find the correct element when searching for a specific address.
When using a hash function on a key, there is obtained a hash value that is comprised of a data-word which normally includes 32 bits.
One type of algorithm used in this respect is linear hashing (LH), where the data-word obtained from the function includes so-called LH-bits, among other things. Put simply, the LH-bits obtained can point to a page in the database, or to a position in a register where said position points to a page, and the sought element is located somewhere on this page.
It is known to perform hashing in two stages, in order to make the search for an element more effective. The data-word obtained is divided into two groups of bits in these two stages.
The first group of bits point, for instance, to an index in a vector and this index can point to a new vector. The second group of bits points to an index in the new vector, wherein this new index points, for instance, to a first element in a linked list of elements. The sought element is then found somewhere in this linked list of elements.
Such linked lists may have mutually different lengths. An internal address lies in each element. The various links are placed physically at totally different locations within a memory, but are linked together by means of pointers.
Thus, when searching for a certain element, there is obtained the header of a linked list of elements. If the internal address sought for is not found in the first element, the search is continued in the next element in the linked list. Since the next element in the linked list is positioned physically somewhere else in the memory, further access to the memory is normally required in order to pick-up this next element. This first pick-up of the wrong element is a so-called cache-miss.
Thus, a person searching for a certain element may suffer several cache-misses, depending on the length of the lists, which is a very expensive process. In next generation processors, a cache-miss will be corresponded time-wise by hundreds of instructions.
It is therefore desirable to minimise the number of cache-misses. An ideal situation is one in which a sought element can be found when the memory is first accessed.
It is also known, according to the aforegoing, to allow a hash value to point to a position in a register or to a vector where a value in this position points further in the database, so as to enable dynamic changes to be made in a database, thereby enabling an element to be moved between, e.g., different pages, with subsequent changing of the value in the register for this element, without needing to assign a new hash value to the element.
It is also known that a distributed database will include a plurality of nodes that form a common system in a common database. The information in a distributed database is distributed in the different nodes that are included.
A node can include a primary replica of one or more parts of the database and a secondary replica of one or more other parts of the database. A primary replica with an associated secondary replica is normally referred to as a fragment.
The following publications are considered to provide a comprehensive picture of known technology in this particular field:
“Linear Hashing: A New Tool for File and Tables Addressing”, W. Litwin, reprinted from VLDB-80 in Readings in Database Systems, 2nd ed., M. Stonebraker (ed.), Morgan Kaufman, 1994.
“LH*—Linear Hashing for Distributed Files”, W. Litwin, M.-A. Neimat, D. A. Schneider, ACM SIGMOD Int'l. Conf. on Management of Data, 1993.
“High-Availability LH* Schemes with Mirroring”, W. Litwin, M. A. Neimat.
“LH*LH: A Scalable High Performance Data Structure for Switched Microcomputers”, J. S. Karlsson, W. Litwin, T. Risch, 5th Int'l. Conf. on Extending Database Technology (EDBT '96).
“Dynamic Hash Tables”, P.-A. Larsson, Communications of the ACM, April, 1988.
“Main-Memory Linear Hashing—Some Enhancements of Larson's Algorithm”, M. Pettersson, University of Linköping, Sweden, March, 1993.
SUMMARY OF THE INVENTION
TECHNICAL PROBLEMS
When considering the present state of the art as described above, it will be seen that a technical problem resides in the provision of a method whereby an element can be stored, searched for, and re-found without, in the majority of cases, resulting in a cache-miss or, in certain cases, in solely one cache-miss.
Another technical problem is one of enabling the resolution to be enhanced when searching for an element, so that a search will provide the possibility of reading-in the sought element on the first occasion of reading into the cache-memory, or in exceptional cases on the second read-in occasion.
Another technical problem is one of providing a method whereby a data structure can be formed that makes such resolution and searching possible.
Another technical problem is one of enabling all replicas of an element belonging to a fragment, such as primary and secondary replicas, to be found when searching the element.
Another technical problem is one of devising a storage structure format, here designated container, that is able to store a plurality of elements and that can also be read into a cache memory in its entirety.
A related technical problem is then one of enabling the storage of an element which, according to the mathematical function, shall be stored in a container that is already full.
Yet another technical problem resides in enabling the storage of an element which, according to the mathematical function, shall be stored on a page that is already full.
Still another technical problem is one of providing simple and effective handling of available page space for the storage of containers that grow or shrink in size as a result of adding further elements thereto or removing existing elements therefrom.
Another technical problem is one of enabling the effective use of an available memory where many containers does not fill the allocated memory space.
In this regard, a technical problem is one of enabling this use when respective containers increase or decrease in size with time.
Thus, another technical problem is one of enabling the double-use of a space that is adapted for a container having a specific largest permitted size, i.e. a maximum container.
Yet another technical problem is one of providing a container format that provides necessary information as to whether or not all elements that shall be present within the container according to the mathematical function are in fact actually present, and information as to where el
Dabaghi Ataullah
Ronstrom Ulf Mikael
Burns Doane Swecker & Mathis L.L.P.
Shah Sanjiv
Telefonaktiebolaget LM Ericsson
LandOfFree
Method of storing elements in a database does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method of storing elements in a database, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method of storing elements in a database will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2616091