Deterministic search algorithm

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06625612

ABSTRACT:

FIELD AND BACKGROUND OF THE INVENTION
The present invention relates to computer-based information storage and retrieval systems, to the use of hashing techniques in such systems, and more particularly, to the use of hashing techniques in networking applications.
Information or data stored in a computer-controlled storage mechanism can be retrieved by searching for a particular key in the stored records. The stored record with a key matching the search key is then retrieved. Such searching techniques require repeated accesses or probes into the storage mechanism to perform key comparisons. In large storage and retrieval systems, such searching, even if augmented by efficient search algorithms such as a binary search, often requires an excessive amount of time.
Another well-known and much faster method for storing and retrieving information from computer store involves the use of so-called “hashing” techniques. These techniques are also sometimes called scatter-storage or key-transformation techniques. In a system using hashing, the key is operated upon by an operator to produce a storage address in the storage space. The operator is called a hashing function, and the storage space is called a hash table. The storage address is then used to access the desired storage location directly with fewer storage accesses or probes than sequential or binary searches. Hashing techniques are described in the classic text by D. Knuth entitled The Art of Computer Programming, Volume 3, Sorting and Searching, pp. 506-549, Addison-Wesley, Reading, Mass., 1973, and more recently, in the contemporary classic text of R. Sedgewick entitled “Algorithms in C++”, pp. 231-243, Addison-Wesley, Reading, Mass., 1992.
Hashing functions are designed to translate the universe of keys into addresses uniformly distributed throughout the hash table. Typical hashing operations include truncation, folding, transposition and modulo arithmetic. A disadvantage of hashing techniques is that more than one key can translate into the same storage address, causing “collisions” in storage or retrieval operations. Some form of collision-resolution strategy must therefore be provided. For example, the simple strategy of searching forward from the initial storage address to the first empty storage location will resolve the collision. This technique is called linear probing. If the hash table is considered to be circular so that addresses beyond the end of the table map back to the beginning of the table, then the linear probing is done with “open addressing,” i.e., with the entire hash table as overflow spare in the event that a collision occurs.
An alternative to linear probing is a technique commonly referred to as “double hashing” or “multiple hashing”. When more than one key translates into the same storage address using the first hash function, the collision can be resolved by selecting a different hash function and “rehashing” those keys (that had returned identical results using the first hash function) in order to differentiate between them. Of course, there is a finite probability that more than one key will translate into the same storage address using the second hash function, in which case the new collision can be resolved by selecting a (different) third hash function and “rehashing” those keys once again in order to differentiate between them. This process can be repeated until all collisions have been resolved. According to Sedgewick, double hashing uses fewer probes, on the average, than linear probing. Sedgewick cites several examples of improved hashing methods, but cautions against
‘premature use of advanced methods except by experts with serious searching applications, because separate chaining and double hashing are simple, efficient, and quite acceptable for most applications.’
One area in which multiple hashing is less effective or even problematic is network applications. Although the average speed is an important parameter in such applications, a more important and often overriding requirement is a highly predictable, deterministic operation. For example, voice and video recordings can be transmitted as data via the Internet using a digital data channel. The Internet network utilizes routers to direct the data from the sending address to the destination address. Routers using multiple hashing routines to locate the destination address and deliver these data packets will have a characteristically high variance in the time required to locate the address. In most cases, typically about 70%-80% of the time, the multiple hashing technique will locate the destination address in the first memory access. However, in about 20%-30% of the time, a second memory access is required. Often, a third, fourth or fifth memory access is required in order to locate the address. Moreover, in the case of voice transmission, a high variance of this kind results in a broken up, non-uniform sound message. These disturbances are often referred to as “jitters”.
There is therefore a recognized need for, and it would be highly advantageous to have, a high-performance method for directly referencing records in a table that is more deterministic and has appreciably less variance than existing multiple hashing techniques, a method that both provides fast access to the memory and is efficient with regard to memory space.
SUMMARY OF THE INVENTION
The present invention is a search method for storing and retrieving information records that is fast, efficient with regard to memory usage, and deterministic.
According to one aspect of the present invention there is provided a method for storing a key using a hash table, the method comprising the steps of: (a) hashing the key using a first hash function, thereby transforming the key into a table address in the hash table; (b) hashing the key using a second hash function, distinct from the first hash function, thereby transforming the key into a key signature; and (c) entering the key signature in a signature page located at the table address.
According to further features in preferred embodiments of the invention described below, the method for storing and retrieving information records further comprises the step of: (d) entering a pointer in the signature page, the pointer corresponding to the key signature and pointing to the key.
According to another aspect of the present invention there is provided a method for retrieving a tabulated key using a search key, comprising the steps of: (a) providing a plurality of signature pages, each of the plurality of signature pages having a table address and space for at least one key signature, each of at least one key signature having (i) a corresponding tabulated key, and (ii) an associated key pointer pointing to the corresponding tabulated key; (b) hashing the search key using a first hash function, thereby transforming the search key into a table address; (c) hashing the search key using a second hash function, distinct from the first hash function, thereby transforming the search key into a search key signature, and (d) searching the plurality of signature pages for a signature page having the table address of step (b) for at least one match between the search key signature and at least one key signature.
According to further features in preferred embodiments of the invention described below, the method for storing and retrieving information records further comprises the step of: (e) if the search for the at least one match succeeds: searching for a match between the search key and the corresponding tabulated key.
According to another aspect of the present invention there is provided a system for storing and retrieving data using a search key, the system comprising: (a) a data storage module having a plurality of tabulated signature pages, each of tile tabulated signature pages having a table address and at least one key signature having an associated pointer, the pointer pointing to a corresponding key in the data storage module, and (b) a processor that is operative to perform operations including: (i) hashing the search key using

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Deterministic search algorithm does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Deterministic search algorithm, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Deterministic search algorithm will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3068867

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.