Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-03-19
2001-02-06
Homere, Jean R. (Department: 2777)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C707S793000, C707S793000
Reexamination Certificate
active
06185552
ABSTRACT:
BACKGROUND OF THE INVENTION
Data conversion tables frequently take the form of large tables of data having ordered index or “key” entries, each associated with a respective data entry or entries. Each key entry is for storing a key value, and each data entry is for storing a data value. Such a table is illustrated in
FIG. 1
, wherein the table is comprised of 1 to n key entries, each having an associated data entry. Thus, key entry
5
has a key value of A
458
and an associated data entry has the data value of
1005
(key and data values being given in hexadecimal format unless otherwise noted).
A search value is used to return a desired data value from the table. The search value is compared to the key values in a specified order, and the desired data value is retrieved when a key value is located which correlates to the search value.
Tables such as illustrated in
FIG. 1
are typically provided from a single physical memory element to simplify data retrieval and table maintenance. In this case, the key entries and the data entries are said to be “unified.” Operations affecting the table such as insertions and deletions are referred to “atomic operations” since both a key value and a data value or values are affected at the same time. Such tables have found wide use, such as for data unit address conversion in a variety of settings and data management in database applications.
Sequential processing of a unified data structure is possible, where a search begins at an upper or lower bound of the table and progresses until the desired key value is located or until the opposite bound is reached. The associated data value(s) is then retrieved if a match occurs. This approach is commonly referred to as a linear search.
Linear searching is impractical for large tables (i.e. tables having a relatively large number of entries) due to the length of time necessary to compare or “probe” a search value with the key value at each location. It is possible to suspend linear searching of an ordered table should the key values being compared go beyond the search value, indicating that the searched for value is not in the list. However, given an even chance of successfully finding any one of the key values, on average, half of the key values must be analyzed in performing a linear search.
To improve upon the performance of a linear search, ordered lists are commonly probed using a technique referred to as a binary search, a dichotomizing search in which the table to be searched is divided at each iteration into two equal, or nearly equal, parts. For instance, in an ordered list of twelve entries, a search for a particular search value begins at the sixth key entry. If the key value associated with the sixth key entry of the table is larger than the search value, the lower half of the table contains the searched for key value, if the key value is in the table at all. Otherwise, the key value is in the upper half of the table, if at all. Obviously, the search terminates if the value associated with the sixth key entry equals the search value. The divide by two and compare approach is repeated until the search value is matched to a key value, or it is established that there is no key value equivalent to the search value and the search is aborted.
Unified data structures such as in
FIG. 1
are rarely static. Insertions in an ordered table can be performed by overwriting an existing set of key and data values with new values, or by executing a “shuffle up” operation. Deletions are performed by overwriting an existing set of key and data values with a new, dummy set of key and data values, or by executing a “shuffle down” operation. “Up” and “down” are relative directions based upon the ordering of the table.
The problem with inserting by overwriting lies in that it may be desired to keep the overwritten set of key and data values in the table. Thus, the overwritten pair must be rewritten in the next, higher set of entries. The process is repeated until a key/data value set is identified which does not need to be maintained in the table.
Alternatively, the shuffle up approach illustrated in
FIG. 3
is employed for key and data value insertion. Here, it is desired to insert key value A
457
and associated data value B
321
into the first illustrated data structure. Key value FFFF may be regarded as a key value which is undesired, having any associated data value xxxx. In the second illustration, unified key and data values BCA
2
/
0105
are copied into the next higher entries, formerly retaining “FFFF/xxxx”. In the third illustration, unified key and data values A
458
/
1005
are copied into the entries formerly occupied by BCA
2
/
0105
. Finally, the new values A
457
/B
321
are written into the entries formerly for A
458
/
1005
. The proper insertion point is assumed to have been identified by a binary search such as described previously.
A shuffle down operation for the purpose of removing a key/data value pair from the unified data structure is illustrated in FIG.
4
. Here, it is desired to remove key value A
457
and associated data value B
321
. In essentially the reverse of the shuffle up process described with regard to
FIG. 3
, key/data values A
458
/
1005
are overwritten into the entries formerly occupied by the values to be removed. BCA
2
/
0105
take the place of A
458
/
1005
, and a new key/data value pair, FFFF/xxxx, is inserted in the place of BCA
2
/
0105
. FFFF/xxxx may be dummy values, or new values which were being held for an opportunity to be inserted into the table. The shuffle up and shuffle down operations maintain the integrity of the data in the table so that other searches can be performed during insert and delete operations.
The number of entries in the key table can be indicated by a “limit” value or by the key values stored in the table. If a limit value is not used, unused key entries at the “top” of the table should be filled with a value which is greater than the highest currently used key value to enable the binary search. The unused entries can all have the same, greater value, or can have increasing values. Either way, the binary search process will search for desired key values in the correct manner.
A limitation to the typical binary search approach is that it only works for a table such as illustrated in
FIG. 1
comprised of contiguous, ordered key values. If a key value is not ordered with respect to its neighbors, a binary search may be skewed, leading to the failure of the search.
A further drawback associated with unified data tables such as in
FIGS. 1
,
3
and
4
lies in the amount of memory required if plural data entries are associated with each key entry, or if the data entry or entries are relatively long, or both. Larger unified memory implies higher cost. Further, while memory densities continue to increase with time, a practical limitation will always exist on the size of the data entry or entries for a given key entry size, particularly if ASIC based memory arrays are used.
One proposed resolution to the limitations associated with a unified data table is the provision of a distributed data structure, whereby key values are resident in key entries in a discrete memory element, and respective data entries are located in a separate, discrete memory element. Key and data entries are associated through the use of a pointer field stored in conjunction with each key entry.
Shuffle up and down processes are performed on such distributed data structures as well, though atomic updates are still required in order for the data structure to be searched contemporaneously with table updates. Otherwise, it would be possible for a data entry to temporarily have a value not properly associated with a respective key value, for the search process to identify that key value as matching a search value, and for an incorrect data value to be retrieved.
The requirement for atomic updates for a distributed data structure utilizing pointers to associate respective key and data entries introduces problems in synchronizing the key and data value updates. This typica
DeLong Kenneth J.
Heiner, Jr. Edward A.
3Com Corporation
Homere Jean R.
Weingarten, Schurgin Gagnebin & Hayes LLP
LandOfFree
Method and apparatus using a binary search engine for... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and apparatus using a binary search engine for..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus using a binary search engine for... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2560213