Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2003-02-18
2004-04-27
Robinson, Greta (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000
Reexamination Certificate
active
06728732
ABSTRACT:
BACKGROUND OF THE INVENTION
This invention relates to a method for storing and managing data. More particularly this invention relates to a data structure representing a set of prefixes and a method for searching the data structure. The data structure and the search method are particularly advantageous for implementation in digital computer hardware. The primary application of current interest is to semiconductor integrated circuits used for packet classification in high-speed, multiservice Internet routers. However, the technique may be useful in a variety of applications involving data that needs to be prioritized or wherein structure in the data needs to be determined and then to be classified. As a result of classification of the data, action on data can be taken more quickly and efficiently.
In the past, in the packet-switched networking field, Internet-Protocol (IP) traffic handling processes have required relatively complex searches and analyses of packet header information to develop routing and processing instructions. Recently, IP destination address lookup issues relating to techniques for routing a packet to a destination have received a great deal of attention from the networking community. The following references may be consulted on the state of the art related to the invention herein described:
[1] M. Degermark, A. Brodnik, S. Carlsson, S. Pink, “Small Forwarding Tables for Fast Routing Lookups” Proc. ACM SIGCOMM '97, Cannes (Sep. 14-18, 1997) (describing the Lulea algorithm).
[2] V. Srinivasan, G. Varghese, “Faster IP Lookups using Controlled Prefix Expansion,” Proc. SIGMETRICS '98.
A number of patents of interest on lookup algorithms are known in the art.
It is helpful to understand the background of the closest known method for searching tree structures of data. Note the following definitions:
A trie is a tree structure organized for searching.
A trie element is that portion of a tree structure at a single node.
A prefix is a string of characters that appears at the beginning of a longer string of characters.
A stride is a number of levels in a tree accessed in a single read operation.
A target string is a string of characters to be classified.
In many cases of practical interest the characters in a prefix are binary digits, i.e., ones and zeroes. In the example prefix database
10
in
FIG. 1
there are nine binary prefixes
11
-
19
each terminated by *, a symbol which represents the remaining arbitrary binary digits in a longer, fixed-length string. If the total length of the string is k digits, then P
1
represents any string of k arbitrary binary digits. Similarly, P
3
represents a k-bit string that begins with 00 and is followed by any k-2 length string of binary digits.
A prefix database can be used to classify strings of characters into subsets according to which prefix a particular string matches. The prefix database structure can include exact matches with a target string, i.e., the data to be analyzed can appear in the database.
The matching process is ambiguous unless the classification requires that the matching prefix be the longest possible match in the database. For example, in
FIG. 1
, the 10-bit string 1000011000 matches P
1
, P
2
, P
6
and P
9
, but only P
9
is the longest, matching the first 7-bits of the 10-bit string.
A database of binary prefixes can be represented by a binary tree
20
specialized for searching. This search tree
20
is an example of a data structure called a trie and is convenient for finding the longest prefix match. Nodes (shown as small circles in
FIG. 1
) in the trie are labeled as Prefix Nodes or Vacant Nodes according to whether they represent a prefix or not.
The search for a match to a target string progresses by examining one bit of the target at a time. Before examining the target's first bit, a match with the Prefix Node
21
for P
1
is known to exist, since P
1
matches all strings. If the first bit is a 0, take the left branch
22
of the trie. If the first bit is a 1 take the right branch
23
of the trie. In either case, determine if a Prefix Node is encountered. In the example of
FIG. 1
taking right branch
23
leads to a match with the Prefix Node
25
for P
2
.
The process proceeds by taking the left or right branch according to whether the next bit is a 0 or a 1, noting the Prefix Nodes encountered and tracing a path down the trie until either a final node
26
-
30
is reached (a leaf) or there is no longer a path on the trie (since the node is not in the database). In either case, the last Prefix Node found represents the longest prefix match.
The Lulea algorithm is the algorithm known to encode prefix tables most compactly. The Lulea algorithm is a scheme that uses multibit trie elements containing bitmaps to minimize wasted storage. The Lulea scheme does well in terms of worst case storage, using only 200 Kbytes of memory to store the current MAE East database in a fully structured form. (The MAE East database is a database that contains over 40,000 Internet prefixes for IP addresses and requires over 900 Kbytes of ASCII character storage in its uncompressed form.) One way to describe the Lulea scheme is to consider the database
10
of
FIG. 1
remapped to the multibit trie elements
102
,
104
,
106
,
108
as in FIG.
2
A.
The various possible target strings are shown in
FIG. 2A
as 3-bit strides at the left of each of the trie elements
102
,
104
,
106
and
108
. The longest matching prefix at each stride is shown in the prefix column of the trie elements in FIG.
2
A. For example, the target string 111001 traces a path down the right hand side of the trie
20
in
FIG. 1
to the leaf
30
and the prefix P
8
. In
FIG. 2A
the leading 111 portion of the target string selects the last entry in the trie element
102
which contains a pointer to trie element
108
where the last three bits of the target string 001 select the prefix P
8
.
This remapped trie
100
is called a controlled prefix expansion and is subject, in
FIG. 2B
, to an optimization known as “leaf pushing.” Leaf pushing is intended to reduce memory requirements by making each trie element entry contain EITHER a pointer OR a prefix label but not both. Thus entries like “100” (binary) in the fifth row of the root trie element
102
of
FIG. 2A
which have both a pointer and a prefix label must be modified. The optimization pushes the prefix label down to the vacant leaves
124
,
125
,
126
,
127
of the trie element
104
. Since these leaves have no pointer, there is room to store a prefix label. This is shown in
FIG. 2B
by pointer
205
pointing from trie element
102
′ to trie element
104
′, which has P
2
pushed down into the previously empty slots of trie element
104
′, and by pointer
208
, which has P
5
pushed down into the previously empty slots of trie element
108
′. The occupied leaves
120
,
121
,
122
,
123
do not receive a prefix because P
6
is a longer match than P
2
and therefore the shorter match can be ignored.
Conceptually the Lulea scheme starts with a leaf-pushed trie
101
and replaces, with a single value, all the consecutive entries in a trie element that have the same value. This can greatly reduce the amount of storage required in a trie element. To allow trie indexing to take place even after the elements have been compressed, a bit map with 1's corresponding to retained positions and with 0's corresponding to the removed positions is associated with each compressed element.
For example consider the partially compressed root element
102
′ in FIG.
2
B. After leaf pushing the root element
102
′ has the sequence of values P
3
(
201
), P
3
(
202
), P
1
(
203
), P
1
(
204
), ptr
1
(
205
), P
4
(
206
), P
2
(
207
), ptr
2
(
208
) (where ptr
1
205
is a pointer to the trie element
104
′ and ptr
2
208
is the pointer to the trie element
108
′ ). Referring to
FIG. 2C
, this sequence of values is compressed by removing redundant values and indicating by the placement of 1's in a
Dittia Zubin
Eatherton William N.
Black Linh
Robinson Greta
The Law Office of Kirk D. Williams
Washington University
LandOfFree
Data structure using a tree bitmap and method for rapid... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Data structure using a tree bitmap and method for rapid..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Data structure using a tree bitmap and method for rapid... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3223155