Method of and apparatus for matching strings of different...

Multiplex communications – Pathfinding or routing – Switching a message which includes an address header

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C370S401000

Reexamination Certificate

active

06614789

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to comparing and sorting data strings, and in particular, to comparing and sorting data strings of different lengths, such that the data strings can be queried using tree structures. Specifically, the invention relates to performing this process on data strings of different lengths that may be prefixes of each other.
BACKGROUND OF THE INVENTION
Data matching, and in particular, prefix matching is known and applied to various applications. In general, a database search is performed for data strings which are associated with a given input string or key. The association between the input string and the data strings, which is the search criteria, depends on the particular application. The particular search may require locating the longest, shortest or all data strings which are a prefix of a query string. The applications in which such matching is useful are numerous and, in particular, include layer
3
and layer
4
switching in TCP/IP protocols, directory lookup in a telephone context, on-line dictionaries and spell checkers, to name just a few.
The prefix matching problem constitutes the essential part of some applications in the computer realm and related area. The assumption in the prior art relating to these applications is that there are strings of an alphabet &Sgr; which are ordered. The strings can have different lengths and can be prefixes of each other. The data strings are stored in a database along with other associated data.
A user may want to find the longest, smallest or all strings which are a prefix of a query string. In other applications, a user may be interested in finding all the data strings, such that a given input string is a prefix of them. It is very important to respond to any such query in a reasonable amount of time and in as efficient a manner as possible. Each application may have its own alphabet set and the number of characters in the alphabet handling these queries determines the complexity of the search.
The number of hosts on the Internet grows rapidly everyday. New data intensive applications such as multimedia, hypertext data, video conferencing, remote imaging, etc., cause the data traffic to explode. These applications demand higher bandwidth on the communication line and faster and more efficient computer networks. To keep up with these demands and the traffic, the speed of communication lines has been increased to several gigabits per second in the last few years. As a result, routers must forward IP packets more efficiently. Routers search the Internet Protocol (IP) routing tables to find the address of the next hops (or hubs) to which the packet is to be forwarded on the path towards the final destination. Each router has its own routing table consisting of pairs of prefixes of networks addresses and their corresponding hops. The routers usually must determine the longest matching network prefix with a packet destination address and take the corresponding hop. Finding the next hop for each packet becomes harder and harder because the increasing number of hosts on the Internet expands the global network and increases the number of hops to go through. Therefore, the size of the routing table grows accordingly. Increasing the speed of data links helps to shorten the time to send a packet. Advances in the semiconductor technology improve the processing capability of CPU chips and can help reduce the time of the table lookup. However, because the link speed grows faster than the processing speed, and the size of data is growing also, the IP lookup problem is resulting in a serious bottleneck on the information superhighway. The alphabet in this application is very limited (only {0,1}), however the problem is very challenging.
The IP lookup or layer
3
switching is not the only application of prefix matching of the {0,1} alphabet in routers. Internet Service Providers (ISPs) like to provide different services to different customers. Some organizations filter packets from the outside world by installing firewalls in order to deny access to unauthorized sources. Supporting this functionality requires packet filtering or packet classification mechanisms in layer
4
of TCP/IP protocols. Forwarding engines must be able to identify the context of packets and classify them based on their source and destination address, protocols, etc., or on all of this information. This classification must be performed at the wire speed. Routers attempt to handle this by keeping a set of rules which applies to a range of network addresses. Therefore, again we encounter the prefix matching problem in two dimensional space; i.e., for source and destination addresses of a packet.
Applications of prefix matching are not restricted to layer
3
and
4
switching. Some other useful applications include directory lookup in a telephone context, on-line dictionaries, spell checkers and looking up social security numbers. U.S. Pat. No. 5,758,024 discloses the prefix matching problem relating to computer speech recognition and proposes a compact encoding pronunciation prefix tree scheme. A method to improve the parsing process of source codes which use prefix matching is also disclosed in U.S. Pat. No. 5,812,853. The approach in this disclosure identifies the previously-parsed prefixes of a source, creates parsers in the parser states corresponding to the identified prefix and parses the remaining portion of the translation unit. Finally, U.S. Pat. No. 4,464,650 discloses an apparatus and method using prefix matching in data compression. Data compression is crucial in database applications as well as in data communication. The patent includes parsing the input stream of data symbols into the prefix and data segments, and using the previously longest matching prefixes to compress the data.
Traditionally, the prefix matching search has been performed by the Trie structure. A trie is based on the “thumb-index” of a large dictionary in which a word can be located by checking consecutive letters of a string from the beginning to the end. A trie is essentially an m_way tree whereas a branch in each node corresponds to a letter or character of alphabet &Sgr;. A string is represented by a path from the root to a leaf node. The trie structure may be modified and applied to all of the applications discussed above. In some applications, for example in the longest prefix matching IP lookup context, researchers have been able to handle the problem in some more subtle ways than the trie structure, due in part to the limited number of characters in the alphabet. These methods do not have the generality or broad applicability of the trie structure. The main problems with trie structures are its inflexibility; i.e. the number of branches corresponds to the number of characters, and having additional blank nodes as place holders. Furthermore, in general, the search time is proportional to the length of the input strings.
Patricia Trie modified the binary trie by eliminating most of the unnecessary nodes and the modification is the basis of several new methods that have been proposed in the last several years. These approaches attempt to check several characters, or several bits, at each step, instead of checking only one character. Because checking several characters may deteriorate memory usage and leave many memory spaces unused, all of these approaches try to minimize the memory waste. V. Srinivasan and G. Varghese, in “Fast Address Lookups using Controlled prefix”, Proceedings of ACM Sigmetrics, Sep. 1998 proposed to expand the original prefixes (strings) into an equivalent set of prefixes with fewer lengths, and then, apply a dynamic programming technique to the overall index structure in order to optimize memory usage. Other methods proposed a specific case wherein local optimization of memory usage was applied in each step. This is the case in S. Mission and G. Karlsson's, “Fast Address Look-Up for Internet Routers”,Proceedings of IEEE Broadband Communications 98, April. 1998. Finally, a new scheme from Lu

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method of and apparatus for matching strings of different... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method of and apparatus for matching strings of different..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method of and apparatus for matching strings of different... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3089191

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.