Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-07-22
2002-04-23
Alam, Hosain T. (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C709S241000, C711S163000, C712S228000
Reexamination Certificate
active
06377942
ABSTRACT:
BACKGROUND TO THE INVENTION
This invention relates to a method and apparatus for performing multiple string searching.
A string is defined herein as a sequence of symbols from an alphabet. For example, a string may be a text string, formed from a sequence of ASCII characters. As another example, a string may comprise a DNA sequence, based on the four-symbol DNA “alphabet”.
The string-search problem is to find the first (or all) occurrences of one or more key strings (or “patterns”) within a target string (or “text”).
The simplest case is where there is just a single key string. The single string search problem has been widely studied in the Computer Science community. A good survey of this problem is available on the World Wide Web at: http://www.dir.univ-rouen.fr/~charras/string/string.html.
The present invention is concerned with the problem of multiple string searching, that is, of finding occurrences of a plurality of key strings within a target string. The following references relate to the multiple string search problem:
Aho, A. V., and M. Corasick, “Efficient String Matching: An Aid to Bibliographic Search,” CACM June 1975, Vol.18, No.6
Commentz-Walter, Beate, “A String Matching Algorithm Fast on the Average,” Technical Report, IBM-Germany, Scientific Center Heidelberg, Tiergartenstrasse 15, D-6900 Heidelberg, Germany
Haertel, Mike, “kwset.c” (part of the GNU grep command), http://www.gsi.de/gnu/grep-2.0/
The object of the present invention is to provide a novel solution to the multiple string search problem.
SUMMARY OF THE INVENTION
According to the invention a data processing system comprises searching means for finding occurrences of a plurality of key strings within a target string, wherein the searching means comprises:
(a) means for forming a hash value from each of the key strings, and for adding each key string to a collection of key strings having the same hash value;
(b) means for selecting a plurality of symbol positions in the target string;
(c) means for forming a hash value at each selected symbol position in the target string and for using this hash value to select one of the collections of key strings; and
(d) means for comparing each key string in the selected collection of key strings with the target string.
REFERENCES:
patent: 4525780 (1985-06-01), Bratt et al.
patent: 4922417 (1990-05-01), Churm et al.
patent: 4961139 (1990-10-01), Hong et al.
patent: 4996663 (1991-02-01), Nemes
patent: 5202985 (1993-04-01), Goyal
patent: 5281967 (1994-01-01), Jung
patent: 5287499 (1994-02-01), Nemes
patent: 5926807 (1999-07-01), Peltonen et al.
patent: 6202098 (2001-03-01), Chow
patent: 6226267 (2001-05-01), Spinney et al.
Hinsley Stewart Robert
Le Lohé Léon Richard
Alam Shahid
International Computers Limited
LandOfFree
Multiple string search method does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Multiple string search method, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Multiple string search method will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2918867