Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2000-09-05
2003-07-01
Mizrahi, Diane D. (Department: 2175)
Data processing: database and file management or data structures
Database design
Data structure types
Reexamination Certificate
active
06587852
ABSTRACT:
FIELD OF THE INVENTION
The present invention concerns a processing circuit for recognition and comparison of complex patterns in high-speed data streams, particularly for use in search engines for search and retrieval of data stored in structured or unstructured databases, and wherein the processing circuit forms a node in a network of such processing circuits: and a search processor circuit comprising a multiprocessor unit P
n
with tree structure for recognition and comparison of complex patterns in high speed data streams, particularly for use in search engines for search and retrieval of data stored in structured or unstructured databases, wherein the multiprocessor unit P
n
comprises processing circuits P
1
according to claim 1, and wherein the multiprocessor unit P
n
forms a circuit realized as a binary or superbinary tree with n+1 levels S
0, S
1
. . . S
n
and decree k=2
m
, wherein m is a positive integer larger or equal to 1, and a superbinary tree defined by k>2.
The present invention realizes a search processor circuit on the basis of a multiprocessor unit with a plurality of processing circuits. In a search operation a stream of data is carried in steps through the processing circuits of the search processor circuit and for each ster the data which currently are in the chip, are compared with some pattern or other which may be coded as a bit string and beforehand input to the processing circuits.
DESCRIPTION OF RELATED ART
For searching information in large data streams, something which for instance will be typical for search engines for data communication networks of a type such as Internet or Intranet, for monitoring the content of data streams and for retrieval of data in large structured and unstructured databases, there have in recent years been developed dedicated processors. The reason is that recognition and retrieval of information in the above-mentioned fields are critical operations which do not lend themselyes to an effective realization with the use of common data processors. Basically search and retrieval of pattems in large data volumes are suited to massively parallel solutions using a large number of processing elements which simultaneously search in the same or different data segments. By use of a massive parallelism it will be possible to handle a large number of queries or searches simultaneously. For the search it has been proposed to use special search languages which have sufficient power of expression to describe features of the information searched.
In the art there are known processors which use a comparison of strings of data or symbols. As an example of prior art in that connection there may be referred to international patent application No. PCT/NO92/00173 with the title “Non-numeric coprocessor”. Further has the company Paracel Inc. developed a data processing unit called Fast Data Finder (FDF) which is particularly adapted for analysing similarities between data. FDF uses a pattern matching technology which can detect an exact match, but may also be able to find distant similarities, something which may be useful in gene research and text searching.
Further there are from U.S. Pat. No. 5,553,272 (Ranganathan & al.) known a linear systolic array processor for computing the edit distance between two strings over a given alphabet. This computation is based on a coding scheme which reduces the number of bits which are necessary to represent a state in the computation. The systolic array processor has been given an architecture which does not constrain the length of the strings which can be compared and uses simple basic cells which only need to communicate with the nearest cell neighbour, such that it is ver well suited for VLSI implementation.
A disadvantage with the known dedicated search processors is that they do not offer a sufficiently advanced functionality to handle very complex search queries. Another disadvantage is that they to a greater extent are based on a circuit architecture which only with difficulty can offer a functionality of this kind without being unnecessarily complicated.
An attempt to solve this problem is known from U.S. Pat. No. 4,860,201 (Stolfo & al.) which discloses a parallel processing device structured as a binary tree, where a large number of processors, each with its own I/O unit, are used. Generally Stolfo & al. discloses a computer with a large number of processors connected in a binary tree structure such that each processor apart from those which form respectively the root and the leaves of the tree, has a single parent processor and two child processors. The processors work typically synchronously with data which are transmitted to them from a parent processor and communicate the results further to the nearest following processors. At the same time the child processors of a parent processor can also communicate with each other. According to Stolfo & al. each node forms a processing element which comprises a processor in a real sense, a read/write memory or a random access memory and an I/O means. The I/O means provides interfaces between each processing element and its parent and child processing elements such that a substantial improvement is obtained in the speed whereby data are transmitted through the binary tree structure. As the binary tree structure has a processing element in every node, the processing device generally will comprise 2
n
−1 processing elements, i.e. 1023 processing elements if the binary tree is realized with n=10 levels. In a preferred embodiment this prior art parallel processing device has a clock frequency of 12 MHz which in case a tree with 1023 processors is used, each processor having an average instruction cycle time of 1,8 &mgr;s, offers a processing performance of about 570 million instructions per second.
A binary parallel processor of this kind can be well suited for handling partitionable data processing problems, for instance searching in large information volumes. A partitionable search problem can be defined as a problem where a query about relation between an object x and and object set corresponds to a repeated use of a commutative and associative a binary operator b which has an identity, and a primitive search query q which is applied between a new object x and each element f in the set F. One has then a partitionable search problem when the logic function OR is combined with the primitive query is “x equal to f” applied between the object x an each element f in F. As stated in Stolfo & al., a problem which consists of answering a query about a set F, can be answered by combining the queries applied to arbitrary subsets of F. The problem is in other words partitionable and well-suited for rapid execution by parallel processing. The set F is partitioned in a number of arbitrary subsets equal to the number of available processors. The primitive query q is then applied in parallel on each processor between the unknown x which is communicated to all processors and the locally stored element f in the set F. The result is then arbitrarily combined in parallel by log
2
N repetitions of the operators b, a number of computations on N/2 adjoining pairs of processors first being performed and then a corresponding number of computations on N/4 pairs of processors with the results from the first computations. The operations are thus moving during the processing to overing levels in the binary tree, in other words from child processors to the parent processor etc. and are repeated in parallel on each level.
SUMMARY OF THE INVENTION
From international patent application PCT/NO99/00308 which belongs to the present applicant, there is known a digital processing device which is suited for processing digital data signal structures, where the data signal structures comprise repetitive sequences and/or nested patterns. This processing device is generally configured as a regular tree with n+1 levels S
0
, S
1
. . . S
n
, and of degree k. This architecture provides a number of advantages compared with that which is disclosed in the above-m
Birkeland Olaf
Halaas Arne
Svingen Børge
Birch & Stewart Kolasch & Birch, LLP
Fast Search & Transfer ASA
Mizrahi Diane D.
Mofiz Apu
LandOfFree
Processing circuit and a search processor circuit does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Processing circuit and a search processor circuit, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Processing circuit and a search processor circuit will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3083441