Method for searching a database, search engine system for...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06691103

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to methods for searching a database and, more particularly, to methods for matching textual strings and searching a relational database, such as a name/address database. The invention also relates to a search engine system for searching a database. The invention further relates to a method of providing a key table for use by a database search engine.
2. Background Information
Many software applications make use of a database containing information about individuals. This information usually includes the person's name, address, employer, and other information. Examples of such databases include marketing databases, databases of current customers, such as those maintained by a video rental store or a long distance telephone service provider, and law enforcement databases. These applications almost always provide a mechanism for accessing the information stored about a particular person based on one or more key attributes of that person. For example, a clerk at a video store will most likely be able to access an individual's account information based only on the person's telephone number or social security number. If such key information is available, then obtaining the desired database record is straightforward and current database technology permits highly efficient retrieval.
Software applications have arisen, however, in which correct, unambiguous information about the database entry being sought is not available. For example, law enforcement maintains vast databases of incidents, cases, and individuals. The information available to law enforcement and entered into these databases is very often less than absolute. For example, a database may contain information about an individual known only as John Smith of Elm Street. A user of the database may be interested in finding any entry involving Jonathan E. Smyth of 304 Elm Circle. Under straightforward database query techniques, a linkage between the two entities will not be observed because there is no exact match. Therefore, techniques have been produced to query the database for sets of records that are in some way similar to the search target. These sets are usually presented to a human user who makes the determination as to whether one of the records in the set is indeed the record being sought.
Many techniques exist to query a database, list, directory or index for a set of records that are similar to, or close to, a query target under some distance metric. For example, there is the case of constructing a set of records that differ from the query target only by the transformation of a Character Transposition. Character Transposition is a transformation in which the order of two consecutive characters is reversed. If two entries can be made to match with one application of the Character Transposition, then they could be said to be a distance of one apart. Similarly, if two entries can be made to match with two Character Transpositions, then they are said to be a distance of two apart. Hence, “Reid” and “Ried” are a distance of one apart, and “Farmer” and “Framre” are a distance of two apart.
Generally, an algorithm of this type will allow the use of a number of different transformations and compute a distance metric based on the type of transformations required, the number of transformations required, and sometimes the order in which the transformations are required, in order to compute an overall distance between a specific entry and the query target. Transformations include those generally directed at correction of typographic errors, such as Character Transposition, Character Insertion, Character Deletion, and Character Substitution. Common Transformations also include those directed at phonetic issues such as the Soundex method (e.g., a “sounds-like” search, in which a search string is tested against database records for similarity in sound), applications of Kenyon and Knott phonetic transformations, and unigram, bigram, trigram, and multigram approaches. Still other transformations exist which are directed at particular types of errors or deception attempts.
While these distance-based approaches have demonstrated good results for specific types of noise, or combinations of noise, they are usually computationally intensive over the types of variations seen in many name search applications. The computational complexity limits the usefulness of many of these approaches for real-time applications by making the algorithm unsuitably slow. Further, few, if any, of the known distance-metric search techniques preserve the substantial power inherent in Structured Query Language (SQL) over an indexed, relational database.
It is, therefore, reasonable to search for new techniques that preserve the quality of matches obtained by computationally intensive distance metric calculations while leveraging the power of SQL over an indexed database.
There is, therefore, room for improvement in database searching techniques and systems.
SUMMARY OF THE INVENTION
These needs and others are met by the present invention, which achieves high efficiency in constructing a set of database entries that potentially match a search target. In particular, the invention may be employed to provide matching against a name/address database.
As one aspect of the invention, a method for searching a database including a plurality of records, with at least some of the records having a plurality of record fields and a plurality of record elements, comprises: receiving a search criteria including a plurality of search elements corresponding to at least some of the record elements of the database, each of the search elements being capable of returning one or more corresponding search results from the records of the database; ordering the search elements of the search criteria based upon an expected size of the corresponding search results from the database; and searching the database with one of the search elements, which is expected to provide a first group of the search results, before searching the database with another one of the search elements, which is expected to provide a second group of the search results, the second group being larger in size than the first group.
The method may further comprise constructing a search priority array including a plurality of records, with each of the records of the search priority array having a plurality of fields; employing a search priority field as one of the fields of the records of the search priority array; and calculating the search priority field for each of the records of the search priority array. The method may further comprise employing an array of replacement words including a plurality of records; employing with each of the records of the array of replacement words an original word, a replacement word and a priority constant; and for at least some of the search elements of the search criteria, determining if one of the at least some of the search elements corresponds to one of the replacement words of the array of replacement words and responsively employing a corresponding one of the priority constants of the array of replacement words in the step of ordering the search elements of the search criteria.
The method may further comprise parsing each of the search elements of the search criteria. At least one word is employed for some of the search elements of the search criteria; a suffix table is employed including a plurality of first suffixes and a plurality of corresponding replacement suffixes; and it is determined if any of the words ends in one of the first suffixes of the suffix table and, if so, the one of the first suffixes is responsively replaced with the corresponding replacement suffix.
The method may further comprise employing a string reduction table including a plurality of first character strings, a plurality of corresponding conditions and a plurality of corresponding replacement character strings; and recursively searching for one of the first character strings in any of the words, determining if th

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method for searching a database, search engine system for... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method for searching a database, search engine system for..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for searching a database, search engine system for... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3278283

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.