Database display and search method

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate


Details

C707S793000

active

06202068

ABSTRACT:

TECHNICAL FIELD
The present invention relates to the field of computerized database search and display methods. Specifically, the present invention provides a method for searching data, and graphically displaying the search results on a computer monitor.
BACKGROUND
In the past, much effort has been devoted to developing efficient methods of performing computerized research. Recent efforts in this area have been aimed at reducing the time required to accomplish an efficient search, while still providing accurate results. These efforts, however, have fallen short of expectations due to the complicated and time-consuming calculations required.
In general, database searching has been limited to a “boolean” search, whereby the user inputs a specific data or textual item and the search program performs a text-by-text analysis procedure. Only direct textual matches are retrieved for the user to evaluate. Not only is this type of search inefficient, but large amounts of irrelevant data may also be retrieved due to the nature of the search.
To solve this and other problems associated with a boolean type search, computerized search programs have evolved to search a database for relationships among the data. Once these relationships are identified, they can be quantified as Euclidean distances and output to a display, whereby the distance between the data points represents the magnitude of the relationship between all data points. However, the number of Euclidean distances usually far exceeds the number of degrees of freedom available for mapping the data points into the x-y plane. Therefore, in order to map the data elements to the x-y plane while preserving the Euclidean distances as much as possible, a least squares approach to mapping the points is used. When a large database consisting of n data elements and approximately $n^2/2$ conditions to be satisfied is encountered, a least squares approximation requires order $n^3$ operations. For larger values of n this is a highly computation intensive process which requires a mainframe computer to process the data.
SUMMARY OF THE INVENTION
The invention, as claimed, is intended to provide a solution to the problem of how to provide an efficient and expedient search of an existing database in order to reveal underlying patterns and/or obscure, latent relationships among individual data elements. The present invention, which can be implemented to solve real-world problems on an ordinary personal computer, enables the user to visualize extremely large quantities of data in such a manner that secondary and tertiary relationships among the data elements can be readily identified. While currently available methods, such as link analysis, attempt to accomplish the same result, they do not scale to large databases and are ineffective for problems of the magnitude described below.
The inventive method can be adapted to identify relationships present in many different types of databases. By way of appreciating the breadth of applications for this invention, it is noted that the same underlying techniques can be used to identify criminal activities and criminal organizations by analyzing databases consisting of records of telephone conversations, to identify files of interest for use by prosecutors and investigators involving computers seized by law enforcement agencies, to analyze chemical import and export data to identify illegal drug manufacturing, to analyze financial transactions to detect money laundering activities, to analyze Internet traffic to identify suspected criminal activity and terrorist groups, and to search extremely large databases, such as patent applications or law case books, to find records closely related to topics of research.
The above examples are only a few of the many potential applications for the inventive method.
The inventive method can be divided into four parts:
1. Defining a relationship between data elements.
2. Defining a metric and computing the “distance” between data elements.
3. Mapping the data elements into points in the x-y plane such that “distances” between data elements are preserved.
4. Identifying latent relationships among data elements by viewing the image of the points which have been mapped into the x-y plane.
Defining direct relationships among data elements is usually a straightforward procedure that can be accomplished from the data contained in the given database. For example, in criminal investigations using databases of telephone records, each record contains the telephone number of the caller and the callee. Two telephone numbers are directly related if they are involved in a communication. The strength or intensity of the relationship is determined by the number of communications—the more communications, the stronger the relationship.
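The telephone example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a pair of caller and callee strings), the function name `relationship_strengths`, and the choice to ignore the direction of a call and to skip self-calls are all hypothetical and are not taken from the specification.

```python
from collections import Counter

def relationship_strengths(call_records):
    """Count communications between each unordered pair of telephone numbers.

    call_records is an iterable of (caller, callee) pairs (assumed layout).
    """
    strengths = Counter()
    for caller, callee in call_records:
        if caller == callee:
            continue  # assumption: a number calling itself carries no relationship
        # Sort so that A->B and B->A count toward the same relationship.
        pair = tuple(sorted((caller, callee)))
        strengths[pair] += 1
    return strengths

# Hypothetical sample records.
calls = [
    ("555-0101", "555-0102"),
    ("555-0102", "555-0101"),
    ("555-0101", "555-0103"),
]
strengths = relationship_strengths(calls)
```

Here the pair 555-0101 and 555-0102 has the stronger relationship because it accounts for two of the three communications.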
In searching through files on seized computers, one can define two computer files as related by the number and type of words (character strings) they have in common. The strength of the relationship depends upon the nature of the common words.
Relationships between countries (or other entities) involved in commerce are defined in terms of whether two entities trade a given commodity. If such trade exists, then the strength of the relationship is determined by the dollar volume of trade.
The approaches for analyzing financial transactions and Internet traffic are similar to that of the telephone database. Searching extremely large databases, such as patent applications or law case books, is similar in concept to searching through a large number of files on a seized computer.
Defining the metric and computing the “distances” between data elements consists of implementing a formula which depends upon the relative strengths of the relationships. For example, in a telephone call database, telephone numbers are divided into disjoint clusters. Each telephone number in a cluster only communicates with numbers from that cluster. If M is the maximum number of communications between two telephone numbers in a cluster, the distance between telephone numbers A and B in that cluster is $M/N_{AB}$, where $N_{AB}$ is the number of communications between A and B. If A and B do not communicate, the distance between them is not defined.
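A minimal sketch of this metric follows, assuming the communication counts are already available as a mapping from unordered number pairs to counts. The function names and the sample data are hypothetical, and treating a "cluster" as a connected component of the call graph is an interpretation of the text rather than something it states outright.

```python
from collections import defaultdict

def find_clusters(strengths):
    """Label each number with the id of its cluster (connected component)."""
    neighbors = defaultdict(set)
    for a, b in strengths:
        neighbors[a].add(b)
        neighbors[b].add(a)
    cluster_of = {}
    next_id = 0
    for start in neighbors:
        if start in cluster_of:
            continue
        cluster_of[start] = next_id
        stack = [start]
        while stack:  # depth-first walk over everything reachable from start
            node = stack.pop()
            for other in neighbors[node]:
                if other not in cluster_of:
                    cluster_of[other] = next_id
                    stack.append(other)
        next_id += 1
    return cluster_of

def call_distances(strengths):
    """Return M / N_AB for every pair that communicates.

    Pairs that never communicate are left out, since their distance is undefined.
    """
    cluster_of = find_clusters(strengths)
    max_count = defaultdict(int)  # M for each cluster
    for (a, _b), n_ab in strengths.items():
        cluster = cluster_of[a]
        max_count[cluster] = max(max_count[cluster], n_ab)
    return {
        pair: max_count[cluster_of[pair[0]]] / n_ab
        for pair, n_ab in strengths.items()
    }

# Hypothetical counts: A, B, C form one cluster; X, Y form another.
strengths = {("A", "B"): 4, ("B", "C"): 1, ("X", "Y"): 2}
distances = call_distances(strengths)
```

With these counts, M is 4 in the first cluster, so A and B are at distance 1 and B and C at distance 4. The most frequently communicating pair in any cluster is always at distance 1.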
In defining the metric where the data elements are files on a computer (or in a database of patent applications or law case books) each word (or character string) is assigned a score. Words such as “the” are assigned an extremely low score, such as one point. Words dealing with specific subject matter, such as “pyrotechnic”, will be scored much higher. The word list from each file is compared with the word list from every other file, and the respective scores are tallied. The closer the score between two files, the shorter the “distance” between the files. (In this illustration “words” refer to character strings and are not limited to words in the English language.)
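The word-scoring idea can be illustrated as follows. The passage does not give a formula, so this sketch makes several assumptions that should be read as hypothetical: the score table and default score are invented, the tally is taken as the sum of the scores of the words two files share, and the distance is taken as the reciprocal of that tally so that a higher tally gives a shorter distance.

```python
# Hypothetical score table: common words score low, subject-specific words high.
WORD_SCORES = {"the": 1, "pyrotechnic": 50, "fuse": 20}
DEFAULT_SCORE = 5  # assumption: score for any word missing from the table

def shared_score(words_a, words_b):
    """Tally the scores of the distinct words that appear in both files."""
    common = set(words_a) & set(words_b)
    return sum(WORD_SCORES.get(word, DEFAULT_SCORE) for word in common)

def file_distance(words_a, words_b):
    """Assumed metric: reciprocal of the shared score; None if nothing is shared."""
    score = shared_score(words_a, words_b)
    return None if score == 0 else 1.0 / score

# Hypothetical word lists extracted from three files.
file_a = ["the", "pyrotechnic", "fuse"]
file_b = ["the", "pyrotechnic", "recipe"]
file_c = ["the", "garden"]
```

Under these assumptions, `file_a` and `file_b` share a high-scoring word and end up much closer together than `file_a` and `file_c`, which share only "the".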
The third part of the inventive method is to map the data elements into the x-y plane such that “distances” between data elements are preserved. This part is the most computationally intensive portion of the inventive method. If the number of data elements is n, then the number of “distances” between data elements is of order $n^2$. Thus, there are approximately $n^2/2$ conditions to be satisfied; however, there are only 2n degrees of freedom (n x-coordinates and n y-coordinates) available to satisfy those conditions. Since $n^2$ is typically much larger than 2n, this problem is highly overconstrained and, in general, does not have an exact solution. The usual approach to solving such problems is to use a least squares approximation to obtain the “best” solution possible. In this approach, the sum of the squares of the “errors” at each point is minimized. Least squares problems of size n typically require order $n^3$ operations to solve on a computer.
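To make the conventional least squares formulation concrete, the sketch below minimizes the sum of squared differences between the plotted distances and the target “distances”. It illustrates the usual approach described above, not the inventive method. Plain gradient descent is used as a stand-in solver and is an assumption, as are the function names, the step size, the iteration count, and the sample data.

```python
import math
import random

def stress(points, distances):
    """Sum of squared errors between plotted and target distances."""
    total = 0.0
    for (i, j), d in distances.items():
        (xi, yi), (xj, yj) = points[i], points[j]
        total += (math.hypot(xi - xj, yi - yj) - d) ** 2
    return total

def least_squares_layout(n, distances, steps=2000, rate=0.05, seed=0):
    """Place n points in the x-y plane by gradient descent on the stress."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        grads = [[0.0, 0.0] for _ in range(n)]
        for (i, j), d in distances.items():
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            length = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            # Derivative of (length - d)^2 with respect to point i.
            scale = 2.0 * (length - d) / length
            grads[i][0] += scale * dx
            grads[i][1] += scale * dy
            grads[j][0] -= scale * dx
            grads[j][1] -= scale * dy
        for point, grad in zip(points, grads):
            point[0] -= rate * grad[0]
            point[1] -= rate * grad[1]
    return points

# Hypothetical target distances: three elements forming a 3-4-5 triangle.
target = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}
layout = least_squares_layout(3, target)
```

This tiny example has three conditions and six coordinates, so an exact placement exists. For large n the roughly $n^2/2$ conditions outnumber the 2n coordinates, and the minimized stress will generally remain above zero.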
Because the inventive method is designed to solve problems involving large amounts of data, the number of data elements, n, would be extremely large. Computing least squares solutions for these problems
