Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-12-01
2001-09-11
Choules, Jack (Department: 2777)
active
06289338
ABSTRACT:
BACKGROUND OF THE INVENTION
This invention relates in general to computer database systems and more specifically to a computer database system using an ontology structure to allow analysis of the database.
The proliferation of computer systems and improvements in telecommunications make an overwhelming amount of data available to a computer user. Massive networks such as the Internet provide millions upon millions of data items in the form of words, numbers, images, etc., in very diverse and unregulated formats. Other, smaller, databases and database systems, such as intranets and stand-alone computer systems, are more restrictive in their data formats yet still provide large volumes of data to the user. Perhaps the smallest application of a computerized database is with today's so-called personal digital assistants (PDAs), which may contain an individual's address book, calendar, or similar personal database.
Within the range of all of these database systems lie the same basic problems of efficient access to, and analysis of, the data. Typical database applications are designed primarily to provide ease of data entry, upkeep and retrieval. However, the applications require that the database be specifically designed for a target application, e.g., medical record-keeping, so that “records,” “templates,” or similar structures must be designed by a database programmer or architect in order for the database application to be useful to an end user. For example, the Access database program, manufactured by Microsoft Corporation, requires the creation of records having multiple fields. Information of predetermined types is entered into the fields. The information is accessed using the same predetermined fields.
Such a database system provides a query language so that a user can form relational inquiries into the database. In a medical records database, for example, a user can retrieve records from the database that include a specific patient's name AND were created AFTER a certain date. The “AND” operator is a relational operator between the two desired attributes of “patient name” and “creation date.” The “AFTER” operator restricts the query to records whose creation date is later than the specified date.
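The AND/AFTER query described above can be illustrated with a minimal sketch. The record layout, the field names `patient_name` and `creation_date`, and the `query` helper are all hypothetical and chosen only for illustration; they are not taken from any particular database product.

```python
from datetime import date

# Hypothetical medical records; field names are illustrative assumptions.
records = [
    {"patient_name": "Ann Lee", "creation_date": date(1998, 3, 14), "note": "checkup"},
    {"patient_name": "Ann Lee", "creation_date": date(1999, 7, 2), "note": "follow-up"},
    {"patient_name": "Raj Patel", "creation_date": date(1999, 8, 20), "note": "x-ray"},
]


def query(records, patient_name, after):
    """Return records that match the name AND were created strictly AFTER the date."""
    return [
        r for r in records
        if r["patient_name"] == patient_name and r["creation_date"] > after
    ]


# Both conditions must hold, so only the 1999 record for Ann Lee qualifies.
matches = query(records, "Ann Lee", date(1999, 1, 1))
```

Note that the caller must already know which fields to constrain and with what values, which is the limitation the background goes on to discuss.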
More recently, popular database search engines have been created which allow users to search larger, less-structured databases such as the Internet with similar relational query operators. For example, search engines by Yahoo! and AltaVista allow relational queries using keywords that do not relate to specific attributes and fields. Instead, any document having words with a specified relationship, such as the above relationship used in the example, is listed as a possible document of interest to the user. While relational database queries are useful in searching for information, they require that the user know with high specificity the type of information sought.
Another use for databases is to provide a platform for analyzing data to determine characteristics, trends or predictive guidelines in the information. For example, where financial data is being analyzed it may be useful to discover that where inflation is high in an overseas market, bond prices in a different market are also high with a very high frequency. Or, in a medical research application, it would be useful to determine that in a high percentage of cases where a certain treatment was used the recovery time was very short. However, such analysis of data is not possible with traditional database applications which singularly focus on retrieving existing data and entering and maintaining data of predetermined formats using relational queries.
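As a rough illustration of the kind of analysis meant here, the sketch below computes how often one item of data accompanies another, using the medical example. It assumes each record is modeled as a set of (attribute, value) tuples; that representation, the sample data and the function name `conditional_frequency` are assumptions made for this example only.

```python
def conditional_frequency(records, given, target):
    """Fraction of the records containing `given` that also contain `target`.

    Returns None when no record contains `given`.
    """
    with_given = [r for r in records if given in r]
    if not with_given:
        return None
    return sum(1 for r in with_given if target in r) / len(with_given)


# Hypothetical case records, each a set of (attribute, value) tuples.
cases = [
    {("treatment", "A"), ("recovery", "short")},
    {("treatment", "A"), ("recovery", "short")},
    {("treatment", "A"), ("recovery", "long")},
    {("treatment", "B"), ("recovery", "long")},
]

# Of the three cases using treatment A, two had a short recovery.
freq = conditional_frequency(cases, ("treatment", "A"), ("recovery", "short"))
```

A conventional query would only return the matching records; a summary like this has to be computed separately for every combination of interest.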
Thus, it is desirable to provide a technique and system for analyzing characteristics of data in the manner discussed above. Further, it is desirable to provide such a technique and system that is usable with databases regardless of the size or level of structuring of the database. Also, given the vast amount of data available, it is vital that the results of the analysis be presented in a form that is efficient for detecting trends, qualities or other useful information among the data being analyzed.
SUMMARY OF THE INVENTION
The present invention provides a method and system for efficiently analyzing databases. In one embodiment, the invention is used to analyze data represented in the form of attribute-value (a-v) pairs. A primary step in building the ontology is to identify parent, child and related a-v pairs of each given a-v pair in the database. A parent is an a-v pair that is always present whenever a given a-v pair is present. A child is an a-v pair that is never present unless the given a-v pair is present. Related pairs of a given a-v pair are those a-v pairs present some of the time when a given a-v pair is present.
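The parent, child and related definitions above can be read as set relations between the records in which each a-v pair occurs. The following is a minimal sketch under that reading, not the patented implementation: records are assumed to be sets of (attribute, value) tuples, the names `index_pairs` and `classify` are hypothetical, and "related" is interpreted here as co-occurring at least once while being neither parent nor child.

```python
from collections import defaultdict


def index_pairs(records):
    """Map each a-v pair to the set of record indices that contain it."""
    index = defaultdict(set)
    for i, record in enumerate(records):
        for pair in record:
            index[pair].add(i)
    return index


def classify(records, given):
    """Return (parents, children, related) sets of a-v pairs for `given`."""
    index = index_pairs(records)
    given_ids = index.get(given, set())
    parents, children, related = set(), set(), set()
    if not given_ids:
        return parents, children, related
    for pair, ids in index.items():
        if pair == given:
            continue
        is_parent = given_ids <= ids   # present whenever `given` is present
        is_child = ids <= given_ids    # never present unless `given` is present
        if is_parent:
            parents.add(pair)
        if is_child:
            children.add(pair)
        if not is_parent and not is_child and ids & given_ids:
            related.add(pair)          # present only some of the time
    return parents, children, related


# Hypothetical sample data for illustration.
records = [
    {("species", "dog"), ("class", "mammal"), ("breed", "collie")},
    {("species", "dog"), ("class", "mammal"), ("color", "brown")},
    {("species", "cat"), ("class", "mammal"), ("color", "brown")},
]
parents, children, related = classify(records, ("species", "dog"))
```

In this sample, ("class", "mammal") is a parent of ("species", "dog"), ("breed", "collie") is a child, and ("color", "brown") is related. A pair that occurs in exactly the same records as the given pair satisfies both definitions, so this sketch reports it as both parent and child.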
The system calculates relationships between a-v pairs to produce tables of a-v pairs presented according to the relationships. The user performs additional analysis by investigating the a-v pair relationships through a graphical user interface. Additional visualizations of the data are possible such as through Venn diagrams and animations. Plain-text data documents collected, for example, from the Internet can be analyzed. In this case, the system pre-processes the text data to build a-v pairs based on sentence syntax.
One embodiment of the invention provides a method for analyzing a database where the database includes a plurality of records having a-v pairs. The method executes on a computer system and includes the following steps: determining two or more parents of a given a-v pair, where a parent of an a-v pair is another a-v pair that exists within every record in which the given a-v pair exists; and displaying the two or more parents on a display screen along with an indication that the two or more parents are associated with the given a-v pair.
REFERENCES:
patent: 5724573 (1998-03-01), Agrawal et al.
patent: 5802254 (1998-09-01), Satou et al.
Ranade et al., DB2 Concepts, Programs, and Design, McGraw-Hill, pp. 63-94, 1990.*
Nicolaisen, “WizRule may be the key to avoiding database disasters”, Computer Shopper, vol. 15, No. 11, pp. 588(3), Nov. 1995.*
Ziarko et al., “Discovering Attribute Relationships, Dependencies and Rules by Using Rough Sets”, Proc. of the 28th Hawaii Internat. Conf. on System Sciences, pp. 293-299, 1995.*
Han et al., “Data Driven Discovery of Quantitative Rules in Relational”, IEEE Trans. on Knowledge and Data Engineering, vol. 5, No. 1, pp. 29-40, Feb. 1993.
Stoffel Killian
Wood Robert L.
Choules Jack
Kulas Charles J.
Manning & Napier Information Services
Townsend and Townsend and Crew LLP