Method and system for importing database information

Computer graphics processing and selective visual display system – Display driving control circuitry – Controlling the condition of display elements

Reexamination Certificate

Details

C707S793000

active

06424358

ABSTRACT:

FIELD OF THE INVENTION
The present invention generally relates to a method and system for importing database information into a querying system.
BACKGROUND OF THE INVENTION
Numerous independently owned collections of data are being created and maintained all over the world. The number of active and legacy data sources almost guarantees that part or all of a query can be answered using one of these countless databases. However, there exist several intermediate steps between posing a query and receiving an answer that make the task of querying others' databases almost impossible for the average user. First, the user must locate a relevant data source. Then he or she must gain access to the source, pose the query using table names and attribute names from that target database, and finally decide which of the returned data, if any, is relevant to the query. Users' queries must be formatted correctly, either using structured query language (SQL) code, or using formatted blocks of code (i.e., code generated by a back-end process based on user-filled selection boxes and text fields).
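The second formatting option mentioned above, code generated by a back-end process from user-filled selection boxes and text fields, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function build_select, the schema dictionary, and the "Shipments" table with its columns are hypothetical and do not come from the patent.

```python
def build_select(table, columns, filters, schema):
    """Build a parameterized SELECT from form selections.

    table   -- table name picked in a selection box (hypothetical)
    columns -- attribute names the user ticked
    filters -- {attribute: value} pairs typed into text fields
    schema  -- {table: set of attribute names} known to the back end
    """
    # The user can only name tables and attributes that exist in the target schema.
    if table not in schema:
        raise ValueError(f"unknown table: {table}")
    unknown = [c for c in list(columns) + list(filters) if c not in schema[table]]
    if unknown:
        raise ValueError(f"unknown attributes: {unknown}")

    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if filters:
        # Values are passed separately as parameters, never pasted into the SQL text.
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in filters)
    return sql, list(filters.values())


# Hypothetical schema and form input.
schema = {"Shipments": {"Shipment_ID", "Origin", "Destination"}}
sql, params = build_select("Shipments", ["Shipment_ID"], {"Origin": "Depot 1"}, schema)
```

Even with such a generator, the user still has to know which table and attribute names to select, which is the difficulty the paragraph describes.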
While this list of steps is formidable, the process of querying is even more difficult if multiple databases must be consulted to obtain a complete answer. Not only must the above steps be executed, but the data from different sources must be joined; and, if there are discrepancies, the user must decide which source is more reliable. In integrating the data, users must first understand elements of each database's schema so that corresponding fields between databases can be identified. Even once corresponding fields have been located, the user must consider both the relative accuracy of the sources and the timeliness of the data contained within the sources. For example, data in a five (5) year old database would obviously be less relevant than data in a current database if a Department of Defense (DoD) member is querying about current troop movements.
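The timeliness criterion can be illustrated with a minimal sketch. It assumes that corresponding fields have already been identified, so both sources share a key, and that each source carries a single as-of date; the function merge_sources and the sample records are hypothetical. Relative accuracy, the other criterion named above, is not modeled here.

```python
from datetime import date


def merge_sources(sources):
    """Merge records from several sources, keeping the most recent value per key.

    sources -- list of (as_of_date, {key: value}) pairs, one per data source
    """
    merged = {}
    for as_of, records in sources:
        for key, value in records.items():
            # On a discrepancy, prefer the value from the more recent source.
            if key not in merged or as_of > merged[key][0]:
                merged[key] = (as_of, value)
    # Drop the dates and return only the chosen values.
    return {key: value for key, (_, value) in merged.items()}


# Hypothetical example: a five-year-old source and a current one.
old_source = (date(1995, 1, 1), {"unit_1": "Location A", "unit_2": "Location B"})
new_source = (date(2000, 1, 1), {"unit_1": "Location C"})
result = merge_sources([old_source, new_source])
```

Here unit_1 takes the newer location, while unit_2 keeps the only value available for it.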
There are even more basic problems standing between the user's query and an answering data set. Databases are created with a particular task in mind. The database may be tailored for ease of asking particular types of queries, for ease of storing new data, or for storing groups of attributes as an object. Designing databases for specific purposes allows data to be stored and retrieved efficiently for that particular task and possibly a few related tasks. However, this makes it nearly impossible to retrieve information for other unrelated tasks. Seen from this perspective, the most fundamental querying problem is that groupings of objects that make sense in one database representation make it difficult to regroup attributes to form objects meaningful to a query unrelated to the database's specific purpose. For example, consider the database tables below, which have been excerpted from a hypothetical company's relational database:
Table "Employee": Employee_ID (key), Social_Sec_#, Salary, Title
Table "Acquisition Agent": Occupation_Code (key), Salary_Band_A, Salary_Band_B, Band_A_Max_PO
Tables and attributes from a hypothetical company database. The “Employee” table has key Employee_ID and attributes Social_Sec_#, Salary, and Title. The “Acquisition Agent” table has key Occupation_Code and attributes Salary_Band_A, Salary_Band_B, and Band_A_Max_PO.
A division of this hypothetical company has a database that keeps track of its employees. The database has a table, “Employee,” that contains basic information such as name, social security number, salary, and job title. The key in this table is Employee_ID. The database also has individual tables relating to each job title within the company. These tables note the occupation's salary ranges (e.g., Salary_Band_A) and the specific duties at each salary level (e.g., Band_A_Max_PO). For example, for an “Acquisition_Agent,” the salary bands are A, B, etc., and the maximum amount that an individual in salary band A may purchase is Band_A_Max_PO. This table's key is Occupation_Code. A reasonable query from another division of the company could be “Return the individuals who can purchase more than 5000 units of product X.” Given the above two tables from the database, we can see that the query will be difficult to execute. First, the individual asking the query would have to know that Acquisition_Agent and Buyer are synonymous. Next, a join on salary would need to be executed, but there is no common key. Finally, math would have to be performed to translate between the maximum purchase order allowed (Band_A_Max_PO) and the number of units of X a specific buyer could purchase. This seemingly simple query requires a great deal of database-specific knowledge.
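The three steps just described can be made concrete with a small sketch using Python's built-in sqlite3 module. Everything beyond the table and attribute names is an assumption made for illustration: the sample rows, the reading of Salary_Band_A as the upper salary limit of band A, the unit price of product X, and the renaming of Social_Sec_# to Social_Sec_No. Only band A is handled, because the excerpt lists no purchase limit for band B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    Employee_ID   INTEGER PRIMARY KEY,
    Social_Sec_No TEXT,               -- Social_Sec_# in the excerpt
    Salary        REAL,
    Title         TEXT
);
CREATE TABLE Acquisition_Agent (
    Occupation_Code TEXT PRIMARY KEY,
    Salary_Band_A   REAL,             -- assumed: upper salary limit of band A
    Salary_Band_B   REAL,             -- assumed: upper salary limit of band B
    Band_A_Max_PO   REAL              -- maximum purchase order for band A
);
-- Hypothetical sample rows.
INSERT INTO Employee VALUES (1, '000-00-0001', 40000, 'Acquisition_Agent');
INSERT INTO Employee VALUES (2, '000-00-0002', 70000, 'Acquisition_Agent');
INSERT INTO Employee VALUES (3, '000-00-0003', 45000, 'Engineer');
INSERT INTO Acquisition_Agent VALUES ('AA', 50000, 90000, 60000);
""")


def band_a_buyers(conn, units, unit_price):
    """Return IDs of band A acquisition agents who may buy more than `units`."""
    rows = conn.execute(
        """
        SELECT e.Employee_ID
        FROM Employee AS e, Acquisition_Agent AS a
        WHERE e.Title = 'Acquisition_Agent'      -- step 1: 'Buyer' mapped by hand
          AND e.Salary <= a.Salary_Band_A        -- step 2: no common key, compare salary to band
          AND a.Band_A_Max_PO / ? > ?            -- step 3: convert PO limit to units
        ORDER BY e.Employee_ID
        """,
        (unit_price, units),
    ).fetchall()
    return [row[0] for row in rows]


# Assumed unit price of 10 for product X.
buyers = band_a_buyers(conn, units=5000, unit_price=10)
```

Each condition in the WHERE clause encodes a piece of knowledge that is not stored anywhere in the schema, which is why the query is hard for someone from another division to write.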
From the above discussion it is clear that there can be a number of issues encountered in trying to retrieve data from an unfamiliar source or sources. There is the initial task of locating relevant data sources. Even once this has been accomplished, the problem of answering the query becomes no easier. Issues range from the banal, but nontrivial, task of gaining access privileges, to the more theoretical and complex tasks of regrouping of attributes to form real-world entities (i.e., the attributes within a table must be understood as representations of actual physical objects). Several potential obstacles are discussed below.
The first potential obstacle concerns gaining access to the relevant data source. This involves being allowed to read the database schema and the data contained within the database. Additionally, it may require the ability to store intermediate tables. When a large, multi-step query with several joins or cross products is carried out, the intermediate tables generated need to be temporarily stored. If systems accessing the database are remote, it is clearly impractical to transmit these larger data sets to the querying machine. Thus, some local write space may be desired.
A second potential obstacle concerns the fact that each database in the system may have been designed for efficiency for a system-specific task. Databases are created to fit within larger systems. These systems have certain storage and retrieval requirements, as well as baseline assumptions about data format. No matter how general a database schema is developed, the schema must operate within the system and data requirements. This necessarily means there are queries the system will have difficulty answering.
A third potential problem is that poorly labeled tables and attributes can make it impossible to determine the real-world object being represented. Examples of table names extracted from actual DoD data sources include: $UD01, VNNZ, SYFA, and WUC1. Examples of attribute names extracted from the same DoD source include: SC, TCN, FROM_PPC1, and PRIME. Without the aid of documentation or the original database designers, it is impossible to know what physical objects are represented by these tables. Thus, data corresponding to a user's query is forever lost because a user or an automated system will be unable to identify all relevant data.
The fourth potential problem in trying to answer a query is that documentation is typically scarce and may not be any less cryptic than the database objects themselves. Additionally, original database designers may have forgotten what the objects represent, or they may have moved on to other sites. Users are left to map between database schema and real-world objects to the best of their ability.
If the average user is able to overcome these obstacles and retrieve data from several data sources, he or she must then combine the responses into a coherent solution set. This compilation may involve conflict resolution among data rows. In some situations, it may be acceptable to return both data items and allow the user to decide which data item is more reliable. Consider, however, a fictitious military example. Two different databases return different locations for the same enemy tank. One location is very close to a U.S. Army base, and the other set of coordinates places the tank much farther
