Reexamination Certificate
2000-09-27
2003-05-20
Metjahic, Safet (Department: 2171)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
active
06567812
BACKGROUND OF THE INVENTION
This invention relates to automated search of heterogeneous data sources for desired information, and to the management of the information retrieved during the search.
Data and information are different, but inseparably intertwined. To illustrate the difference between data and information for the purposes of the following discussion, a simple example will be provided.
The financial page of a newspaper may be thought of as providing data. In particular, a newspaper's financial page may provide, for each of a plurality of stocks in a given market, a closing price and an indicator of the difference between the current closing price and an immediately preceding closing price. To a person attempting to discover the closing price of a particular one of the plurality of stocks, the financial page of the newspaper provides information. To a person attempting to discover whether the given market, as a whole, advanced or declined, the financial page of the newspaper provides data which might be aggregated and analyzed to find the direction of the market.
Taking the example further, the information as to whether the market advanced or declined as a whole may be thought of as data by a person attempting to determine whether there is a cycle of market advances or declines over a period of years. In turn, whether or not a given market advances or declines in a cyclic manner over a period of years may be merely data to yet another person attempting to discover whether there is any kind of link between cyclic markets and something more abstract, such as the number of representatives of a conservative political party elected subsequent to such cycles.
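The newspaper example may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tickers, prices, and the rule of counting advancing against declining stocks are hypothetical and are not taken from the source. Looking up one closing price reads the table directly as information, while judging the market as a whole treats the same table as data to be aggregated.

```python
# Hypothetical financial-page data: ticker -> (closing price, change from prior close).
closing_prices = {
    "AAA": (101.50, +1.25),
    "BBB": (48.10, -0.40),
    "CCC": (12.75, -0.05),
    "DDD": (230.00, -3.10),
}


def closing_price(ticker):
    """Direct lookup: the page already answers this question (information)."""
    return closing_prices[ticker][0]


def market_direction(prices):
    """Aggregate the same page as data: compare advancers with decliners.

    Counting advancers against decliners is an assumed, simplistic rule
    chosen only to illustrate aggregation.
    """
    advancers = sum(1 for _, change in prices.values() if change > 0)
    decliners = sum(1 for _, change in prices.values() if change < 0)
    if advancers > decliners:
        return "advanced"
    if decliners > advancers:
        return "declined"
    return "unchanged"


print(closing_price("BBB"))
print(market_direction(closing_prices))
```

The output of `market_direction` could itself be collected over many days and treated as data for a further analysis, which is the layering the example describes.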
Data, in the abstract, may thus generally be thought of as being at a lower level than information. Whether an item more correctly qualifies as data or information is naturally dependent upon the point of view and on the discovery needs of a person. Because of the dynamic nature of the problems that confront people each day, the terms data and information are often used interchangeably.
A database may be understood to be a collection of data and information stored on a computer system. For the purposes of this discussion, such a general definition will generally be appropriate. Database management systems provide a level of independence between the raw data and programs that might be used to retrieve the data. Data can be retrieved from databases managed by database management systems by issuing appropriate function or procedure calls containing terms in a query definition language. The database management system responds to the terms in a query definition language, such as SQL, by retrieving and returning the stored data that meets the parameters contained in the query.
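The retrieval just described, a function call carrying SQL terms to a database management system, may be sketched with Python's standard `sqlite3` module. The `stocks` table, its columns, and the helper `stocks_closing_above` are hypothetical names introduced only for illustration.

```python
import sqlite3

# An in-memory database stands in for a managed database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (ticker TEXT PRIMARY KEY, close REAL)")
conn.executemany(
    "INSERT INTO stocks VALUES (?, ?)",
    [("AAA", 101.5), ("BBB", 48.1), ("CCC", 12.75)],
)


def stocks_closing_above(connection, threshold):
    """Issue a parameterized SQL query and return the rows that satisfy it."""
    cursor = connection.execute(
        "SELECT ticker, close FROM stocks WHERE close > ? ORDER BY ticker",
        (threshold,),
    )
    return cursor.fetchall()


result = stocks_closing_above(conn, 40.0)
print(result)
```

The calling program states only what it wants in the query; how the rows are stored and located is left to the database management system, which is the independence the text refers to.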
Depending on the purpose of the system, different databases may have different levels of usefulness for those seeking to gain information from them. To explain, some database management systems are used to coordinate and to control information for the purpose of supporting online transaction processing (OLTP).
OLTP applications are characterized by many users creating, updating, or retrieving individual records, and so OLTP databases are optimized for transaction updating. The data is stored in a manner that is very useful for handling transactions, but in a form that is much less useful for supporting analysis of the data.
One way to make this data more useful for high-level analysis is to reformat and to aggregate the data in a database specifically arranged for online analytical processing (OLAP). OLAP applications may be used by analysts and managers seeking a higher-level aggregated view of the data, such as total sales by product line, by region, and so forth. An OLAP database may be updated in batch, from multiple sources, and can provide a powerful analytical back-end to multiple user applications. OLAP databases are thus optimized for analysis.
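The reformatting and aggregation step may be sketched as follows, again using Python's `sqlite3` module. The `sales` and `sales_summary` tables and their contents are hypothetical; a real OLAP system would typically use a dedicated multidimensional or columnar store rather than a single summary table, so this is only a minimal illustration of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Transaction-style table: one row per individual order (hypothetical schema).
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, product_line TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "widgets", "east", 100.0),
        (2, "widgets", "west", 50.0),
        (3, "gadgets", "east", 75.0),
        (4, "widgets", "east", 25.0),
    ],
)

# Batch step: aggregate the individual records into an analysis-oriented table.
conn.execute(
    "CREATE TABLE sales_summary AS "
    "SELECT product_line, region, SUM(amount) AS total_sales, "
    "COUNT(*) AS order_count "
    "FROM sales GROUP BY product_line, region"
)

summary = conn.execute(
    "SELECT product_line, region, total_sales FROM sales_summary "
    "ORDER BY product_line, region"
).fetchall()
print(summary)
```

An analyst can then read total sales by product line and region directly from `sales_summary` without touching the individual order records.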
It should be apparent, however, that such databases are in a highly structured format, and that intimate knowledge of the structured format is required to access the data and perform the appropriate analysis.
Not all data is managed by database management systems, and not all data in databases is highly structured. Some data in databases is stored in association with one or more indices. The data in the database is retrieved with reference to the one or more indices.
Oftentimes, the term “database” evokes a sense of structure in the data. However, for the purposes of this discussion, not all databases are structured databases. In particular, a collection of text documents may be thought of as being a database. Much of the institutional knowledge of an organization may be contained in the documents of the organization and not in the structured databases managed by the organization's database management systems. All of this organizational knowledge stored in text documents, while formerly unavailable for search, is now becoming useful as data with the advent of appropriate searching and querying tools.
For example, an organization may have a document management system that coordinates workflow with respect to documents, but also provides an index that can be used to find and retrieve documents across the organization. Likewise, an organization may use a text search processor to access a central database of text documents to find certain documents meeting the parameters of a text search query.
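Index-based retrieval of text documents may be sketched with a simple inverted index. The document identifiers, the sample texts, and the rule that every query term must appear in a matching document are assumptions made for illustration; they do not describe any particular document management system or text search processor.

```python
import re
from collections import defaultdict

# Hypothetical central collection of text documents.
documents = {
    "memo-001": "Quarterly sales report for the eastern region",
    "memo-002": "Minutes of the sales planning meeting",
    "memo-003": "Safety procedures for the eastern plant",
}


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document identifiers that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return, in sorted order, the documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


index = build_index(documents)
print(search(index, "eastern sales"))
print(search(index, "sales"))
```

The query is answered from the index alone; the documents themselves are consulted only when the index is built.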
Web pages on the World Wide Web are typically text documents. A collection of such text documents may be thought of as an unstructured database. Thus, for the remainder of this discussion, a structured database will be understood to be one that has a definite structure and, typically, is controlled by a database management system. Likewise, an unstructured database will be understood to be one that is not controlled by a database management system and, typically, is a collection of text documents.
One of the biggest reasons for the importance of the World Wide Web is the advent of tools that make it possible to find and access the text documents that make up the Web pages of the World Wide Web. A brief look at some of the tools available to find information on the World Wide Web will now be undertaken.
A search engine may be thought of as a search database coupled with the tools to generate and search the search database. A search engine may be owned by a Web location service. A Web location service may be thought of as a Web site or a company that provides a way to find and locate Web pages having data that meets the information needs or discovery needs of a user.
Yahoo! is an example of a Web location service. Yahoo! attempts to provide a complete front end for the Internet by providing news, libraries, dictionaries, and other sources in addition to a search engine. Yahoo! emphasizes cataloging—a classification of identified pages into a hierarchical structure. Alta Vista and Excite are Web location services that emphasize providing the most comprehensive search database.
Some Web location services use the search engine technology of other companies, such as Inktomi, to provide a useful location service for Web pages and files while concentrating on providing other, additional services.
Every search engine may be thought of as providing three important elements. These elements include information discovery and search database components, a user search component, and a presentation component.
In particular, the information discovery and search database components of a search engine may obtain information by accepting information sent by persons hoping to gain greater exposure for their Web pages or by gathering the information using software programs designed to locate Web pages, and to store information about the pages and their location. Such software programs may be called Web crawlers, spiders, or robots. For convenience, such software programs may be herein referred to as robots, collectively.
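The gathering behavior of a robot may be sketched as a breadth-first traversal that follows links and records something about each page together with its location. To keep the sketch self-contained, the "Web" below is a hypothetical in-memory dictionary rather than a network, and the choice to record only the page title is an assumption; the class and function names are likewise illustrative.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "Web": location -> page content (no network access).
PAGES = {
    "http://example.com/": (
        "<html><head><title>Home</title></head><body>"
        '<a href="http://example.com/a">A</a> '
        '<a href="http://example.com/b">B</a></body></html>'
    ),
    "http://example.com/a": (
        "<html><head><title>Page A</title></head><body>"
        '<a href="http://example.com/">home</a></body></html>'
    ),
    "http://example.com/b": (
        "<html><head><title>Page B</title></head><body>"
        '<a href="http://example.com/missing">gone</a></body></html>'
    ),
}


class PageParser(HTMLParser):
    """Collect the page title and the targets of its anchor tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(start, pages):
    """Visit pages reachable from start and record location -> title."""
    search_db = {}
    queue = deque([start])
    seen = {start}
    while queue:
        location = queue.popleft()
        html = pages.get(location)
        if html is None:
            continue  # Location could not be retrieved; skip it.
        parser = PageParser()
        parser.feed(html)
        search_db[location] = parser.title.strip()
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return search_db


search_db = crawl("http://example.com/", PAGES)
print(search_db)
```

The `seen` set keeps the robot from revisiting a page that is linked from several places, and an unreachable location is simply skipped.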
When a robot identifies a new page, the robot may simply store the title of the page and the univers
Garrecht Thomas
Loritz Axel
Weiss Anton
Metjahic Safet
Nguyen Cindy
Siemens Aktiengesellschaft
Management of query result complexity using weighted...