Answering queries using query signatures and signatures of...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000

Reexamination Certificate

active

06347314

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to techniques for obtaining answers to queries, where the answers include items of data that have been cached for semantic regions.
BACKGROUND
Tavakoli, N., and Ray, A., “A New Signature Approach for Retrieval of Documents from Free-Text Databases”,
Information Processing & Management
, Vol. 28, No. 2, 1992, pp. 153-163, describes a signature approach to information retrieval in which the sizes of signature files are dependent on the number of unique symbols in the alphabet. A signature maintains the positional information of characters and allows for Don't Care Characters to be used in queries.
In client-server information systems, local client memory is largely used to cache data and to minimize future interaction with servers. This data caching is particularly important on the World Wide Web (“the Web” or “WWW”) and on similar network-based information retrieval systems in which network traffic and slow servers can lead to long delays in delivery of information. As standard page-based caching is technically impractical on the Web and tuple-based caching has certain limitations, much effort has been spent to cache user queries and answers for possible future reuse. Such techniques are described, for example, in Godfrey, P., and Gryz, J., “Semantic Query Caching For Heterogeneous Databases”, in
Proceedings of the
4
th KRDB Workshop, “Intelligent Access to Heterogeneous Information
”, Athens, Greece, 1997, pp. 6-1 to 6-6.
Query caching is particularly advantageous when the user refines a query quite often, for example, by adding or removing a keyword. In this case, many answer tuples may already be cached and can be delivered to the user right away.
A Web-based query system can contact heterogeneous distributed data repositories or servers, can invoke so-called wrappers to convert user queries into a target query language appropriate for the servers, and can control the data flow from the servers. As data are usually transferred over the network in HTML format, the wrappers can also extract answer tuples from the retrieved HTML files before the final answer is reported to the user and stored in the cache.
A typical Web query is a conjunction of terms. Each term in the query can be a keyword, possibly negated with the operator NOT, and can be applied to one or more attributes (title, author, etc.). In most Web servers, the operator NOT is equivalent to AND NOT to force a query to contain at least one non-negated term.
Semantic caching can manage a client cache as a collection of semantic regions; access information is managed and cache replacement is performed at the unit of semantic regions. Such a technique is described in Dar, S., Franklin, M. J., Jonsson, B. T., Srivastava, D., and Tan, M., “Semantic Data Caching and Replacement”, in
Proceedings of the
22
nd
VLDB Conference, Mumbai
(
Bombay
),
India
, 1996, pp. 330-341. Semantic regions group together semantically related documents covered, for example, by a user query.
In semantic caching, each semantic region can have a constraint formula which describes its contents, a counter of tuples in the contents, a pointer to the set of actual tuples in the cache, and the additional information that is used by the replacement policy to rank the regions.
When a query is posed at a client, it can be split into two disjoint pieces: (1) the portion of the answer available in the local cache, and (2) a remainder query, which retrieves any missing tuples in the answer from the server. If the remainder query is not null (i.e., the query covers parts of the information space that are not cached), the remainder query is sent to the server and processed there.
SUMMARY OF THE INVENTION
The invention addresses problems that arise in using a cache that includes items of data for semantic regions.
Although a number of important principles of query caching have been proposed, conventional query caching techniques do not provide efficient methods for evaluating or comparing a query against cached items in obtaining an answer to the query. Moreover, query evaluation is reduced in some techniques to a Datalog query evaluation, which may be computationally hard.
The present invention provides a technique that alleviates problems in evaluating or comparing a query against cached items. The technique can be implemented in a system that includes memory with a set of cache locations. The cache includes, for each of a set of one or more semantic regions, contents with one or more items of data. The technique responds to a query by obtaining a query signature and by using the query signature and region signatures for at least one of the semantic regions in finding one or more semantic regions that are qualified and using the contents of at least one of the qualified regions to obtaining an answer to the query.
Signatures simplify operations that evaluate or compare a query with semantic regions to find regions that are qualified. As a result, the technique can be implemented to obtain simple and efficient query evaluation or comparison, making it possible to perform query evaluation in time that is not greater than linear with the number of cached items of data and may be less than linear.
The technique can be implemented using a system that caches Web queries. Each semantic region's entry in the cache can include a signature for the region.
For a query or region formula that includes a conjunction of terms, the signature can be obtained by combining signatures for the terms. Each signature can be a binary string, and all can have the same length, so that they can be combined by logical operations.
A query signature can be compared to region signatures stored in the cache. If the comparison indicates that the query is equivalent to or contained within one of the regions, an answer can be obtained from cached items of data. In other cases, a criterion can be applied to find cached items of data that are qualified for the query, i.e., cached items of data that can be re-used immediately in obtaining a partial answer; missing information can be requested by sending a query remainder to a server. The criterion can be a region containment criterion or a criterion that is only met by regions that are likely to have a one-term difference from the query.
The technique can also be implemented in a system that includes a memory with a cache defined as described above and a processor connected for accessing the memory. The processor can obtain the query, use the query to obtain a query signature, use the query signature and at least one region signature to find one or more semantic regions that are qualified, and use the contents of at least one of the qualified regions to obtain an answer to the query, as described above. The system can also include one or more user interface devices through which the processor can provide signals to and receive signals from a user. The processor can obtain the query by receiving signals indicating the query from a user through one of the user interface devices and can also provide the answer to the query to the user through one of the user interface devices. The system can have a connection to the internet, and the query can be a Web query.
The technique can also be implemented in an article of manufacture that includes a storage medium and instruction data stored on the storage medium indicating a sequence of instructions. When the instruction data are accessed by a storage medium access device in a system, the system's processor, in executing the instructions, can obtain the query, use query to obtain a query signature, use the query signature and at least one region signature to find one or more semantic regions that are qualified, and use the contents of at least one of the qualified regions to obtain an answer to the query, as described above.
The technique has three main advantages. First, it makes it possible to process both of the following cases in the same elegant way—1) when a query is contained in the cache, or 2) when it intersects some regio

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Answering queries using query signatures and signatures of... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Answering queries using query signatures and signatures of..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Answering queries using query signatures and signatures of... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2935201

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.