Estimating the usefulness of an item in a collection of...

Filing date: 2000-06-02
Issue date: 2003-10-28
Examiner: Metjahic, Safet (Department: 2171)
Classification: Data processing: database and file management or data structures – Database design – Data structure types

US class: C707S793000
Type: Reexamination Certificate
Status: active
Patent number: 06640218

TECHNICAL FIELD
The invention relates to estimating the usefulness of an item in a collection of information.
BACKGROUND
One context in which selection of items from a collection of information (e.g., a database) is useful is a “search engine.” A typical search engine takes an alphanumeric query from a user (a “search string”) and returns to the user a list of one or more items from the database that satisfy some or all of the criteria specified in the query.
Although search engines have been in use for many years, for example in connection with the Westlaw® or Lexis® legal databases, their use has risen dramatically with the development of the World Wide Web. Because the World Wide Web comprises a very large number of discrete items, which come from heterogeneous sources and which are not necessarily known in advance to the user, search engines that can identify relevant Web-based information resources in response to a user query have become important tools for doing Web-based research.
With tens or hundreds of millions of individual items potentially accessible over the Web, it is not unusual for a single query to a search engine to result in the return of hundreds or thousands of items of varying quality from which the user must manually select those that may be truly useful. This manual evaluation can be a time-consuming and frustrating process.
One approach to managing the large number of potentially relevant items returned by a search engine is for the engine to rank the items for relevance before displaying them. Specifically, the items may be ranked according to some relevance metric reflecting how well the intrinsic features of a particular item (e.g., its textual content, its location, the language in which it is written, the date of its creation, etc.) match the search criteria for the particular search. A number of relevance metrics are described, e.g., in Manning and Schütze, “Foundations of Statistical Natural Language Processing”, MIT Press, Cambridge, Mass. (1999) pp. 529-574 and U.S. Pat. No. 6,012,053.
Ranking items based on a measure of relevance to a search query is, however, often an imperfect measure of the actual relative usefulness of those items to users. In particular, a relevance metric may not take into account certain factors that go into a user's ultimate evaluation of the usefulness of a particular item: e.g., how well the item is written or designed, the reliability or authority of the source of the information in the item, or the user's prior familiarity with the item. Thus, a search engine presented with a query for “History of the United States” might consider an encyclopedia article by a well-known historian and a term paper written by a high school student to be of equal relevance, even though the former is far more likely to be useful to most users than the latter.
Ranking items by relevance is also susceptible to “spoofing.” “Spoofing” refers to attempting to artificially improve the apparent relevance of a particular item with respect to particular search criteria by altering the item in a misleading way. For example, it is common for search engines to evaluate the relevance of a Web page based on the contents of meta-tags. Meta-tags are lists of keywords that are included in the HTML source of a Web page but which are not normally displayed to users by Web browsers. Web site operators who wish to increase the number of visits to their Web sites commonly attempt to spoof search engines either by creating meta-tags that contain keywords that are not truly indicative of the displayed contents of the page (but which are likely to match a large number of queries), or by creating meta-tags that include multiple instances of arguably appropriate keywords (thus inflating the relative importance of those keywords to the Web page).
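The second form of spoofing can be illustrated with a toy relevance metric. The sketch below is not taken from any real search engine: the function `meta_tag_score` and the keyword lists are hypothetical, and the metric (the fraction of meta-tag keywords that match a query term) is an assumption chosen only to show how repeated keywords inflate a score.

```python
from collections import Counter


def meta_tag_score(query: str, keywords: list[str]) -> float:
    """Toy metric (an assumption): fraction of meta-tag keywords matching a query term."""
    if not keywords:
        return 0.0
    query_terms = set(query.lower().split())
    counts = Counter(k.lower() for k in keywords)
    hits = sum(n for term, n in counts.items() if term in query_terms)
    return hits / len(keywords)


# Hypothetical meta-tag contents for the same page, before and after stuffing.
honest = ["history", "united", "states", "encyclopedia"]
stuffed = honest + ["history"] * 20

print(meta_tag_score("history", honest))   # 0.25
print(meta_tag_score("history", stuffed))  # 0.875
```

The displayed page is identical in both cases; only the hidden keyword list changed, yet the score more than triples.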
Some search techniques have attempted to incorporate information about subjective user preferences within a relevance metric. One such method entails modifying the relevance score of an individual item (with respect to a search term or phrase) according to how often the item is selected when displayed in response to a query containing the search term or phrase. However, this technique may provide unsatisfactory results under conditions of sparse data (i.e., where the individual items were selected by users in response to queries containing the search term or phrase a relatively small number of times).
SUMMARY
The present invention provides a system and method for estimating the usefulness of an item in a collection of information.
In general, in one aspect, a first measure of the usefulness of the item with respect to a first set of criteria is determined. A measure of the quality of the item is determined. A second measure of the usefulness of the item is determined based on the first measure of usefulness and the measure of quality.
Embodiments of the invention may have one or more of the following features.
A measure of the relevance of the item to the first set of criteria is determined. A selection rate of the item is predicted based on the measure of relevance.
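As a rough illustration of predicting a selection rate from a relevance measure, the sketch below assumes that relevance is expressed as the item's rank in a relevance-ordered result list and that selection rates fall off geometrically with rank. The function name, the 0.30 top-rank rate, and the 0.7 decay factor are all hypothetical; the text does not specify the form of the prediction.

```python
def predict_selection_rate(rank: int, top_rate: float = 0.30, decay: float = 0.7) -> float:
    """Predicted fraction of displays in which an item at `rank` is selected.

    Assumes a geometric fall-off with rank; both parameters are illustrative.
    """
    if rank < 1:
        raise ValueError("rank is 1-based")
    return top_rate * decay ** (rank - 1)


for rank in (1, 2, 5):
    print(rank, predict_selection_rate(rank))
```

In practice such a curve would presumably be fitted to observed selection data rather than fixed by hand.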
Opportunities for user selection of the item are provided. The actual overall popularity of the item is determined. The overall popularity of the item is predicted. The measure of quality of the item is determined based on the actual popularity of the item and the predicted overall popularity of the item.
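One plausible reading of this step is that quality compares how often an item was actually selected with how often it was expected to be selected. The sketch below assumes a simple ratio; the function name `quality_measure` and the neutral value returned when nothing was predicted are assumptions, not details given in the text.

```python
def quality_measure(actual_popularity: float, predicted_popularity: float) -> float:
    """Ratio of actual to predicted popularity (assumed form of the quality measure).

    A value above 1.0 means the item was selected more often than predicted.
    """
    if predicted_popularity <= 0:
        # No prediction available: treat the item as neither good nor bad (assumption).
        return 1.0
    return actual_popularity / predicted_popularity


print(quality_measure(150, 100))  # 1.5
print(quality_measure(40, 100))   # 0.4
```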
A plurality of sets of items containing the item is displayed. A choice of the item from at least one of the sets of displayed items is received from a user.
At least one set of items containing the item is displayed ranked in accordance with a relevance metric.
At least one set of items containing the item is displayed ranked in accordance with a measure of the usefulness of the respective items.
Users are provided with opportunities to present sets of criteria. Respective measures of the relevance of the item to respective sets of criteria presented by users are determined. Respective selection rates of the item are predicted based on the respective measures of relevance. The overall popularity of the item is predicted based on the respective predicted selection rates.
Respective measures of the popularity of the respective sets of criteria among users are determined. The overall popularity of the item is predicted based on the respective predicted selection rates and the respective measures of the popularity of the respective sets of criteria.
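The two preceding paragraphs suggest an aggregation over queries. A minimal sketch, assuming the overall prediction is a sum of predicted selection rates weighted by how often each query was presented, follows. The function name, the query strings, and the numbers are hypothetical.

```python
def predict_overall_popularity(
    query_counts: dict[str, int],
    predicted_rates: dict[str, float],
) -> float:
    """Sum over queries of (times the query was presented) x (predicted selection rate).

    The weighted-sum form is an assumption; queries with no predicted rate contribute 0.
    """
    return sum(
        count * predicted_rates.get(query, 0.0)
        for query, count in query_counts.items()
    )


# Hypothetical data for a single item.
query_counts = {"us history": 1000, "civil war": 200}
predicted_rates = {"us history": 0.05, "civil war": 0.10}
print(predict_overall_popularity(query_counts, predicted_rates))
```

With these numbers the item is expected to receive roughly 70 selections: about 50 from the first query and about 20 from the second.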
The rank of the item in a list of items relevant to the set of criteria and ranked according to a relevance metric is determined.
The number of times that the item was selected by a user during a pre-determined period of time is determined.
The collection of information comprises a catalog of information resources available on a public network.
The collection of information comprises a catalog of information available on the World Wide Web.
Data concerning selection of the item by users is collected. An anti-spoof criterion is applied to the data. The actual overall popularity of the item is decreased based on the results of applying the anti-spoof criterion to the data.
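The text does not say what the anti-spoof criterion is. As one hypothetical example, the sketch below caps the number of selections counted from any single source, so that repeated selections by one user or script cannot inflate the item's actual popularity. The function name, the source identifiers, and the cap of 3 are assumptions.

```python
from collections import Counter


def adjusted_popularity(selection_sources: list[str], max_per_source: int = 3) -> int:
    """Count selections, crediting at most `max_per_source` from each source.

    `selection_sources` holds one source identifier per recorded selection.
    """
    per_source = Counter(selection_sources)
    return sum(min(n, max_per_source) for n in per_source.values())


# Two ordinary users plus one source that selected the item 50 times.
log = ["alice", "bob"] + ["bot"] * 50
print(len(log))                  # 52 raw selections
print(adjusted_popularity(log))  # 5 after the cap
```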
Respective first measures of the usefulness of respective other items in the collection of information with respect to the first set of criteria are determined. Respective measures of the quality of the respective other items are determined. Respective second measures of usefulness of the respective other items are determined based on the respective first measures of usefulness and the respective measures of quality. The item and the other items are displayed ranked according to the respective second measures of usefulness.
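The ranking step can be sketched as follows. The text says only that the second measure is based on the first measure and the quality measure; multiplying them is an assumption made here for illustration, and the `Item` class, field names, and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    first_usefulness: float  # e.g. a relevance-based score for the query
    quality: float           # e.g. actual popularity relative to predicted


def second_usefulness(item: Item) -> float:
    """Combine the two measures; the product is an assumed combination rule."""
    return item.first_usefulness * item.quality


def rank_items(items: list[Item]) -> list[Item]:
    """Return items ordered from most to least useful by the second measure."""
    return sorted(items, key=second_usefulness, reverse=True)


# Two items with equal first-measure scores but different quality.
items = [
    Item("term paper", first_usefulness=0.8, quality=0.6),
    Item("encyclopedia article", first_usefulness=0.8, quality=1.5),
]
print([item.name for item in rank_items(items)])
```

Both items tie on the first measure, so the quality measure alone decides the order and the encyclopedia article is listed first.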
Items from a collection of information are displayed ranked according to a relevance metric that is different from the second measures of usefulness.
The items displayed according to the relevance metric are from a different collection of information than the items displayed according to respective second measures of usefulness.
The first set of criteria is based on a search criterion received from a user.
The first

Affiliated with

Beeferman Douglas H.

Inventor

Golding Andrew R.

Inventor

Also associated with

Fish & Richardson P.C.

Law Firm

Lycos, Inc.

Corporate Assignee

Metjahic Safet

Examiner

Nguyen Cam-Linh

Examiner
