System and method for searching information stored on a network

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000

active

06567800

ABSTRACT:

BACKGROUND OF THE INVENTION
The field of the invention is searching, and in particular searching for information stored in a set of websites.
A website (“site”) is defined herein as a collection of files stored on a computer (e.g., a server) that is connected to a network. The World Wide Web (WWW) is a collection of websites whose servers are interconnected through the Internet. A collection of websites can also be stored on servers that are interconnected through a private network, e.g., through an intranet.
In many cases, at least some of the files of a website contain hyperlinks. A hyperlink is typically a text, graphic or image object in a first file that, when selected by a user, either causes a second file to be displayed to the user, causes a different part of the first file to be displayed to the user, or executes a program. In this way, a file in a website can be interrelated with another file stored at the same website, a different website, or elsewhere. The interrelated files of a single website usually reflect a common theme, such as information about a particular company, activity, or service.
The amount of information stored in a collection of websites can be substantial. For example, the WWW includes over 600,000 websites. Conservatively assuming an average data size of 2 Megabytes (MB) per website, the WWW includes over 1,200 billion bytes of information (600,000 × 2 MB, or about 1.2 terabytes) across a wide range of topics. Finding a particular piece of information in such a large collection can be problematic. For example, simple browsing through the websites in search of a particular type of information can be impractical in a website collection of substantial size.
One known system addresses the problem of finding particular information stored at websites by categorizing websites according to the topic or topics to which they pertain. One such known system is the Yahoo! search engine located at <http://www.yahoo.com>. Yahoo! obtains information about the topic or theme to which a website pertains along with a brief narrative describing the contents of the website (e.g., from the administrator or owner of the website). This information (along with a website identifier) is then correlated with a category. The Yahoo! categories are organized hierarchically, so that a given category typically has one or more subcategories, and each such subcategory has further subcategories, etc.
An example of a Yahoo! interface is shown in FIG. 1. An example of a category is Arts & Humanities 101, which has subcategories Literature 102 and Photography 103. When a user selects the Literature subcategory 102, Yahoo! displays the page shown in FIG. 2 to the user. FIG. 2 shows numerous subcategories 201 of the Literature subcategory 102. Hereinafter, the term “category” will be used interchangeably with the term “subcategory.”
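The hierarchical arrangement of categories described above can be pictured as a tree in which each node holds its subcategories and the identifiers of the websites correlated with it. The following is only a minimal sketch under stated assumptions: the class name Category, its fields, the site identifiers, and the "Poetry" subcategory are hypothetical and are not taken from Yahoo! or from the figures.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """A node in a hypothetical category tree (names are illustrative)."""
    name: str
    subcategories: list[Category] = field(default_factory=list)
    site_ids: list[str] = field(default_factory=list)  # websites correlated with this category

    def add(self, child: Category) -> Category:
        """Attach a subcategory and return it so calls can be chained."""
        self.subcategories.append(child)
        return child

    def find(self, name: str) -> Category | None:
        """Depth-first lookup of a category or subcategory by name."""
        if self.name == name:
            return self
        for child in self.subcategories:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def all_site_ids(self) -> list[str]:
        """Websites in this category and in every subcategory beneath it."""
        ids = list(self.site_ids)
        for child in self.subcategories:
            ids.extend(child.all_site_ids())
        return ids


# Assumed sample data: the site identifiers and "Poetry" are made up.
root = Category("Arts & Humanities")
literature = root.add(Category("Literature", site_ids=["site-a"]))
root.add(Category("Photography", site_ids=["site-b"]))
literature.add(Category("Poetry", site_ids=["site-c"]))

print(literature.all_site_ids())
```

Because the text treats "category" and "subcategory" interchangeably, the sketch uses a single node type for both.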
Yahoo! also accommodates keyword searching. In FIG. 2, a user has entered a search for the keyword “telephone” 202 that is restricted 203 to websites in the Literature category. In this case, the user may be interested in finding literature where the telephone plays a major role. When the search button 204 is selected, only website descriptions, and not website content, that fall under the category “Literature” are searched for the term “telephone.” Website descriptions are generally terse, one line or one paragraph summaries describing the content of the website. A website description cannot fully capture all of the detail contained in the website's content. Indeed, by definition, it is a summary. Because only the descriptions are keyword searched, and not the content, a Yahoo! keyword search can disadvantageously miss relevant content even when the keyword search is limited to website descriptions in a relevant category. Websites whose descriptions contain the term “telephone” are displayed to the user, as shown in FIG. 3.
As discussed above, because Yahoo! keyword searches only search the descriptions of websites and not their content, a Yahoo! keyword search can miss identifying websites that contain information relevant to the user's request. Thus, for example, many files at different websites in the Literature category may well contain the keyword “telephone.” Unless the keyword also appears in a website's description, none of these would be detected and displayed to the user by Yahoo!, even though the user is interested in finding occurrences of “telephone” in websites that fall within the Literature category. In this way, the Yahoo!-type category/descriptive information search is overly narrow, and is prone to miss detecting information that the user would be interested in seeing.
Another known system for searching for information at websites stores and indexes a vast amount of content from numerous websites, but does not correlate website content with categories. One such known system is the AltaVista™ search engine, located at <http://www.altavista.digital.com>. In AltaVista™, a user submits a keyword search. FIG. 4 shows the AltaVista™ interface in which a user has submitted a keyword search request for the term “AT&T” 401. In response, AltaVista™ searches its stored content for occurrences of the term “AT&T”, and shows the user the websites that have content in which the term occurs (402). Some excerpted content (e.g., 403) is also displayed. It is difficult for the user to efficiently and accurately identify websites that have content of interest to the user.
Just as the Yahoo!-type search can be too narrow, the AltaVista™-type content search can be too broad. For example, the results for the keyword search shown in FIG. 4 include over 300,000 websites 404. Even when the results are organized in some prioritized fashion (e.g., websites with the greatest number of occurrences of the keyword term are listed first), such a broad result is too large to be very useful to the user.
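The prioritization mentioned above, in which websites with the greatest number of keyword occurrences are listed first, can be sketched as follows. This is an illustrative sketch only and is not AltaVista™'s actual method: the function names, the in-memory dictionary standing in for stored content, and the sample websites are all assumptions.

```python
import re


def count_occurrences(text: str, keyword: str) -> int:
    """Case-insensitive count of non-overlapping occurrences of keyword in text."""
    return len(re.findall(re.escape(keyword), text, flags=re.IGNORECASE))


def content_search(site_content: dict[str, str], keyword: str) -> list[tuple[str, int]]:
    """Return (site_id, count) for every site whose content contains keyword.

    Results are ordered by descending count; ties are broken by site_id.
    No category information is consulted, so every stored site is scanned.
    """
    hits = [(sid, count_occurrences(text, keyword)) for sid, text in site_content.items()]
    hits = [(sid, n) for sid, n in hits if n > 0]
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Hypothetical stored content, keyed by a made-up website identifier.
SITE_CONTENT = {
    "phone-history.example": "AT&T built long-distance lines. AT&T later divested.",
    "stock-tips.example": "Buy AT&T?",
    "gardening.example": "Roses need sun.",
}

print(content_search(SITE_CONTENT, "AT&T"))
```

Ordering the hits does not shrink the result set. With hundreds of thousands of matching websites the list remains just as long, which is the drawback noted above.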
Searching by category and then using a keyword search to search the descriptive information about websites within a category can be too narrow, and miss detecting websites that have content that is relevant to the user's request. On the other hand, keyword searching of only the content of websites can be too broad. A way is needed to take advantage of the narrowing effect of a category search and the depth of a content search to yield a more accurate and complete search result.
SUMMARY OF THE INVENTION
In accordance with an embodiment of the present invention, websites are searched for desired information first by narrowing the scope of the search by identifying websites that correspond with a category pertinent to the desired information. Next, a keyword search is carried out on the content (not just the descriptions or summaries of content) of websites that fall within the pertinent category. This is advantageously more efficient than searching all of the content of the universe of websites initially, because such a search often disadvantageously returns too many results, many of which can be irrelevant (e.g., as in AltaVista™). Likewise, it provides higher resolution than simply performing a category search, which can fail to identify websites within the category that have the most relevant information. It also provides higher resolution than narrowing the field of websites by category, and then performing a keyword search on website descriptions or content summaries, e.g., as in Yahoo!, which can miss relevant information that is included in the content itself, but not in the description or summary. The present invention advantageously combines the efficiency and accuracy of category and content searching to provide a more efficient, better way of finding the information most relevant to a user's need in a set of websites.
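The two-step search summarized above, together with the two prior approaches it is compared against, can be sketched as follows. This is a minimal sketch under stated assumptions and not the patented implementation: the Site record, the function names, the simple substring matching, and the sample websites are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record for one website (all fields are illustrative)."""
    site_id: str
    categories: set[str]
    description: str  # terse summary of the website
    content: str      # text of the website's files


def in_category(sites: list[Site], category: str) -> list[Site]:
    """Step 1: keep only websites that correspond with the category."""
    return [s for s in sites if category in s.categories]


def contains(text: str, keyword: str) -> bool:
    """Case-insensitive substring match (a stand-in for a real keyword index)."""
    return keyword.lower() in text.lower()


def description_search(sites: list[Site], category: str, keyword: str) -> list[str]:
    """Category filter, then keyword match on descriptions only."""
    return [s.site_id for s in in_category(sites, category) if contains(s.description, keyword)]


def content_only_search(sites: list[Site], keyword: str) -> list[str]:
    """Keyword match on the content of every website, with no category filter."""
    return [s.site_id for s in sites if contains(s.content, keyword)]


def category_content_search(sites: list[Site], category: str, keyword: str) -> list[str]:
    """Category filter, then (step 2) keyword match on the full content."""
    return [s.site_id for s in in_category(sites, category) if contains(s.content, keyword)]


# Made-up sample websites.
SITES = [
    Site("novels.example", {"Literature"}, "Reviews of modern novels.",
         "In this novel the telephone rings at midnight."),
    Site("plays.example", {"Literature"}, "A play about the telephone.",
         "The telephone sits center stage."),
    Site("telecom.example", {"Business"}, "Telephone equipment vendor.",
         "We sell telephone handsets."),
    Site("poems.example", {"Literature"}, "Classic poetry archive.",
         "Odes to autumn."),
]

print(description_search(SITES, "Literature", "telephone"))
print(content_only_search(SITES, "telephone"))
print(category_content_search(SITES, "Literature", "telephone"))
```

In this sample, the description-only search misses novels.example, whose content mentions the telephone but whose description does not. The content-only search also returns telecom.example, which lies outside the Literature category. The combined search returns just the two Literature websites whose content contains the keyword.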

