User-initiated maintenance of document locators

Electrical computers and digital processing systems: multicomput – Distributed data processing – Client/server

Reexamination Certificate


Details

C709S219000, C709S223000, C707S793000, C707S793000

active

06529939

ABSTRACT:

TECHNICAL FIELD
This invention relates to Internet technology, and more particularly to the maintenance of a repository of summary data containing document locators such as uniform resource locators (URLs), for example a repository associated with a search engine.
BACKGROUND OF THE INVENTION
To conduct a search on the Internet, a user typically queries a search engine (such as Yahoo or Hotbot) to find a desired piece of information, which is contained in a document stored on a web site. The search engine typically does not keep copies of documents, but instead keeps an indexed repository of summary data (also called metadata) containing links (also called hyperlinks, or uniform resource locators (URLs)) to the documents. The summary data is generated by using a gatherer to “crawl” a web site and analyze its content.
When the user queries a search engine, the search engine returns a list of results matching the query terms. Each result contains the URL for a document as well as an abstract. The user then clicks on the URL to go to the document. When the search engine repository is out of date, the URLs presented in the results may not be valid, and in such a case an error code is returned. A URL may be invalid for several reasons: for example, the document may no longer exist, or it may have been moved to another location. This problem of an invalid URL is often known as a “broken link.”
Broken links are frustrating to the user, as it often takes a significant amount of time for the error message to be returned. If a user is frustrated enough times, he or she will likely become dissatisfied with the search engine and use a different one. Thus the quality of a search engine can be measured by how up to date it is, or put another way, what percentage of broken links it has in its repository.
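The quality measure described above, the percentage of broken links in a repository, can be illustrated with a minimal sketch. The function name, the input mapping of URLs to HTTP status codes, and the choice of 404 and 410 as "broken" are assumptions made for illustration; the text itself only says that an error code is returned.

```python
from typing import FrozenSet, Mapping

# Assumption: these status codes mark a broken link. The source text does
# not enumerate which error codes count.
ASSUMED_BROKEN: FrozenSet[int] = frozenset({404, 410})


def broken_link_percentage(
    statuses: Mapping[str, int],
    broken: FrozenSet[int] = ASSUMED_BROKEN,
) -> float:
    """Return the share of URLs (0-100) whose last status marks them broken."""
    if not statuses:
        return 0.0  # an empty repository has no broken links
    broken_count = sum(1 for code in statuses.values() if code in broken)
    return 100.0 * broken_count / len(statuses)
```

A lower value would indicate a more up-to-date repository under this measure.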
Web sites change at a rapid pace. Because the web sites have no control over where the summary data of their documents is stored, there is no way of notifying anyone about any changes. Thus search engines currently maintain their repositories by periodically recrawling their resources (i.e., the web sites which they have summarized). One way to maintain a repository by recrawling is described in U.S. Pat. No. 5,855,020 to Kirsch. Kirsch monitors a dynamic network feed for new URLs and validates them before adding them to a repository. He also revalidates the repository by periodically assessing the validity of each URL, with the period determined by an associated volatility for each URL. This recrawling is costly in terms of time and resources, and thus cannot be performed often enough to keep up with the rapid pace of change.
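To make the idea of volatility-driven revalidation concrete, the following sketch picks the URLs that are due for a recheck. It is not taken from the Kirsch patent: the text does not say how volatility maps to a period, so the inverse-proportional rule, the units (days), the bounds, and all names here are purely illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Entry:
    url: str
    volatility: float    # assumed unit: expected changes per day
    last_checked: float  # assumed unit: days since some fixed origin


def revalidation_period(volatility: float,
                        max_days: float = 30.0,
                        min_days: float = 1.0) -> float:
    """Assumed rule: period = 1 / volatility, clamped to [min_days, max_days]."""
    if volatility <= 0:
        return max_days  # never-changing URLs get the longest period
    return max(min_days, min(max_days, 1.0 / volatility))


def due_for_recheck(entries: Iterable[Entry], now: float) -> List[str]:
    """Return URLs whose revalidation period has elapsed at time `now`."""
    return [
        e.url
        for e in entries
        if now - e.last_checked >= revalidation_period(e.volatility)
    ]
```

Even with such scheduling, every entry must eventually be rechecked by the crawler, which is the cost the passage above points to.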
Thus it is desirable to have a way to reduce or avoid URL maintenance recrawling by dynamically detecting and eliminating broken links in order to maintain a search engine (or other Internet) data repository.
SUMMARY OF THE INVENTION
A method is described for maintaining a repository of summary data about documents associated with document locators, where the repository is stored separately from the documents and contains the document locators. When a user requests a document associated with one of the document locators, the method comprises: requesting the document associated with the document locator; receiving a result based on the request; analyzing the result to determine the validity of the document locator; and requesting an update of the repository based on the validity of the document locator. In a network environment, the document locators are uniform resource locators, or URLs. The repository update may take the form of deleting the URL from the repository, or moving it to another location for further examination.
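The four steps of the summarized method (request, receive, analyze, request update) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the Repository class, the follow_link function, the injected fetch callable, and the choice of 404 and 410 as invalid results are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet

# Assumption: these status codes indicate an invalid document locator.
ASSUMED_INVALID: FrozenSet[int] = frozenset({404, 410})


@dataclass
class Repository:
    """Toy stand-in for a repository of summary data keyed by URL."""
    summaries: Dict[str, str] = field(default_factory=dict)   # URL -> abstract
    quarantine: Dict[str, str] = field(default_factory=dict)  # held for review

    def request_update(self, url: str, delete: bool = False) -> None:
        """Delete the URL, or move it aside for further examination."""
        abstract = self.summaries.pop(url, None)
        if abstract is not None and not delete:
            self.quarantine[url] = abstract


def follow_link(url: str,
                fetch: Callable[[str], int],
                repo: Repository,
                delete: bool = False) -> int:
    """Run when a user clicks a search result; returns the status code."""
    status = fetch(url)                          # request the document, receive a result
    if status in ASSUMED_INVALID:                # analyze the result for validity
        repo.request_update(url, delete=delete)  # request a repository update
    return status
```

Passing fetch in as a parameter keeps the sketch free of network calls; in practice it would wrap whatever HTTP client handles the user's request.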


REFERENCES:
patent: 5649186 (1997-07-01), Ferguson
patent: 5727158 (1998-03-01), Bouziane et al.
patent: 5751956 (1998-05-01), Kirsch
patent: 5774664 (1998-06-01), Hidary et al.
patent: 5778368 (1998-07-01), Hogan et al.
patent: 5787424 (1998-07-01), Hill et al.
patent: 5802518 (1998-09-01), Karaev et al.
patent: 5809317 (1998-09-01), Kogan et al.
patent: 5812769 (1998-09-01), Graber et al.
patent: 5855015 (1998-12-01), Shoham
patent: 5855020 (1998-12-01), Kirsch
patent: 5860071 (1999-01-01), Ball et al.
patent: 5864863 (1999-01-01), Burrows
patent: 5870546 (1999-02-01), Kirsch
patent: 5892909 (1999-04-01), Grasso et al.
patent: 6070158 (2000-05-01), Kirsch et al.
patent: 6226648 (2001-05-01), Appleman et al.
patent: 6321251 (2001-11-01), Deisinger et al.
International Business Machines Corporation, “Virtual URLs for Browsing and Searching Large Information Spaces”, Research Disclosure, Sep. 1998, pp. 1238-1239.
