Methods and apparatus for assessing web page decay

Data processing: database and file management or data structures – Database and file access – Search engines

Reexamination Certificate

Details

active

07818312

ABSTRACT:
A signal-bearing medium is disclosed that includes operations including establishing a link threshold, wherein a web page will be assessed as lacking currency if a percentage of hyperlinks contained in the web page that link to an active page is less than the link threshold, accessing a web page containing hyperlinks, and testing the hyperlinks. Testing includes: selecting a hyperlink; and monitoring a number of redirects encountered by following the selected hyperlink until a final web page is reached or a failure occurs, and assessing the selected hyperlink as linking to a dead web page if a redirect limit is exceeded, the redirect limit greater than one, wherein exceeding the redirect limit causes occurrence of a failure. The operations also include calculating a percentage of hyperlinks that return active web pages, and comparing the percentage of hyperlinks that return active web pages with the link threshold.
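The operations in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `fetch` callback (which stands in for an HTTP client so the example runs offline), the default redirect limit of 5, and the treatment of a page with no hyperlinks are all assumptions made for illustration. The abstract itself only requires a redirect limit greater than one and a comparison of the active-link percentage against a link threshold.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical fetcher: given a URL, return (status_code, redirect_target).
# redirect_target is None unless the response is a redirect.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def link_is_active(url: str, fetch: Fetch, redirect_limit: int = 5) -> bool:
    """Follow a hyperlink, counting redirects, until a final page or a failure."""
    if redirect_limit <= 1:
        # The abstract specifies a redirect limit greater than one.
        raise ValueError("redirect_limit must be greater than one")
    redirects = 0
    while True:
        try:
            status, target = fetch(url)
        except Exception:
            return False  # any fetch failure counts as a dead link
        if 300 <= status < 400 and target is not None:
            redirects += 1
            if redirects > redirect_limit:
                return False  # exceeding the limit is itself a failure
            url = target
            continue
        # Final page reached: active only on a success status.
        return 200 <= status < 300


def percent_active(links: Iterable[str], fetch: Fetch,
                   redirect_limit: int = 5) -> float:
    """Percentage of hyperlinks that return an active page."""
    links = list(links)
    if not links:
        return 100.0  # assumption: a page with no links has nothing dead
    active = sum(link_is_active(u, fetch, redirect_limit) for u in links)
    return 100.0 * active / len(links)


def lacks_currency(links: Iterable[str], fetch: Fetch, link_threshold: float,
                   redirect_limit: int = 5) -> bool:
    """True if the active-link percentage falls below the link threshold."""
    return percent_active(links, fetch, redirect_limit) < link_threshold
```

In practice `fetch` would wrap a real HTTP client with automatic redirect following disabled, so that each hop is counted explicitly.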

REFERENCES:
patent: 5860071 (1999-01-01), Ball et al.
patent: 7707229 (2010-04-01), Tiyyagura
patent: 2004/0064471 (2004-04-01), Brown et al.
patent: 2005/0256860 (2005-11-01), Eiron et al.
Bar-Yossef et al., “Sic Transit Telae: Towards an Understanding of the Web's Decay,” May 17-22, 2004, 328-337.
Eiron et al., “Ranking the Web Frontier,” May 17-22, 2004, 309-318.
Michel, “Dead Link Check,” (http://web.archive.org/web/20030806074242/http://dlc.sourceforge.net/dlc-0.4.0.html), Aug. 6, 2003, 1-8.
Hausherr, “Xenu's Link Sleuth” (http://web.archive.org/web/20031003215928/http://home.snafu.de/tilman/xenulink.html), Oct. 3, 2003, 1-11.
Fielding et al., “Hypertext Transfer Protocol—HTTP/1.1” (http://web.archive.org/web/20030811234917/http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html), Aug. 11, 2003, 1-11.
Yahoo Crawler, “Why is your crawler asking for strange URLs that have never existed on my site?” (http://help.yahoo.com/l/au/yahoo7/search/webcrawler/slurp-10.html), Aug. 24, 2007, 1.
W. Aiello, F. Chung, and L. Lu. A random graph model for power law graphs. Experimental Mathematics, 10:53-66, 2001.
Z. Bar-Yossef, A. Berg, S. Chien, J. Fakcharoenphol, and D. Weitz. Approximating aggregate queries about web pages via random walks. In Proceedings of the 26th International Conference on Very Large Databases, pp. 535-544, 2000.
A.-L. Barabasi and R. Albert, Emergence of scaling in random networks, Science, 286:509-512, 1999.
K. Bharat, A. Broder, M. Henzinger, P. Kumar, and S. Venkatasubramanian. The connectivity server: Fast access to linkage information on the Web. In Proceedings of the 7th International World Wide Web Conference, pp. 104-111, 1998.
K. Bharat and M. Henzinger, Improved algorithms for topic distillation in a hyperlinked environment, In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 104-111, 1998.
B. Brewington and G. Cybenko, How dynamic is the web? In Proceedings of the Ninth International World Wide Web Conference, pp. 257-276, May 2000.
S. Brin and L. Page, The anatomy of a large-scale hypertextual Web search engine, In Proceedings of the 7th International World Wide Web Conference, pp. 107-117, 1998.
A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig, Syntactic clustering of the Web, In Proceedings of the 6th International World Wide Web Conference, pp. 391-404, 1997.
A. Z. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener, Graph structure in the web, WWW9/Computer Networks, 33(1-6):309-320, 2000.
A. Z. Broder, R. Lempel, F. Maghoul, and J. Pedersen, Efficient Pagerank approximation via graph aggregation, Manuscript, May 17, 2004.
S. Chakrabarti, M. van den Berg, and B. Dom, Focused crawling: a new approach to topic-specific web resource discovery, WWW8/Computer Networks, 31(11-16):1623-1640, 1999.
J. Cho and H. Garcia-Molina. The evolution of the web and implications for an incremental crawler. In Proceedings of the 26th International Conference on Very Large Databases, pp. 200-209, 2000.
F. Douglis, A. Feldmann, B. Krishnamurthy, and J. C. Mogul. Rate of change and other metrics: a live study of the world wide web, In USENIX Symposium on Internet Technologies and Systems, 1997.
B. Edelman. Domains reregistered for distribution of unrelated content: A case study of “Tina's Free Live Webcam”. http://cyber.law.harvard.edu/people/edelman/renewals/, 2002.
D. Fetterly, M. Manasse, M. Najork, and J. L. Wiener. A large-scale study of the evolution of web pages. In Proceedings of the 12th International World Wide Web Conference, pp. 669-678, 2003.
R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. RFC2616: Hypertext Transfer Protocol—HTTP/1.1, http://www.w3.org/Protocols/rfc2616/rfc2616.html, Jun. 1999.
T. Haveliwala. Topic-sensitive PageRank. In Proceedings of the 11th International World Wide Web Conference, pp. 517-526, 2002.
M. Henzinger, A. Heydon, M. Mitzenmacher, and M. Najork. On near-uniform URL sampling. WWW9/Computer Networks, 33(1-6):295-308, 2000.
A. Jesdanun. Internet littered with dead web sites, http://story.news.yahoo.com/news?tmpl=story&u=/ap/20031102/ap_on_hi_te/deadwood_online_1, Nov. 2002.
J. M. Kleinberg. Authoritative sources in a hyperlinked environment, Journal of the ACM, 46(5):604-632, 1999.
R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal. Stochastic models for the web graph. In Proceedings of the 41st IEEE Annual Foundations of Computer Science, pp. 57-65, 2000.
A. Ntoulas, J. Cho, and C. Olston. What's new on the web? The evolution of the web from a search engine perspective. In Proceedings of the 13th International World Wide Web Conference, 2004.
G. Pandurangan, P. Raghavan, and E. Upfal. Using PageRank to characterize web structure. In Computing and Combinatorics: 8th Annual International Conference, pp. 330-339, 2002.
P. Rusmevichientong, D. M. Pennock, S. Lawrence, and C. L. Giles, Methods for sampling pages uniformly from the world wide web, In Proceedings of the AAAI Fall Symposium on Using Uncertainty Within Computation, pp. 121-128, 2001.
J. L. Wolf, M. S. Squillante, P. S. Yu, J. Sethuraman, and L. Ozsen. Optimal crawling strategies for web search engines. In Proceedings of the 11th International World Wide Web Conference, pp. 136-147, 2002.
S. Chakrabarti, B. Dom, D. Gibson, R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins, “Spectral filtering for resource discovery”, In Proceedings of the ACM SIGIR Workshop on Hypertext Analysis, 1998, pp. 13-21.
W. Koehler, “Analysis of web page and web site constancy and permanence”, Journal of the American Society for Information Science, 50(2):162-180, 1999.
W. Koehler, “Digital libraries and world wide web sites and page persistence”, Information Research, 4(4), 1999, 18 pgs.
K. Kokoszkiewicz (a.k.a. Alectorides Conradus), “Vocabula Computatralia”, Anglico-Latinum, University of Warsaw, Centre for Studies on the Classical Tradition in Poland and East-Central Europe (OBTA), http://www.obta.uw.edu.pl/~draco/docs/voccomp.html, Oct. 10, 2004, 8 pgs.
J. Markwell and D. W. Brooks, “Broken links: The ephemeral nature of educational WWW hyperlinks”, Journal of Science Education and Technology, 11(2):105-108, 2002.
J. Markwell and D. W. Brooks, “‘Link rot’ limits the usefulness of web-based educational materials in biochemistry and molecular biology”, Biochemistry and Molecular Biology Education, 31(1):69-72, 2003, 4 pgs.
