Minimizing visibility of stale content in web searching...

Data processing: database and file management or data structures – Database and file access – Search engines

Reexamination Certificate

Details

C707S706000

active

07987172

ABSTRACT:
A method and system is disclosed for associating an appropriate web crawl interval with a document so that the probability of the document's stale content being used by a search engine is below an acceptable level when the search engine crawls the document at its associated web crawl interval. The web crawl interval of a document is determined through an iterative process and updated dynamically by the search engine after every visit to the document by a web crawler. A multi-tier data structure is employed for managing the web crawl order of billions of documents on the Internet. The search engine may move a document from one tier to another if its web crawl interval is changed significantly.
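
The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: an interval that is revised after every crawler visit, and a tiered structure that a document moves through as its interval changes. Everything concrete here is an assumption made for illustration, not taken from the patent: the names `CrawlRecord`, `tier_for`, and `update_after_crawl`, the tier labels and boundaries, the clamping limits, and the multiplicative shrink/grow rule.

```python
from dataclasses import dataclass

DAY = 86_400.0  # seconds

# Hypothetical tier boundaries (upper bound on the crawl interval, in seconds).
# The patent abstract only says that several tiers exist.
TIER_LIMITS = [("daily", 1 * DAY), ("weekly", 7 * DAY)]
BASE_TIER = "base"

# Hypothetical clamps so the interval never collapses to zero or grows unbounded.
MIN_INTERVAL = 3_600.0
MAX_INTERVAL = 90 * DAY


@dataclass
class CrawlRecord:
    url: str
    interval: float          # seconds between scheduled crawls
    tier: str = BASE_TIER


def tier_for(interval: float) -> str:
    """Return the first tier whose upper bound covers the interval."""
    for name, limit in TIER_LIMITS:
        if interval <= limit:
            return name
    return BASE_TIER


def update_after_crawl(record: CrawlRecord, content_changed: bool,
                       shrink: float = 0.5, grow: float = 1.5) -> CrawlRecord:
    """Revise the interval after one crawler visit, then re-tier the document.

    Assumed rule: crawl sooner if the content changed since the last visit,
    otherwise back off. The new interval is clamped to the allowed range.
    """
    factor = shrink if content_changed else grow
    record.interval = min(MAX_INTERVAL, max(MIN_INTERVAL, record.interval * factor))
    record.tier = tier_for(record.interval)
    return record


if __name__ == "__main__":
    doc = CrawlRecord("http://example.com/", interval=2 * DAY)
    doc.tier = tier_for(doc.interval)
    for changed in (True, True, False):
        update_after_crawl(doc, changed)
        print(doc.url, doc.interval / DAY, doc.tier)
```

In this sketch a document changes tier as soon as its interval crosses a boundary. The abstract says a move happens only when the interval "is changed significantly", so a real scheduler would probably add some hysteresis around each boundary.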

REFERENCES:
patent: 4312009 (1982-01-01), Lange
patent: 5521140 (1996-05-01), Matsuda et al.
patent: 5594480 (1997-01-01), Sato et al.
patent: 5634062 (1997-05-01), Shimizu et al.
patent: 5801702 (1998-09-01), Dolan et al.
patent: 5832494 (1998-11-01), Egger et al.
patent: 5898836 (1999-04-01), Freivald et al.
patent: 6003060 (1999-12-01), Aznar et al.
patent: 6012087 (2000-01-01), Freivald et al.
patent: 6049804 (2000-04-01), Burgess et al.
patent: 6068363 (2000-05-01), Saito
patent: 6189019 (2001-02-01), Blumer et al.
patent: 6219818 (2001-04-01), Freivald et al.
patent: 6243091 (2001-06-01), Berstis
patent: 6263350 (2001-07-01), Wollrath et al.
patent: 6263364 (2001-07-01), Najork et al.
patent: 6269370 (2001-07-01), Kirsch
patent: 6285999 (2001-09-01), Page
patent: 6321265 (2001-11-01), Najork et al.
patent: 6336123 (2002-01-01), Inoue et al.
patent: 6351755 (2002-02-01), Najork et al.
patent: 6377984 (2002-04-01), Najork et al.
patent: 6404446 (2002-06-01), Bates et al.
patent: 6418433 (2002-07-01), Chakrabarti et al.
patent: 6418453 (2002-07-01), Kraft et al.
patent: 6424966 (2002-07-01), Meyerzon et al.
patent: 6547829 (2003-04-01), Meyerzon et al.
patent: 6594662 (2003-07-01), Sieffert et al.
patent: 6631369 (2003-10-01), Meyerzon et al.
patent: 6638314 (2003-10-01), Meyerzon et al.
patent: 6701350 (2004-03-01), Mitchell
patent: 6751612 (2004-06-01), Schuetze et al.
patent: 6763362 (2004-07-01), McKeeth
patent: 6772203 (2004-08-01), Feiertag et al.
patent: 6950874 (2005-09-01), Chang et al.
patent: 6952730 (2005-10-01), Najork et al.
patent: 7047491 (2006-05-01), Schubert et al.
patent: 7080073 (2006-07-01), Jiang et al.
patent: 7139747 (2006-11-01), Najork
patent: 7171619 (2007-01-01), Bianco
patent: 7200592 (2007-04-01), Goodwin et al.
patent: 7231606 (2007-06-01), Miller et al.
patent: 7260543 (2007-08-01), Saulpaugh et al.
patent: 7299219 (2007-11-01), Green et al.
patent: 7308643 (2007-12-01), Zhu et al.
patent: 7310632 (2007-12-01), Meek et al.
patent: 7346839 (2008-03-01), Acharya et al.
patent: 7725452 (2010-05-01), Randall
patent: 2002/0010682 (2002-01-01), Johnson
patent: 2002/0052928 (2002-05-01), Stern et al.
patent: 2002/0065827 (2002-05-01), Christie et al.
patent: 2002/0073188 (2002-06-01), Rawson, III
patent: 2002/0099602 (2002-07-01), Moskowitz et al.
patent: 2002/0129062 (2002-09-01), Luparello
patent: 2003/0061260 (2003-03-01), Rajkumar
patent: 2003/0158839 (2003-08-01), Faybishenko et al.
patent: 2004/0044962 (2004-03-01), Green et al.
patent: 2004/0064442 (2004-04-01), Popovitch
patent: 2004/0128285 (2004-07-01), Green et al.
patent: 2004/0225642 (2004-11-01), Squillante et al.
patent: 2004/0225644 (2004-11-01), Squillante et al.
patent: 2005/0071766 (2005-03-01), Brill et al.
patent: 2005/0086206 (2005-04-01), Balasubramanian et al.
patent: 2006/0036605 (2006-02-01), Powell et al.
patent: 2006/0069663 (2006-03-01), Adar et al.
patent: 2006/0277175 (2006-12-01), Jiang et al.
patent: WO 01/50320 (2001-07-01), None
patent: WO 01/86507 (2001-11-01), None
Cho et al. “Effective Page Refresh Policies for Web Crawlers.” ACM Transactions on Database Systems, vol. 28, No. 4, Dec. 2003, pp. 390-426.
Cho et al. “Estimating Frequency of Change.” ACM Transactions on Internet Technology, 3(3): Aug. 2003. 32 pages.
Brin, S., et al., “The Anatomy of a Large-Scale Hypertextual Search Engine,” Proceedings of the 7th Int'l World Wide Web Conference, Brisbane, Australia, 1998.
Final Office Action issued in U.S. Appl. No. 10/853,627, on May 12, 2008.
Brandman, O., et al., “Crawler-Friendly Web Servers,” ACM Sigmetrics Performance Evaluation Review, vol. 28, Issue 2, Sep. 2000, pp. 9-14.
Cho, J., et al., “Efficient Crawling Through URL Ordering,” Computer Networks and ISDN Systems, vol. 30, Issues 1-7, Apr. 1998, pp. 161-172.
Cho, J., “Crawling the Web: Discovery and Maintenance of Large-Scale Web Data,” PhD Thesis, Dept. Of Computer Science, Stanford University, 2001, 188 pages.
Cho, J., et al., “The Evolution of the Web and Implications for an Incremental Crawler,” Proc. of the 26th VLDB Conf., Cairo, Egypt, 2000, pp. 200-209.
Cho, J., et al., “Synchronizing a Database to Improve Freshness,” MOD 2000, Dallas, Texas, Jun. 2000, pp. 117-128.
Coffman, Jr., E.G., et al., “Optimal Robot Scheduling,” Tech. Rep. RR-3317, 1997, 19 pages.
Introna, L., et al., “Defining the Web: the Politics of Search Engines,” Computer, vol. 33, Issue 1, Jan. 2000, pp. 54-62.
Klemm, R.P., “WebCompanion: A Friendly Client-Side Web Prefetching Agent,” IEEE Transactions on Knowledge and Data Engineering, vol. 11, No. 4, Jul./Aug. 1999, pp. 577-594.
Lee, J.K.W., et al., “Intelligent Agents for Matching Information Providers and Consumers on the World-Wide Web,” Proc. of the 13th Annual Hawaii Int'l Conf. on System Sciences, 1997, 11 pages.
Pandey, S., et al., “Monitoring the Dynamic Web to Respond to Continuous Queries,” WWW 2003, Budapest, Hungary, May 20-24, 2003, pp. 659-668.
Ali, What's Changed? Measuring Document Change in Web Crawling for Search Engines, SPIRE 2003, LNCS 2857, 2003, pp. 28-42, Springer-Verlag, Berlin, Germany.
Arasu, Searching the Web, ACM Transactions on Internet Technology, vol. 1, No. 1, Aug. 2001, pp. 2-43.
Baeza-Yates, Balancing Volume, Quality and Freshness in Web Crawling, Center for Web Research, Dept. of Computer Science, University of Chile, 2002, pp. 1-10.
Brusilovsky, Map-Based Horizontal Navigation in Education Hypertext, ACM Press, Jun. 2002, pp. 1-10.
Bullot, A Data-Mining Approach for Optimizing Performance of an Incremental Crawler, WI '03, Oct. 13-17, 2003, pp. 610-615.
Douglis, Rate of Change and Other Metrics: a Live Study of the World Wide Web, USENIX Symposium on Internetworking Technologies and Systems, Monterey, CA, Dec. 1997, pp. I and 1-14.
Douglis, The AT&T Internet Difference Engine: Tracking and Viewing Changes on the Web, World Wide Web, vol. 1, No. 1, Mar. 1998, pp. 27-44.
Fetterly, A Large-Scale Study of the Evolution of Web Pages, WWW 2003, Budapest, Hungary, May 20-24, 2003, pp. 669-678.
Haveliwala, Topic-Sensitive PageRank, WWW2002, Honolulu, HI, May 7-11, 2002, 10 pages.
Henzinger, Web Information Retrieval—an Algorithmic Perspective, ESA 2000, LNCS 1879, 2000, pp. 1-8, Springer-Verlag, Berlin, Germany.
Heydon, Mercator: A Scalable, Extensible Web Crawler, World Wide Web, vol. 2, No. 4, Dec. 1999, pp. 219-229.
Hirai, WebBase: a Repository of Web Pages, Computer Networks, vol. 33, Jun. 2000, pp. 277-293.
Jeh, Scaling Personalized Web Search, WWW2003, Budapest, Hungary, May 20-24, 2003, pp. 271-279.
Kamvar, Exploiting the Block Structure of the Web for Computing PageRank, Stanford University Technical Report, 2003, 13 pages.
Najork, Breadth-First Search Crawling Yields High-Quality Pages, WWW10, May 1-5, 2001, pp. 114-118.
Shkapenyuk, Design and Implementation of a High-Performance Distributed Web Crawler, ICDE '02, San Jose, CA, Feb. 26-Mar. 1, 2002, pp. 357-368.
Suel, Odissea: A Peer-to-Peer Architecture for Scalable Web Search and Information Retrieval, WebDB, San Diego, CA, Jun. 12-13, 2003, pp. 1-6.
Wolf, Optimal Crawling Strategies for Web Search Engines, WWW 2002, Honolulu, Hawaii, May 7-11, 2002, pp. 136-147.
Offic


Profile ID: LFUS-PAI-O-2700308
