Web forum crawler

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

07599931

ABSTRACT:
A crawling system crawls a web site initially in a pattern detection phase and subsequently in a pattern usage phase. The pattern detection phase attempts to identify patterns of references to pages that contain informational content of interest and patterns of references to pages that contain little informational content of interest. During the pattern usage phase, the crawling system crawls the web site. When the crawling system encounters a reference contained on an accessed page, the crawling system determines whether the reference matches a reference pattern. If the reference matches a reference pattern associated with pages that contain informational content of interest, the crawling system accesses the referenced page. If, however, the reference matches a reference pattern of pages with little informational content, then the crawling system discards that reference without accessing the referenced page.
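
The pattern usage phase described above can be illustrated with a minimal Python sketch. It is not the patented implementation. The regular expressions, the in-memory SITE map, and the names classify and crawl are all hypothetical, and the patterns are hard-coded stand-ins for whatever the pattern detection phase would produce. The abstract does not say how a reference that matches neither kind of pattern is handled; this sketch assumes such references are followed.

```python
import re
from collections import deque
from typing import Callable, Iterable, List

# Hypothetical reference patterns, standing in for the output of the
# pattern detection phase (the abstract does not specify their form).
FOLLOW_PATTERNS = [re.compile(r"/thread\.php\?id=\d+$")]    # pages of interest
DISCARD_PATTERNS = [re.compile(r"/(login|profile)\.php")]   # little content

def classify(reference: str) -> str:
    """Return 'follow', 'discard', or 'unknown' for a reference."""
    if any(p.search(reference) for p in DISCARD_PATTERNS):
        return "discard"
    if any(p.search(reference) for p in FOLLOW_PATTERNS):
        return "follow"
    return "unknown"

def crawl(start: str, fetch_links: Callable[[str], Iterable[str]]) -> List[str]:
    """Breadth-first crawl that never accesses a discarded reference."""
    seen = {start}
    queue = deque([start])
    accessed = []
    while queue:
        url = queue.popleft()
        accessed.append(url)
        for ref in fetch_links(url):  # references found on the accessed page
            # Assumption: 'unknown' references are followed like 'follow' ones.
            if ref in seen or classify(ref) == "discard":
                continue
            seen.add(ref)
            queue.append(ref)
    return accessed

# Toy forum held in memory so the sketch runs without network access.
SITE = {
    "/index.php": ["/thread.php?id=1", "/login.php", "/thread.php?id=2"],
    "/thread.php?id=1": ["/profile.php?u=7", "/thread.php?id=2"],
    "/thread.php?id=2": [],
}

pages = crawl("/index.php", lambda url: SITE.get(url, []))
print(pages)
```

In this toy site the login and profile references are dropped before any fetch is attempted, which is the saving the abstract describes: low-value pages cost nothing beyond a pattern match.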

