Method for segmenting webpages by parsing webpages into...

Data processing: artificial intelligence – Machine learning

Reexamination Certificate


Details

active

07974934

ABSTRACT:
A method of segmenting a webpage into visually and semantically cohesive pieces uses an optimization problem on a weighted graph, where the weights reflect whether two nodes in the webpage's DOM tree should be placed together or apart in the segmentation; the weights are informed by manually labeled data.
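The idea in the abstract can be illustrated with a small sketch. It is not the patented formulation, which this record does not spell out; it only assumes that each pair of DOM nodes carries a hypothetical "together" penalty (paid if the pair is split) and a hypothetical "apart" penalty (paid if the pair is merged), and it finds the lowest-cost segmentation by brute force, which is feasible only for a handful of nodes. The function names, node names, and weights are all illustrative assumptions; in the method described, the weights would be informed by manually labeled data.

```python
from itertools import combinations


def _normalize(weights):
    """Store each pair under a sorted key so (u, v) and (v, u) match."""
    return {tuple(sorted(pair)): w for pair, w in weights.items()}


def segmentation_cost(labels, together, apart):
    """Total penalty of a segmentation (hypothetical objective).

    labels:   dict mapping node name -> segment id
    together: dict mapping (u, v) -> penalty paid if u and v are split
    apart:    dict mapping (u, v) -> penalty paid if u and v are merged
    """
    together, apart = _normalize(together), _normalize(apart)
    cost = 0.0
    for u, v in combinations(sorted(labels), 2):
        if labels[u] == labels[v]:
            cost += apart.get((u, v), 0.0)
        else:
            cost += together.get((u, v), 0.0)
    return cost


def partitions(items):
    """Yield every way to split a list into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Place `first` into each existing group in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + part


def best_segmentation(nodes, together, apart):
    """Exhaustively search for the minimum-cost segmentation."""
    best_labels, best_cost = None, float("inf")
    for part in partitions(list(nodes)):
        labels = {n: i for i, group in enumerate(part) for n in group}
        cost = segmentation_cost(labels, together, apart)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


# Illustrative DOM nodes and made-up weights (assumptions, not patent data).
nodes = ["nav", "headline", "body", "footer"]
together = {("headline", "body"): 5.0}
apart = {
    ("nav", "body"): 3.0,
    ("nav", "headline"): 3.0,
    ("footer", "body"): 2.0,
    ("footer", "headline"): 2.0,
    ("nav", "footer"): 1.0,
}
labels, cost = best_segmentation(nodes, together, apart)
```

With these weights the search groups the headline with the body and leaves the navigation and footer in segments of their own. A real page has far too many nodes for exhaustive search, which is why work such as the graph-cut and clustering papers cited in the references relies on approximation algorithms instead.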

REFERENCES:
patent: 2009/0248608 (2009-10-01), Ravikumar et al.
Nir Ailon, Department of Computer Science, Princeton University, Princeton, NJ, Moses Charikar, Department of Computer Science, Princeton University, Princeton, NJ, Alantha Newman, Department of Computer Science, RWTH, Aachen, Germany, “Aggregating Inconsistent Information: Ranking and Clustering”, STOC '05, May 22-24, 2005, Baltimore, Maryland, USA, Copyright 2005, ACM 1-58113-960-8/05/0005; pp. 684-693.
Shumeet Baluja, Google, Inc., 1600 Amphitheatre Parkway, Mountain View, CA, “Browsing on Small Screens: Recasting Web-Page Segmentation into an Efficient Machine Learning Framework”, Copyright is held by the International World Wide Web Conference Committee (IW3C2), WWW 2006, May 23-26, 2006, Edinburgh, Scotland, ACM 1-59593-323-9/06/0005; 10 pages not numbered on article.
Krishna Bharat, Compaq SRC, 130 Lytton Ave., Palo Alto, CA, Andrei Broder, Alta Vista Company, 1825 S. Grant St., San Mateo, CA, Jeffrey Dean, Google, Inc., 165 University Ave., Palo Alto, CA, Monika R. Henzinger, Compaq SRC, 130 Lytton Ave., Palo Alto, CA, “A Comparison of Techniques to Find Mirrored Hosts on the WWW”, Aug. 25, 1999. This work was presented at the Workshop on Organizing Web Space at the Fourth ACM Conference on Digital Libraries 1999; pp. 1-19.
Yuri Boykov, Olga Veksler, Ramin Zabih, Computer Science Department, Cornell University, Ithaca, NY, “Fast Approximate Energy Minimization via Graph Cuts”, PAMI, 23(11):1222-1239, Nov. 2001; 8 pages not numbered on article.
Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse, Geoffrey Zweig, Systems Research Center, 130 Lytton Avenue, Palo Alto, CA, “Syntactic Clustering of the Web”, SRC Technical Note, 1997-015, Jul. 25, 1997, Copyright 1997 Digital Equipment Corporation, WWW6/Computer Networks, 29(8-13):1157-1166, 1997, pp. 1-13.
Deng Cai, Tsinghua University, Beijing, P.R. China, Shipeng Yu, Peking University, Beijing, P.R. China, Ji-Rong Wen and Wei-Ying Ma, Microsoft Research Asia, “Extracting Content Structure for Web Pages based on Visual Representation”, In 5th Asia Pacific Web Conference, pp. 406-415, 2003, 12 pages not numbered on article.
Deepayan Chakrabarti, Yahoo! Research, 701 First Ave., Sunnyvale, CA; Ravi Kumar, Yahoo! Research, 701 First Ave., Sunnyvale, CA; Kunal Punera, Dept. of ECE, Univ. Texas at Austin, Austin, TX; “Page-level Template Detection via Isotonic Smoothing”, Copyright is held by the International World Wide Web Conference Committee (IW3C2), WWW 2007, May 8-12, 2007, Banff, Alberta, Canada, ACM 978-1-59593-654-7/07/0005, pp. 61-70.
Yu Chen, Xing Xie, Wei-Ying Ma, Hong-Jiang Zhang, Microsoft Research Asia; “Adapting Web Pages for Small-Screen Devices”, Web-page Transformation, Jan.-Feb. 2005, Published by the IEEE Computer Society, 1089-7801/05, Copyright 2005 IEEE, IEEE Internet Computing 9(1):50-56, 2005, pp. 2-8.
David Gibson, IBM Almaden Research Center, 650 Harry Road, San Jose, CA, Kunal Punera, Dept. of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, Andrew Tomkins, IBM Almaden Research Center, 650 Harry Road, San Jose, CA, “The Volume and Evolution of Web Page Templates”, Copyright is held by the International World Wide Web Conference Committee (IW3C2). WWW 2005, May 10-14, 2005, Chiba, Japan, ACM 1-59593-051-5/05/0005. In Proc. 14th WWW (Special interest tracks and posters), 2005, pp. 830-839.
Lawrence Hubert, The University of California, Santa Barbara, CA, Phipps Arabie, University of Illinois at Champaign, “Comparing Partitions”, Journal of Classification 2:193-218 (1985), Springer-Verlag New York Inc., pp. 193-218.
Hung-Yu Kao, Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, ROC, Jan-Ming Ho, Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC, Ming-Syan Chen, Graduate Institute of Communication Engineering and the Electrical Engineering Department, National Taiwan University, Taipei, Taiwan, ROC, “Wisdom: Web Intrapage Informative Structure Mining Based on Document Object Model”, Published online Mar. 17, 2005, IEEECS, Log No. TKDE-0180-0903, 1041-4347/05, Copyright.
Jon Kleinberg, Department of Computer Science, Cornell University, Ithaca, NY, Eva Tardos, Department of Computer Science, Cornell University, Ithaca, NY, “Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields”, Proceedings of the 40th Annual IEEE Symposium on the Foundations of Computer Science, Oct. 1999, J. ACM, 49(5):616-639, 2002, pp. 1-19.
Vladimir Kolmogorov, Computer Science Department, Cornell University, Ithaca, NY, Ramin Zabih, Computer Science Department, Cornell University, Ithaca, NY, “What Energy Functions Can Be Minimized via Graph Cuts?”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, No. 2, Feb. 2004, PAMI, 26(2):147-159, 2004, 0162-8828/04, Copyright 2004 IEEE, Published by the IEEE Computer Society, Reference IEEECS, Log No. 118731, pp. 147-159.
Glenn W. Milligan, Faculty of Management Sciences, Martha C. Cooper, Faculty of Marketing, The Ohio State University, “A Study of the Comparability of External Criteria for Hierarchical Cluster Analysis”, Multivariate Behavioral Research, 1986, 21, 441-458, Oct. 1986, pp. 441-458.
Tom Mitchell, Fredkin Professor of AI and Machine Learning Chair, Machine Learning Department, School of Computer Science, Carnegie Mellon University, “Research: Machine Learning, Computer Science, Cognitive Neuroscience”, McGraw Hill, 1997, Copyright 1996, Tom M. Mitchell and McGraw Hill, 10 pages.
Alexander Strehl, Joydeep Ghosh, Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, “Cluster Ensembles—A Knowledge Reuse Framework for Combining Multiple Partitions”, Editor: Claire Cardie, Copyright 2002, Alexander Strehl and Joydeep Ghosh, Journal of Machine Learning Research (JMLR), 3:583-617, Dec. 2002, pp. 583-617.
Kilian Q. Weinberger, John Blitzer, Lawrence K. Saul, Department of Computer and Information Science, University of Pennsylvania, Levine Hall, 3330 Walnut Street, Philadelphia, PA, “Distance Metric Learning for Large Margin Nearest Neighbor Classification”, In Proc. NIPS 2006, pp. 1473-1480, 2006, 8 pages not numbered on article.
Xinyi Yin, Department of Computer Science, National University of Singapore, Singapore, Wee Sun Lee, Department of Computer Science and Singapore-MIT Alliance, National University of Singapore, Singapore, Copyright is held by the author/owner(s), WWW 2004, May 17-22, 2004, New York, New York, USA, ACM 1-58113-844-X/04/0005, pp. 338-344.
