Method for organizing structurally similar web pages from a...

Data processing: database and file management or data structures – Database and file access – Search engines

Reexamination Certificate

Details

C707S741000, C707S748000, C707S754000, C707S758000, C707S797000

active

07941420

ABSTRACT:
Techniques are described for organizing structurally similar web pages for a website. Fingerprints of the structure of the web pages are made using shingling: the web page's HTML tags and attributes are placed in sequence and encoded using a standard encoding technique. Fixed-size portions of the encoded sequence are taken, and a set of values is extracted using independent hash functions to compute the shingles. Alternatively, a DOM tree representation of the HTML of the web page is generated, each path of the DOM tree is encoded, and values are extracted using independent hash functions to compute the shingles. A specified number of shingles are retained as the fingerprint. The pages are then clustered based upon the URL and the similarity of the shingles. The clustered hierarchical organization of pages is further pruned by various criteria, including similarity of shingles or support of the cluster node in the hierarchy.
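
The tag-sequence variant described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation; it only shows the general idea under several assumptions that do not come from the record: attribute names (not values) count as structure, seeded SHA-1 stands in for the "independent hash functions", the minimum hash value per function is the retained shingle, and the names `structural_tokens`, `fingerprint`, `similarity`, `window_size`, and `num_hashes` are hypothetical.

```python
import hashlib
from html.parser import HTMLParser


class TagSequenceParser(HTMLParser):
    """Collects start tags and attribute names in document order."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(tag)
        # Assumption: only attribute names are structural; values are ignored.
        self.tokens.extend(name for name, _ in attrs)


def structural_tokens(html):
    """Return the page's tags and attribute names as a flat sequence."""
    parser = TagSequenceParser()
    parser.feed(html)
    parser.close()
    return parser.tokens


def _seeded_hash(seed, window):
    # Assumption: a seeded SHA-1 prefix approximates one of several
    # independent hash functions.
    data = ("%d:%s" % (seed, " ".join(window))).encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def fingerprint(html, window_size=4, num_hashes=8):
    """Keep one shingle per hash function: the minimum over all windows."""
    tokens = structural_tokens(html)
    if len(tokens) < window_size:
        windows = [tuple(tokens)]
    else:
        windows = [
            tuple(tokens[i:i + window_size])
            for i in range(len(tokens) - window_size + 1)
        ]
    return tuple(
        min(_seeded_hash(seed, w) for w in windows)
        for seed in range(num_hashes)
    )


def similarity(fp_a, fp_b):
    """Fraction of fingerprint positions on which two pages agree."""
    matches = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return matches / len(fp_a)
```

In this sketch, two pages that share the same markup skeleton but differ in text produce the same fingerprint, so a clustering step could group them by comparing `similarity` against a threshold. The threshold, the URL-based grouping, and the pruning criteria mentioned in the abstract are not modeled here.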

REFERENCES:
patent: 5999929 (1999-12-01), Goodman
patent: 6119124 (2000-09-01), Broder et al.
patent: 6487555 (2002-11-01), Bharat et al.
patent: 6523026 (2003-02-01), Gillis
patent: 6629097 (2003-09-01), Keith
patent: 6654741 (2003-11-01), Cohen et al.
patent: 6658423 (2003-12-01), Pugh et al.
patent: 6895552 (2005-05-01), Balabanovic et al.
patent: 7098815 (2006-08-01), Samuels et al.
patent: 7363311 (2008-04-01), Fujita et al.
patent: 7440968 (2008-10-01), Oztekin et al.
patent: 7599931 (2009-10-01), Shi et al.
patent: 2002/0159642 (2002-10-01), Whitney
patent: 2003/0140033 (2003-07-01), Lizuka et al.
patent: 2003/0149581 (2003-08-01), Chaudhri et al.
patent: 2003/0187837 (2003-10-01), Culliss
patent: 2004/0260676 (2004-12-01), Douglis et al.
patent: 2005/0004910 (2005-01-01), Trepess
patent: 2005/0010599 (2005-01-01), Kake et al.
patent: 2005/0033733 (2005-02-01), Shadmon et al.
patent: 2006/0041635 (2006-02-01), Alexander et al.
patent: 2006/0064471 (2006-03-01), Hewett et al.
patent: 2006/0123230 (2006-06-01), Hewett et al.
patent: 2006/0195297 (2006-08-01), Kubota et al.
patent: 2007/0050338 (2007-03-01), Strohm et al.
patent: 2007/0094615 (2007-04-01), Endo et al.
patent: 2007/0130318 (2007-06-01), Roast
patent: 2008/0044016 (2008-02-01), Henzinger
patent: 2008/0072140 (2008-03-01), Vydiswaran et al.
patent: 2008/0114800 (2008-05-01), Gazen et al.
patent: 2008/0134220 (2008-06-01), Weiss et al.
patent: 2008/0162541 (2008-07-01), Oresic et al.
patent: 2008/0281816 (2008-11-01), Kim
patent: 2009/0024606 (2009-01-01), Schilit et al.
patent: 2009/0043797 (2009-02-01), Dorie et al.
patent: 2009/0063538 (2009-03-01), Chitrapura et al.
patent: 2009/0070872 (2009-03-01), Cowings et al.
patent: 2009/0157644 (2009-06-01), Gollapudi et al.
patent: 2009/0164411 (2009-06-01), Dasdan et al.
patent: 2009/0171986 (2009-07-01), Chitrapura et al.
patent: 2009/0182821 (2009-07-01), Allen et al.
patent: 2010/0161717 (2010-06-01), Albrecht et al.
patent: 2010/0169329 (2010-07-01), Frieder et al.
patent: 2010/0198864 (2010-08-01), Ravid et al.
patent: 2010/0287466 (2010-11-01), Ravid et al.
