Pseudo-anchor text extraction for vertical search

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

Reexamination Certificate

active

07657507

ABSTRACT:
A search method uses pseudo-anchor text associated with search objects to improve search performance. The pseudo-anchor text may be extracted, together with an identifier of each search object (such as a pseudo-URL), from a digital corpus such as a collection of documents. Pseudo-anchor texts for each object are preferably extracted from candidate anchor blocks using a machine-learning-based approach. The pseudo-anchor texts are made available for searching and are used to help rank the objects in a search result. The method may be used in vertical search of objects such as published articles, products, and images that lack explicit URL and anchor text information.
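
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the learned extractor is replaced by a simple stand-in that treats the whole citing sentence as the candidate anchor block, and ranking is reduced to counting matching query terms. The function names, the pseudo-URL format, and the sample document and titles are all hypothetical.

```python
import re
from collections import defaultdict

def pseudo_url(title):
    """Hypothetical identifier: lowercased title, non-alphanumerics collapsed to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def extract_pseudo_anchors(doc_text, references):
    """Map each cited object's pseudo-URL to the sentences that cite it.

    references maps a citation marker such as "[1]" to the cited title.
    Assumption: the whole citing sentence stands in for the candidate anchor
    block; a learned extractor would select the relevant words instead.
    """
    anchors = defaultdict(list)
    for sentence in re.split(r"(?<=[.!?])\s+", doc_text):
        for marker, title in references.items():
            if marker in sentence:
                text = sentence.replace(marker, " ")
                anchors[pseudo_url(title)].append(" ".join(text.split()))
    return anchors

def score(query, anchor_texts):
    """Count the query terms that occur in an object's pseudo-anchor texts."""
    terms = set(query.lower().split())
    words = set(re.findall(r"[a-z0-9]+", " ".join(anchor_texts).lower()))
    return len(terms & words)

# Made-up citing document and reference list, for illustration only.
doc = "Link analysis improves ranking [1]. Citation indexing was automated in [2]."
refs = {"[1]": "A Study of Link Analysis", "[2]": "Automatic Citation Indexing"}

anchors = extract_pseudo_anchors(doc, refs)
ranked = sorted(anchors, key=lambda u: score("link ranking", anchors[u]), reverse=True)
print(ranked[0])
```

In this sketch the cited article is found through words that appear only in the text citing it, which is the role anchor text plays for ordinary web pages.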

REFERENCES:
patent: 5920859 (1999-07-01), Li
patent: 6636848 (2003-10-01), Aridor et al.
patent: 6925495 (2005-08-01), Hegde et al.
patent: 2002/0169770 (2002-11-01), Kim et al.
patent: 2005/0149851 (2005-07-01), Mittal
patent: 2005/0165781 (2005-07-01), Kraft et al.
patent: 2006/0026496 (2006-02-01), Joshi et al.
patent: 2006/0074871 (2006-04-01), Meyerzon et al.
patent: 2006/0074903 (2006-04-01), Meyerzon et al.
patent: 2006/0136098 (2006-06-01), Chitrapura et al.
patent: 2006/0143254 (2006-06-01), Chen et al.
patent: PCT/EP2005/050321 (2005-01-01), None
Yin et al., Towards Understanding the Functions of Web Element, 2004, AIRS, pp. 313-324.
Lu et al., Anchor Text Mining for Translation of Web Queries: A Transitive Translation Approach, 2004, ACM, pp. 242-269.
Chang et al., A Chinese-to-Chinese Statistical Machine Translation Model for Mining Synonymous Simplified-Traditional Chinese Terms, 1994, National Chi-Nan University, pp. 242-247.
Amitay, “Using Common Hypertext Links to Identify the Best Phrasal Description of Target Web Documents”, available at least as early as Jan. 24, 2007, at <<http://einat.webir.org/sigir—98.pdf>>, pp. 1-5.
Attardi, et al., “Theseus: Categorization by Context,” Proceedings of the 8th International World Wide Web Conference, 1999, pp. 1-2.
Bikel, et al., “Nymble: A High-Performance Learning Name-Finder,” Proceedings of ANLP, 1997, pp. 194-201.
Broder, et al., “Syntactic Clustering of the Web”, retrieved at <<http://www.research.digital.com/SRC>>, SRC Technical Note, Jul. 25, 1997, Digital Equipment Corporation, 1997, pp. 1-13.
Califf, et al., “Relational Learning of Pattern-Match Rules for Information Extraction,” CoNLL97: Computational Natural Language Learning, ACL, 1997, pp. 9-15.
Chakrabarti, et al., “Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text,” Proceedings of the 7th International World Wide Web Conference, 1998, p. 13.
“CiteSeer.IST Scientific Literature Digital Library”, available as early as Feb. 26, 2007, retrieved on Apr. 4, 2007, at <<http://citeseer.ist.psu.edu>>, 1 pg.
Collins, et al., “Unsupervised Models for Named Entity Classification,” Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing, 1999, pp. 100-110.
Davison, “Topical Locality in the Web”, ACM, Proceedings of SIGIR, 2000, pp. 272-279.
Freitag, “Information Extraction from HTML: Application of a General Machine Learning Approach,” Proceedings of the 15th Conference on Artificial Intelligence, 1998, 7 pgs.
Giles, et al., “CiteSeer: An Automatic Citation Indexing System”, ACM, Proceedings of the 3rd ACM Conference on Digital Libraries (DL'98), 1998, pp. 89-98.
Haveliwala, et al., “Evaluating Strategies for Similarity Search on the Web”, ACM, WWW2002, May 7-11, 2002, pp. 1-10.
Lawrence, et al., “Digital Libraries and Autonomous Citation Indexing”, IEEE, 1999, pp. 67-71, vol. 32, No. 6.
Lu, et al., “A Transitive Model for Extracting Translation Equivalents of Web Queries through Anchor Text Mining”, available at least as early as Jan. 24, 2007, at <<http://delivery.acm.org/10.1145/1080000/1072236/p8-lu.pdf?key1=1072236&key2=0538359611&coll=GUIDE&dl=GUIDE&CFID=12180585&CFTOKEN=46372023>>, pp. 1-7.
McBryan, “GENVL and WWWW: Tools for Taming the Web”, First International Conference on the World Wide Web, CERN, May 1994, pp. 1-12, Geneva, Switzerland.
Muslea, “Extraction Patterns for Information Extraction Tasks: A Survey”, American Association for Artificial Intelligence, 1999, 6 pgs.
Nie, et al., “Extracting Objects from the Web,” ICDE, 2006, pp. 1-3.
Nie, et al., “Object-Level Ranking: Bringing Order to Web Objects,” ACM, WWW2005, May 10-14, 2005, pp. 567-574.
Shi, et al., “Pseudo-Anchor Text Extraction for Vertical Search”, Microsoft Technique Report, MSR-TR-2006-122, Aug. 2006, 6 pgs.
Yu, et al., “Improving Pseudo-Relevance Feedback in Web Information Retrieval Using Web Page Segmentation,” ACM, WWW2003, May 20-24, 2003, 8 pgs, Budapest, Hungary.
Zhu, et al., “Simultaneous Record Detection and Attribute Labeling in Web Data Extraction,” ACM, KDD'06, Aug. 20-23, 2006, 10 pgs.
