Leveraging cross-document context to label entity

Data processing: database and file management or data structures – Organization of data – Entity-attribute-value

Reexamination Certificate

Details

active

07970808

ABSTRACT:
Entities, such as people, places, and things, are labeled based on information collected across a possibly large number of documents. One or more documents are scanned to recognize the entities, and features are extracted from the context in which those entities occur in the documents. Observed entity-feature pairs are stored either in an in-memory store or in an external store. A store manager optimizes use of the limited space available to the in-memory store by determining which store should hold each entity-feature pair, and when to evict pairs from the in-memory store to make room for new ones. Features that may be observed in an entity's context may take forms such as specific word sequences or membership in a particular list.
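
The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how the store manager chooses what to evict, so the least-recently-used policy here is an assumption, as are all names (`extract_features`, `StoreManager`, `CITY_LIST`), the use of neighboring words as "word sequence" features, and the plain dictionary standing in for the external store.

```python
from collections import OrderedDict, defaultdict

# Hypothetical list used for the "membership in a particular list" feature.
CITY_LIST = {"paris", "seattle"}


def extract_features(tokens, index):
    """Return context features for the entity mention at tokens[index]."""
    features = []
    if index > 0:
        features.append("prev_word=" + tokens[index - 1].lower())
    if index + 1 < len(tokens):
        features.append("next_word=" + tokens[index + 1].lower())
    if tokens[index].lower() in CITY_LIST:
        features.append("in_list=city")
    return features


class StoreManager:
    """Holds at most `capacity` entity-feature pairs in memory; spills the rest."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.in_memory = OrderedDict()    # (entity, feature) -> count, oldest first
        self.external = defaultdict(int)  # stand-in for a disk-backed store

    def observe(self, entity, feature):
        key = (entity, feature)
        if key in self.in_memory:
            self.in_memory[key] += 1
            self.in_memory.move_to_end(key)  # mark as recently used
            return
        # Assumed policy: evict the least recently used pair to make room.
        while len(self.in_memory) >= self.capacity:
            old_key, old_count = self.in_memory.popitem(last=False)
            self.external[old_key] += old_count
        self.in_memory[key] = 1

    def count(self, entity, feature):
        """Total observations of a pair across both stores."""
        key = (entity, feature)
        return self.in_memory.get(key, 0) + self.external.get(key, 0)


manager = StoreManager(capacity=2)
documents = [["Visit", "Paris", "today"], ["Paris", "is", "lovely"]]
for tokens in documents:
    for i, token in enumerate(tokens):
        if token == "Paris":  # stand-in for a real entity recognizer
            for feature in extract_features(tokens, i):
                manager.observe("Paris", feature)
```

Because counts from both stores are summed in `count`, evidence about an entity accumulates across documents even after a pair has been evicted from memory, which is the property a cross-document labeler would rely on.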

