Text based schema discovery and information extraction

Data processing: database and file management or data structures – Database design – Database and data structure management

Reexamination Certificate


Details

C707S811000

active

07930322

ABSTRACT:
Various technologies and techniques are disclosed for text based schema discovery and information extraction. Documents are analyzed to identify sections of the documents and a relationship between the sections. Statistics are stored regarding occurrences of items in the documents. A probabilistic model is generated based on the stored statistics. A database schema is generated with a plurality of tables based upon the probabilistic model. The documents are analyzed against the probabilistic model to determine how the documents map to the tables generated from the database schema. The tables are populated from the documents based on a result of the analysis against the probabilistic model.
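The steps named in the abstract (section analysis, occurrence statistics, a probabilistic model, schema generation, table population) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the patent: the toy document layout (`# Section` headers followed by `field: value` lines), the relative-frequency estimate used as the "probabilistic model", the `threshold` parameter, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def parse_sections(doc):
    """Split a document into {section: {field: value}}.

    Assumed toy layout (not from the patent): a line starting with '#'
    opens a section, and 'field: value' lines belong to the open section.
    """
    sections = {}
    current = None
    for line in doc.splitlines():
        line = line.strip()
        if line.startswith("#"):
            current = line.lstrip("#").strip().lower()
            sections[current] = {}
        elif ":" in line and current is not None:
            field, value = line.split(":", 1)
            sections[current][field.strip().lower()] = value.strip()
    return sections


def collect_statistics(docs):
    """Count how often each section, and each field within it, occurs."""
    section_counts = Counter()
    field_counts = defaultdict(Counter)
    for doc in docs:
        for section, fields in parse_sections(doc).items():
            section_counts[section] += 1
            for field in fields:
                field_counts[section][field] += 1
    return section_counts, field_counts


def build_model(section_counts, field_counts):
    """Estimate P(field | section) as a relative frequency (an assumed model)."""
    return {
        section: {f: c / section_counts[section] for f, c in fields.items()}
        for section, fields in field_counts.items()
    }


def generate_schema(model, threshold=0.5):
    """One table per section; keep fields whose probability meets the threshold."""
    return {
        section: sorted(f for f, p in probs.items() if p >= threshold)
        for section, probs in model.items()
    }


def populate(docs, schema):
    """Map each document's sections onto the generated tables, one row per section."""
    tables = {table: [] for table in schema}
    for doc_id, doc in enumerate(docs):
        for section, fields in parse_sections(doc).items():
            if section in schema:
                row = {"doc_id": doc_id}
                for column in schema[section]:
                    row[column] = fields.get(column)  # None when the field is absent
                tables[section].append(row)
    return tables


# Hypothetical sample documents, invented for this sketch.
DOCS = [
    "# Contact\nname: Ada\nemail: ada@example.com\n# Job\ntitle: Engineer",
    "# Contact\nname: Bob\nfax: 555-0100\n# Job\ntitle: Analyst\nsalary: 10",
]

section_counts, field_counts = collect_statistics(DOCS)
model = build_model(section_counts, field_counts)
schema = generate_schema(model, threshold=0.6)
tables = populate(DOCS, schema)
```

With these two sample documents and a threshold of 0.6, only fields that appear in both documents (`name` under `contact`, `title` under `job`) become columns; lowering the threshold admits rarer fields at the cost of sparser tables.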

REFERENCES:
patent: 5926811 (1999-07-01), Miller et al.
patent: 6651055 (2003-11-01), Kilmer et al.
patent: 6738767 (2004-05-01), Chung et al.
patent: 6990632 (2006-01-01), Rothchiller et al.
patent: 7072896 (2006-07-01), Lee et al.
patent: 7251777 (2007-07-01), Valtchev et al.
patent: 7428522 (2008-09-01), Raghunathan
patent: 2001/0047271 (2001-11-01), Culbert et al.
patent: 2002/0169788 (2002-11-01), Lee et al.
patent: 2003/0033333 (2003-02-01), Nishino et al.
patent: 2003/0088562 (2003-05-01), Dillon et al.
patent: 2004/0153459 (2004-08-01), Whitten et al.
patent: 2004/0260677 (2004-12-01), Malpani et al.
patent: 2005/0154690 (2005-07-01), Nitta et al.
patent: 2005/0177431 (2005-08-01), Willis et al.
patent: 2006/0117057 (2006-06-01), Legault et al.
patent: 2006/0155751 (2006-07-01), Geshwind et al.
patent: 2006/0218115 (2006-09-01), Goodman et al.
patent: 2006/0242180 (2006-10-01), Graf et al.
patent: 2007/0011183 (2007-01-01), Langseth et al.
patent: 2007/0022093 (2007-01-01), Wyatt et al.
patent: 2007/0143320 (2007-06-01), Gaurav et al.
Hahn, et al., “Automatically generating OLAP schemata from conceptual graphical models”, DOLAP '00, McLean, VA, Nov. 2000.
Cafarella, et al., “Navigating Extracted Data with Schema Discovery”, Jun. 15, 2007, pp. 1-6.
Gubanov, et al., “Structural Text Search and Comparison Using Automatically Extracted Schema”, Jun. 30, 2006, pp. 1-6.
Hegewald, et al., “XStruct: Efficient Schema Extraction from Multiple and Large XML Documents”, IEEE, 2006, pp. 1-10.
Lise Getoor, “Structure Discovery using Statistical Relational Learning”, IEEE, 2003, pp. 1-8.
