Reexamination Certificate
2011-04-19
Vital, Pierre M (Department: 2156)
Data processing: database and file management or data structures
Database design
Database and data structure management
C707S811000
active
07930322
ABSTRACT:
Various technologies and techniques are disclosed for text based schema discovery and information extraction. Documents are analyzed to identify sections of the documents and a relationship between the sections. Statistics are stored regarding occurrences of items in the documents. A probabilistic model is generated based on the stored statistics. A database schema is generated with a plurality of tables based upon the probabilistic model. The documents are analyzed against the probabilistic model to determine how the documents map to the tables generated from the database schema. The tables are populated from the documents based on a result of the analysis against the probabilistic model.
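The pipeline summarized in the abstract can be illustrated with a small sketch. This is not the patented method: it assumes documents made of simple "Label: value" lines, uses plain document frequency as a stand-in for the probabilistic model, and omits the relationships between sections. The names parse_sections, build_model, extract, the 0.5 threshold, and the record/extra table names are all hypothetical.

```python
import re
import sqlite3
from collections import Counter

# Assumption: each section of a document is one "Label: value" line.
SECTION = re.compile(r"^(\w[\w ]*?):\s*(.+)$")

def parse_sections(text):
    """Identify the sections of one document as a {label: value} dict."""
    sections = {}
    for line in text.splitlines():
        m = SECTION.match(line.strip())
        if m:
            label = m.group(1).strip().lower().replace(" ", "_")
            sections[label] = m.group(2).strip()
    return sections

def build_model(parsed_docs):
    """Turn stored occurrence counts into P(label appears in a document)."""
    counts = Counter(label for doc in parsed_docs for label in doc)
    return {label: n / len(parsed_docs) for label, n in counts.items()}

def extract(docs, threshold=0.5):
    """Generate a two-table schema from the model and populate it."""
    parsed = [parse_sections(d) for d in docs]
    model = build_model(parsed)
    # Labels that are likely enough become columns of the main table.
    core = sorted(l for l, p in model.items() if p >= threshold)
    db = sqlite3.connect(":memory:")
    cols = "".join(f', "{c}" TEXT' for c in core)
    db.execute(f"CREATE TABLE record (doc_id INTEGER PRIMARY KEY{cols})")
    # Rare labels go to a generic side table instead of sparse columns.
    db.execute("CREATE TABLE extra (doc_id INTEGER, label TEXT, value TEXT)")
    for doc_id, sections in enumerate(parsed):
        row = [doc_id] + [sections.get(c) for c in core]
        placeholders = ", ".join("?" * len(row))
        db.execute(f"INSERT INTO record VALUES ({placeholders})", row)
        for label, value in sections.items():
            if label not in core:
                db.execute("INSERT INTO extra VALUES (?, ?, ?)",
                           (doc_id, label, value))
    return model, core, db
```

Calling extract on a list of document strings returns the frequency model, the labels chosen as columns, and an in-memory SQLite connection holding the populated tables. A label missing from a given document is stored as NULL in the record table.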
REFERENCES:
patent: 5926811 (1999-07-01), Miller et al.
patent: 6651055 (2003-11-01), Kilmer et al.
patent: 6738767 (2004-05-01), Chung et al.
patent: 6990632 (2006-01-01), Rothchiller et al.
patent: 7072896 (2006-07-01), Lee et al.
patent: 7251777 (2007-07-01), Valtchev et al.
patent: 7428522 (2008-09-01), Raghunathan
patent: 2001/0047271 (2001-11-01), Culbert et al.
patent: 2002/0169788 (2002-11-01), Lee et al.
patent: 2003/0033333 (2003-02-01), Nishino et al.
patent: 2003/0088562 (2003-05-01), Dillon et al.
patent: 2004/0153459 (2004-08-01), Whitten et al.
patent: 2004/0260677 (2004-12-01), Malpani et al.
patent: 2005/0154690 (2005-07-01), Nitta et al.
patent: 2005/0177431 (2005-08-01), Willis et al.
patent: 2006/0117057 (2006-06-01), Legault et al.
patent: 2006/0155751 (2006-07-01), Geshwind et al.
patent: 2006/0218115 (2006-09-01), Goodman et al.
patent: 2006/0242180 (2006-10-01), Graf et al.
patent: 2007/0011183 (2007-01-01), Langseth et al.
patent: 2007/0022093 (2007-01-01), Wyatt et al.
patent: 2007/0143320 (2007-06-01), Gaurav et al.
Hahn, et al., “Automatically Generating OLAP Schemata from Conceptual Graphical Models”, DOLAP '00, McLean, VA, Nov. 2000.
Cafarella, et al., “Navigating Extracted Data with Schema Discovery”, Jun. 15, 2007, pp. 1-6.
Gubanov, et al., “Structural Text Search and Comparison Using Automatically Extracted Schema”, Jun. 30, 2006, pp. 1-6.
Hegewald, et al., “XStruct: Efficient Schema Extraction from Multiple and Large XML Documents”, IEEE, 2006, pp. 1-10.
Getoor, “Structure Discovery using Statistical Relational Learning”, IEEE, 2003, pp. 1-8.
Microsoft Corporation
Obisesan Augustine
Vital Pierre M