Method and apparatus for extracting structured data from...

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate

Details

C715S252000

active

07073122

ABSTRACT:
A method and apparatus for extracting structured data from HTML pages whereby an HTML file belonging to a pre-determined class of HTML files can be transformed into an instance tree (142). Other than the HTML file, there are two other inputs to the extraction procedure: a set of constraints (134), and a structure template (140). The steps in the process include: parsing the HTML file, thereby creating a parse tree (126); annotating the parse tree, thereby creating an annotated parse tree (130); creating an array of nodes from the annotated parse tree using a set of constraints (134); and generating an instance tree (142) from the array of nodes using the structure template (140). The instance tree (142) encodes, in a form that may be used by other computer programs, all the relevant information in the HTML file as prescribed by the set of constraints (134) and makes explicit the structure of this information.

REFERENCES:
patent: 5079700 (1992-01-01), Kozoll et al.
patent: 5113341 (1992-05-01), Kozol et al.
patent: 5140521 (1992-08-01), Kozol et al.
patent: 5276793 (1994-01-01), Borgendale et al.
patent: 5343554 (1994-08-01), Koza et al.
patent: 5379373 (1995-01-01), Hayashi et al.
patent: 5530852 (1996-06-01), Meske, Jr. et al.
patent: 5553216 (1996-09-01), Yoshioka et al.
patent: 5557720 (1996-09-01), Brown, Jr. et al.
patent: 5557722 (1996-09-01), DeRose et al.
patent: 5644776 (1997-07-01), DeRose et al.
patent: 5649186 (1997-07-01), Ferguson
patent: 5671416 (1997-09-01), Elson
patent: 5680619 (1997-10-01), Gudmundson et al.
patent: 5708806 (1998-01-01), DeRose et al.
patent: 5758361 (1998-05-01), van Hoff
patent: 5784608 (1998-07-01), Meske, Jr. et al.
patent: 5794006 (1998-08-01), Sanderman
patent: 5826256 (1998-10-01), Devanbu
patent: 5907704 (1999-05-01), Gudmundson et al.
patent: 5907837 (1999-05-01), Ferrel et al.
patent: 5920879 (1999-07-01), Kyojima et al.
patent: 5923738 (1999-07-01), Cardillo, IV et al.
patent: 5926823 (1999-07-01), Okumura et al.
patent: 5930341 (1999-07-01), Cardillo, IV et al.
patent: 5937041 (1999-08-01), Cardillo, IV et al.
patent: 5953322 (1999-09-01), Kimball
patent: 5953732 (1999-09-01), Meske, Jr. et al.
patent: 5970490 (1999-10-01), Morgenstern
patent: 5978579 (1999-11-01), Buxton et al.
patent: 5983248 (1999-11-01), DeRose et al.
patent: 6041331 (2000-03-01), Weiner et al.
patent: 6065024 (2000-05-01), Renshaw
patent: 6081815 (2000-06-01), Spitznagel et al.
patent: 6083276 (2000-07-01), Davidson et al.
patent: 6093215 (2000-07-01), Buxton et al.
patent: 6128655 (2000-10-01), Fields et al.
patent: 6421656 (2002-07-01), Cheng et al.
patent: 6424980 (2002-07-01), Iizuka et al.
patent: 6651108 (2003-11-01), Popp et al.
patent: 6748374 (2004-06-01), Madan et al.
patent: 6763343 (2004-07-01), Brooke et al.
patent: 6782505 (2004-08-01), Miranker et al.
patent: 2001/0018698 (2001-08-01), Uchino et al.
patent: 2001/0054172 (2001-12-01), Tuatini
patent: 2002/0073074 (2002-06-01), Sweet et al.
patent: 2005/0027512 (2005-02-01), Waise
patent: 0 539 120 (1993-04-01), None
patent: 0 718 783 (1996-06-01), None
Miller et al., A Novel Use of Statistical Parsing to Extract Information from Text, ACM Apr. 2000, pp. 226-233.
Magerman, Statistical Decision-Tree Models for Parsing, ACM Jun. 1995, pp. 276-283.
Joshi et al., Phrase Structure Trees Bear More Fruit than You Would Have Thought, American Journal of Computational Linguistics, Mar. 1982, pp. 1-11.
Purtilo et al., Parse-Tree Annotations, ACM Dec. 1989, pp. 1467-1477.
