Determining semantically distinct regions of a document

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate

Details

C715S234000

active

07913163

ABSTRACT:
A structured document is translated into an initial hierarchical data structure in accordance with syntactic elements defined in the structured document. The initial hierarchical data structure includes a plurality of nodes, and each node corresponds to one of the syntactic elements. The method then annotates a node with a set of attributes including geometric parameters of semantic elements in the structured document that are associated with the node in accordance with a pseudo-rendering of the structured document. Finally, the method merges the nodes in the initial hierarchical data structure into a tree of merged nodes in accordance with their respective attributes and a set of predefined rules such that each merged node is associated with a semantically distinct region of the pseudo-rendered document. The predefined rules include rules for merging nodes associated with semantic elements that have nearby positions and/or compatible attributes in the pseudo-rendered document.
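
The three stages named in the abstract (translate, annotate, merge) can be illustrated with a small sketch. This is not the patented method, only a toy under stated assumptions: the structured document is well-formed HTML with paired tags, the "pseudo-rendering" is a one-dimensional vertical layout with a fixed line height, and the only merge rule is "adjacent top-level nodes with a small vertical gap belong to the same region". The names `Node`, `TreeBuilder`, `annotate`, `merge`, `GAP_BEFORE`, `LINE`, and `max_gap` are all hypothetical, and the result is a flat list of regions rather than the tree of merged nodes the abstract describes.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

LINE = 10                            # assumed height of one text line
GAP_BEFORE = {"h1": 20, "h2": 20}    # assumed blank space above headings


@dataclass
class Node:
    tag: str
    text: str = ""
    children: list = field(default_factory=list)
    box: tuple = (0, 0)              # (top, bottom) after pseudo-rendering


class TreeBuilder(HTMLParser):
    """Stage 1: one node per syntactic element (assumes paired tags)."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def annotate(node, y=0):
    """Stage 2: attach a geometric attribute from a toy vertical layout."""
    y += GAP_BEFORE.get(node.tag, 0)
    top = y
    if node.text:
        y += LINE
    for child in node.children:
        y = annotate(child, y)
    node.box = (top, y)
    return y


def merge(nodes, max_gap=5):
    """Stage 3: merge consecutive nodes whose vertical gap is small."""
    regions = []
    for n in nodes:
        if regions and n.box[0] - regions[-1]["box"][1] <= max_gap:
            regions[-1]["tags"].append(n.tag)
            regions[-1]["box"] = (regions[-1]["box"][0], n.box[1])
        else:
            regions.append({"tags": [n.tag], "box": n.box})
    return regions


def find_regions(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    annotate(builder.root)
    return merge(builder.root.children)


if __name__ == "__main__":
    page = ("<h1>News</h1><p>Story one</p><p>Story two</p>"
            "<h2>Links</h2><ul><li>A</li><li>B</li></ul>")
    for region in find_regions(page):
        print(region)
```

On the sample page, the heading and the two paragraphs below it fall into one region, while the second heading and its list form another, because the assumed gap above `h2` exceeds `max_gap`. A fuller treatment would need a two-dimensional layout, additional attributes such as font or background for the "compatible attributes" rules, and recursive merging to produce a tree.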

REFERENCES:
patent: 6233571 (2001-05-01), Egger et al.
patent: 6356899 (2002-03-01), Chakrabarti et al.
patent: 6486898 (2002-11-01), Martino et al.
patent: 6901575 (2005-05-01), Wu et al.
patent: 6948119 (2005-09-01), Farmer et al.
Cohen et al., “A Flexible Learning System for Wrapping Tables and Lists in HTML Documents”, WWW2002, May 2002, published by ACM, pp. 232-241.
Lee et al., “Parameter-Free Geometric Document Layout Analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, Nov. 2001, pp. 1240-1256.
