Automatic extraction of metadata using a neural network

Data processing: database and file management or data structures – Database design – Data structure types

Patent

Details

707/102, 706/20, 706/934, G06F 17/30

active

060443758

ABSTRACT:
A method of automatically extracting metadata from a document. The method of the invention provides a computer readable document that includes blocks composed of words, an authority list that includes common uses of a set of words, and a neural network trained to extract metadata from groupings of data called compounds. Compounds are created with one compound describing each of the blocks. Each compound includes the words making up the block, descriptive information about the block, and authority information associated with some of the words. The descriptive information may include such items as bounding box information, describing the size and position of the block, and font information, describing the size and type of font the words of the block use. The authority information is located by comparing each of the words from the block to the authority list. The compounds are processed through the neural network to generate metadata guesses, including word guesses, compound guesses, and document guesses, along with confidence factors indicating the likelihood that each guess is correct. The method may additionally include providing a document knowledge base of positioning information and size information for metadata in known documents. If the document knowledge base is provided, then the method includes deriving analysis data from the metadata guesses and comparing the analysis data to the document knowledge base to determine the metadata output.
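
The compound-building step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the names `Block`, `Compound`, `build_compounds`, and `AUTHORITY_LIST`, the field layout, and the sample data are all hypothetical, and the neural network and document knowledge base are left out entirely. The sketch only shows one compound being created per block, carrying the block's words, its bounding box and font information, and any matches found in an authority list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Block:
    """A block of words plus layout details (field names are assumptions)."""
    words: List[str]
    bbox: Tuple[float, float, float, float]  # x, y, width, height
    font_name: str
    font_size: float


@dataclass
class Compound:
    """One compound per block: words, descriptive info, authority info."""
    words: List[str]
    bbox: Tuple[float, float, float, float]
    font_name: str
    font_size: float
    authority: Dict[str, List[str]] = field(default_factory=dict)


def build_compounds(blocks, authority_list):
    """Create one compound per block, attaching authority-list matches."""
    compounds = []
    for block in blocks:
        matches = {}
        for word in block.words:
            # Compare each word of the block to the authority list.
            uses = authority_list.get(word.lower())
            if uses:
                matches[word] = uses
        compounds.append(
            Compound(block.words, block.bbox, block.font_name,
                     block.font_size, matches)
        )
    return compounds


# Hypothetical authority list: lowercase word -> common uses of that word.
AUTHORITY_LIST = {"smith": ["surname"], "january": ["month"]}

# Hypothetical blocks, as a layout analyzer might produce them.
blocks = [
    Block(["Annual", "Report"], (72.0, 60.0, 300.0, 40.0), "Helvetica", 24.0),
    Block(["John", "Smith"], (72.0, 120.0, 150.0, 16.0), "Times", 12.0),
]
compounds = build_compounds(blocks, AUTHORITY_LIST)
```

In a full system of the kind the abstract describes, each compound would then be encoded as input to the trained neural network, which would return the word, compound, and document guesses with their confidence factors.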

REFERENCES:
patent: 4758980 (1988-07-01), Tsunekawa et al.
patent: 4912653 (1990-03-01), Wood
patent: 5204812 (1993-04-01), Kasiraj et al.
patent: 5235654 (1993-08-01), Anderson et al.
patent: 5265242 (1993-11-01), Fujisawa et al.
patent: 5390259 (1995-02-01), Withgott et al.
patent: 5414781 (1995-05-01), Spitz et al.
patent: 5416849 (1995-05-01), Huang
patent: 5418946 (1995-05-01), Mori
patent: 5463773 (1995-10-01), Sakakibara et al.
patent: 5475768 (1995-12-01), Diep et al.
patent: 5493677 (1996-02-01), Balogh et al.
patent: 5521991 (1996-05-01), Billings
patent: 5568640 (1996-10-01), Nishiyama et al.
patent: 5574802 (1996-11-01), Ozaki
patent: 5621818 (1997-04-01), Tashiro
patent: 5628003 (1997-05-01), Fujisawa et al.
patent: 5642288 (1997-06-01), Leung et al.
patent: 5642435 (1997-06-01), Loris
patent: 5675710 (1997-10-01), Lewis
patent: 5924090 (1999-07-01), Krellenstein
patent: 5937084 (1999-08-01), Crabtree et al.
patent: 5970482 (1999-10-01), Pham et al.
C.W. Dawson et al., "Automatic Classification of Office Documents: Review of Available Methods and Techniques", Records Management Quarterly, Oct. 1995, pp. 3-18.
D. Savic, Automatic Classification of Office Documents: Review of Available Methods and Techniques, Records Management Quarterly, Oct. 1995, pp. 3-18.
S. Weibel et al., Automated Title Page Cataloging: A Feasibility Study, Information Processing and Management, vol. 25, No. 2, 1989, pp. 187-203.
