System for extracting attached text

Image analysis – Pattern recognition – Feature extraction

Patent

Details

382/198, G06K 9/48

active

061577383

ABSTRACT:
A method for identifying and extracting text data from a table-cell frame. The method includes the steps of tracing connected components of a document image, tracing white contours within a connected component, defining a frame outline based on the white contours, identifying unattached character data inside the frame outline, and defining an initial rectangular area inside the frame outline. The method further includes detecting black pixels in a horizontal or vertical direction from the initial rectangular area in order to create an extended character area, locating boundary pixels lying inside the extended character area for each white contour, identifying black pixels positioned between boundary pixels lying inside the extended character area, combining black pixels positioned between boundary pixels lying inside the extended character area so as to form at least one connected component, recognizing the at least one connected component as a text component if it is not recognized as a vertical line, as a horizontal line, as part of a broken line, or as part of the frame, and defining a character node of a hierarchical tree structure corresponding to the extended character area and containing both the at least one connected component and any identified unattached connected components.
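
The first and last steps of the abstract (tracing connected components, then treating a component as text when it is not recognized as a line) can be illustrated with a minimal sketch. This is not the patented method: the 8-connectivity choice, the function names `connected_components`, `bounding_box`, and `classify`, and the thresholds `line_ratio` and `max_thickness` are all assumptions made for illustration. The white-contour tracing, frame detection, and broken-line tests described in the abstract are omitted.

```python
from collections import deque


def connected_components(image):
    """Group black pixels (value 1) into 8-connected components.

    `image` is a list of rows of 0/1 values. Returns a list of components,
    each a list of (row, col) pixels, in raster-scan order of discovery.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited black pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def classify(pixels, line_ratio=5, max_thickness=2):
    """Label a component by the shape of its bounding box.

    Hypothetical heuristic: a thin, elongated box counts as a line;
    everything else is treated as text.
    """
    top, left, bottom, right = bounding_box(pixels)
    height = bottom - top + 1
    width = right - left + 1
    if height <= max_thickness and width >= line_ratio * height:
        return "horizontal_line"
    if width <= max_thickness and height >= line_ratio * width:
        return "vertical_line"
    return "text"
```

Calling `classify` on each component returned by `connected_components` yields one label per component. A fuller implementation would also need the frame and broken-line checks that the abstract names before accepting a component as text.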

REFERENCES:
patent: 4377803 (1983-03-01), Lotspiech et al.
patent: 4926490 (1990-05-01), Mano
patent: 5588072 (1996-12-01), Wang
patent: 5848186 (1998-12-01), Wang et al.
R. G. Casey, et al., "Intelligent Forms Processing", IBM Systems Journal, vol. 29, No. 3, 1990, pp. 435-450.
O. Hori, et al., "Table-Form Structure Analysis Based on Box-Driven Reasoning", IEICE Trans. Inf. & Syst., vol. E79-D, No. 5, May 5, 1995, pp. 542-547.
O. Iwaki, et al., "A Segmentation Method Based on Office Document Hierarchical Structure", Proceedings of the 1987 Institute of Electrical and Electronics Engineers International Conference on Systems, Man, and Cybernetics, vol. 2, Oct. 20-23, 1987, pp. 759-763.
