System and method for automatically classifying heterogeneous bu

Image analysis – Histogram processing – For setting a threshold

Patent

Details

382 30, 382 61, G06K 946

active

052934292

ABSTRACT:
Business forms are a special class of documents typically used to collect or distribute data; they account for the vast majority of the paperwork needed to conduct business. The present invention provides a pattern recognition system that classifies digitized images of business forms according to a predefined set of templates. The process involves a training phase, during which images of the template forms are scanned, analyzed, and stored in a data dictionary, and a recognition phase, during which images of actual forms are compared to the templates in the dictionary to determine their class membership. The invention provides the feature extraction and matching methods, as well as the organization of the form dictionary. The performance of the system was evaluated using a collection of computer-generated test forms. The methodology for creating these forms and the results of the evaluation are also described.

Business forms are characterized by the presence of horizontal and vertical lines that delimit the usable space. The present invention identifies these so-called regular lines in bi-level digital images, either to separate text from graphics before applying an optical character recognizer or to serve as a feature extractor in a form recognition system. The approach differs from existing vectorization, line extraction, and text-graphics separation methods in that it focuses exclusively on the recognition of horizontal and vertical lines.
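The two phases described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not disclose the feature set or the matching rule, so the run-length scan, the line-count feature, the `min_len` parameter, and the nearest-template comparison below are all assumptions chosen for illustration. The sketch also assumes lines are one pixel thick, so a thicker line would be counted once per row or column.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # bi-level image: 1 = black pixel, 0 = white
Run = Tuple[int, int, int]       # (fixed coordinate, first index, last index)
Features = Tuple[int, int]       # (horizontal line count, vertical line count)


def horizontal_runs(image: Image, min_len: int) -> List[Run]:
    """Return (row, first_col, last_col) for every black run of at least min_len pixels."""
    runs: List[Run] = []
    for r, row in enumerate(image):
        start = None
        # A trailing 0 acts as a sentinel so a run touching the right edge is closed.
        for c, pixel in enumerate(row + [0]):
            if pixel and start is None:
                start = c
            elif not pixel and start is not None:
                if c - start >= min_len:
                    runs.append((r, start, c - 1))
                start = None
    return runs


def vertical_runs(image: Image, min_len: int) -> List[Run]:
    """Scan the transposed image, so each result is (col, first_row, last_row)."""
    transposed = [list(col) for col in zip(*image)]
    return horizontal_runs(transposed, min_len)


def line_features(image: Image, min_len: int) -> Features:
    """Hypothetical feature vector: counts of long horizontal and vertical runs."""
    return (len(horizontal_runs(image, min_len)), len(vertical_runs(image, min_len)))


def train(templates: Dict[str, Image], min_len: int) -> Dict[str, Features]:
    """Training phase: store the features of each template form in a dictionary."""
    return {name: line_features(img, min_len) for name, img in templates.items()}


def classify(image: Image, dictionary: Dict[str, Features], min_len: int) -> str:
    """Recognition phase: return the template whose features are closest (L1 distance)."""
    feats = line_features(image, min_len)
    return min(
        dictionary,
        key=lambda name: sum(abs(a - b) for a, b in zip(dictionary[name], feats)),
    )
```

Short runs, such as those produced by printed characters, fall below `min_len` and are ignored, which is one simple way to keep text from being mistaken for form lines. A practical system would need richer features (line positions, lengths, intersections) and tolerance for skew and noise.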

REFERENCES:
patent: 4300123 (1981-11-01), McMillin et al.
patent: 4949392 (1990-08-01), Barski et al.
patent: 5038392 (1991-08-01), Morris et al.
patent: 5140650 (1992-08-01), Casey et al.
S. Mori and T. Sakura, Line Filtering and its Application to Stroke Segmentation of Handprinted Chinese Characters, Proceedings of the Seventh International Conference on Pattern Recognition, pp. 366-369, 1984.
T. Pavlidis, A Vectorizer and Feature Extractor for Document Recognition, Computer Vision, Graphics, and Image Processing, No. 35, pp. 111-127, 1986.
H. Bunke, Automatic Interpretation of Text and Graphics in Circuit Diagrams, Pattern Recognition Theory Applications, J. Kittler, K. S. Fu and L. F. Pau Editors, D. Reidel, Boston, pp. 297-310, 1982.
M. Karima, K. S. Sadah, and T. O. McNeil, From Paper Drawings to Computer Aided Design, IEEE Computer Graphics and Applications, pp. 22-39, Feb. 1985.
L. A. Fletcher and R. Kasturi, Segmentation of Binary Image into Text Strings and Graphics, SPIE vol. 768 Applications of Artificial Intelligence, pp. 533-540, 1987.
C. C. Shih and R. Kasturi, Generation of a Line Description File for Graphics Recognition, SPIE vol. 937, Applications of Artificial Intelligence, pp. 568-575, 1988.
W. K. Pratt, Digital Image Processing, Wiley, New York, pp. 523-525, 1978.
R. G. Casey, D. R. Ferguson, Intelligent Forms Processing, IBM Systems Journal, vol. 29, No. 3, 1990, pp. 435-450.
Autoclass Brochure, Visionshape, publicly available Apr. 30, 1991.
IntelliForm Brochure, Executive Technologies, Inc., publicly available Apr. 30, 1991.

Profile ID: LFUS-PAI-O-158957
