Automatic training of character templates using a text line imag

Image analysis – Learning systems – Trainable classifiers or pattern recognizers

Patent

Details

382228, G06K 962

Patent

active

055948090

ABSTRACT:
A technique for automatically producing, or training, a set of bitmapped character templates defined according to the sidebearing model of character image positioning uses as input a text line image of unsegmented characters, called glyphs, as the source of training samples. The training process also uses a transcription associated with the text line image, and an explicit, grammar-based text line image source model that describes the structural and functional features of a set of possible text line images that may be used as the source of training samples. The transcription may be a literal transcription of the line image, or it may be nonliteral, for example containing logical structure tags for document formatting and layout, such as found in markup languages. Spatial positioning information modeled by the text line image source model and the labels in the transcription are used to determine labeled image positions identifying the location of glyph samples occurring in the input line image, and the character templates are produced using the labeled image positions. In another aspect of the technique, a set of character templates defined by any character template model, such as a segmentation-based model, is produced using the grammar-based text line image source model and specifically using a tag transcription containing logical structure tags for document formatting and layout. Both aspects of the training technique may represent the text line image source model and the transcription as finite state networks.
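
As a rough illustration of the last step described in the abstract (producing templates from labeled image positions), the following Python sketch averages the pixels found at each labeled glyph origin and keeps those that are black in most samples. It is a minimal sketch under stated assumptions, not the patented procedure: the function name `train_templates`, the fixed canvas `width`, the majority-vote `threshold`, and the choice to start each canvas at the glyph origin are all hypothetical simplifications. The labeled positions are assumed to have been obtained already, for example by aligning the transcription with the line image.

```python
from collections import defaultdict


def train_templates(line_image, labeled_positions, width, threshold=0.5):
    """Build one binary template per character label (illustrative only).

    line_image: list of rows, each a list of 0/1 pixels (1 = black).
    labeled_positions: list of (label, x_origin) pairs locating glyph samples.
    width: assumed fixed canvas width in pixels, measured from the origin.
    threshold: a pixel is set in the template when it is black in strictly
        more than this fraction of the samples for that label.
    """
    height = len(line_image)
    line_width = len(line_image[0])
    sums = {}                   # label -> per-pixel count of black samples
    counts = defaultdict(int)   # label -> number of samples used

    for label, x0 in labeled_positions:
        if x0 < 0 or x0 + width > line_width:
            continue  # skip samples whose canvas falls outside the image
        acc = sums.setdefault(label, [[0] * width for _ in range(height)])
        for y in range(height):
            for dx in range(width):
                acc[y][dx] += line_image[y][x0 + dx]
        counts[label] += 1

    templates = {}
    for label, acc in sums.items():
        n = counts[label]
        templates[label] = [
            [1 if value / n > threshold else 0 for value in row]
            for row in acc
        ]
    return templates


# Two samples of the same character; the second carries one noise pixel.
line = [
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0],
]
templates = train_templates(line, [("a", 0), ("a", 4)], width=3)
```

Because no segmentation is performed, canvases taken at neighboring origins may overlap in a real line image. A per-pixel vote of this kind is only one simple way to suppress pixels that belong to adjacent glyphs.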

REFERENCES:
patent: 4599692 (1986-07-01), Tan et al.
patent: 5020112 (1991-05-01), Chou
patent: 5237627 (1993-08-01), Johnson et al.
patent: 5303313 (1994-04-01), Mark et al.
patent: 5321773 (1994-06-01), Kopec et al.
patent: 5333275 (1994-07-01), Wheatley et al.
patent: 5526444 (1996-06-01), Kopec et al.
G. Kopec, "Least-Squares Font Metric Estimation from Images", in IEEE Transactions on Image Processing, Oct., 1993, pp. 510-519.
G. Kopec and P. Chou, "Document Image Decoding Using Markov Source Models," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, No. 6, Jun. 1994, pp. 602-617.
P. A. Chou, "Recognition of Equations Using a Two-Dimensional Stochastic Context-Free Grammar," in SPIE, vol. 1199, Visual Communications and Image Processing IV, 1989, pp. 852-863.
Huang, Ariki and Jack, Hidden Markov Models for Speech Recognition, Edinburgh University Press, 1990, chapters 2, 5 and 6, pp. 10-51; 136-166; and 167-185.
L. Rabiner and B. Juang, "An Introduction to Hidden Markov Models", in IEEE ASSP Magazine, Jan. 1986, at pp. 4-16.
L. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", in Proceedings of the IEEE, vol. 77, No. 2, Feb., 1989, at pp. 257-285.
H. S. Baird, "A Self-Correcting 100-Font Classifier," in SPIE vol. 2181 Document Recognition, 1994, pp. 106-115.
S. Kuo and O. E. Agazzi, "Keyword spotting in poorly printed documents using pseudo 2D hidden Markov models," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, No. 8, Aug., 1994, pp. 842-848.
C. Bose and S. Kuo, "Connected and degraded text recognition using a hidden Markov model," in Proceedings of the International Conference on Pattern Recognition, Netherlands, Sep. 1992, pp. 116-119.
E. Levin and R. Pieraccini, "Dynamic planar warping for optical character recognition," in Proceedings of the 1992 International Conference on Acoustics, Speech and Signal Processing (ICASSP), San Francisco, California, Mar. 23-26, 1992, pp. III-149-III-152.
C. Yen and S. Kuo, "Degraded document recognition using pseudo 2D hidden Markov models in gray-scale images". Copy received from authors without obligation of confidentiality, upon general request by applicants for information about ongoing or new work in this field. Applicants have no knowledge as to whether subject matter in this paper has been published, Aug. 3, 1994, pp. 1-19.
R. Rubenstein, Digital Typography: An Introduction to Type and Composition for Computer System Design, Addison-Wesley, 1988, pp. 115-121.
Adobe Systems, Inc., PostScript Language Reference Manual, Addison-Wesley, 1985, pp. 95-96.
J. Coombs, A. Renear and S. DeRose, "Markup Systems and the Future of Scholarly Text Processing", Comm. of the ACM, vol. 30, No. 11, Nov., 1987, pp. 933-947.
P. A. Chou and G. E. Kopec, "A Stochastic Attribute Grammar Model of Document Production and Its Use in Document Image Decoding", conference paper presented at IS&T/SPIE 1994 Intl. Symposium on Electronic Imaging, San Jose, CA, Feb. 5-10, 1995, 8 pages.
D. E. Knuth, TEX and METAFONT: New Directions in Typesetting, Digital Press, 1979, Part II, pp. 41-50.
O. E. Agazzi et al., "Connected and Degraded Text Recognition Using Planar Hidden Markov Models," in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP) 1993, Apr. 1993, pp. V-113-V-116.
K. Y. Wong, R. G. Casey and F. M. Wahl, "Document Analysis System", IBM J Res Develop., vol. 26, No. 6, Nov. 1982, pp. 647-656.
Thomas M. Breuel, "A system for the off-line recognition of handwritten text" in Proceedings of the International Conference on Pattern Recognition (ICPR), Oct. 9-13, 1994, Jerusalem, Israel, pp. 129-134.
F. Chen, L. Wilcox, and D. Bloomberg, "Word Spotting in scanned images using Hidden Markov Models," International Conference on Acoustics, Speech, and Signal Processing, Minneapolis, Apr., 1993, pp. 1-4.
