Word spotting in bitmap images using context-sensitive character

Image analysis – Pattern recognition – Template matching

Patent

Details

382/228, G06K 9/68

active

055925685

ABSTRACT:
Font-independent spotting of user-defined keywords in a scanned image. Word identification is based on features of the entire word, without the need for segmentation or OCR and without the need to recognize non-keywords. Font-independent character models are created using hidden Markov models (HMMs), and arbitrary keyword models are built from the character HMM components. Word or text line bounding boxes are extracted from the image; a set of features based on the word shape (and preferably also the word internal structure) within each bounding box is extracted; this set of features is applied to a network that includes one or more keyword HMMs; and a determination is made as to whether the box contains a keyword. The identification of word bounding boxes for potential keywords includes the steps of reducing the image (say, by 2x) and subjecting the reduced image to vertical and horizontal morphological closing operations. The bounding boxes of connected components in the resulting image are then used to hypothesize word or text line bounding boxes, and the original bitmaps within the boxes are used to hypothesize words. In a particular embodiment, a range of structuring elements is used for the closing operations to accommodate the variation of inter- and intra-character spacing with font and font size.
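
The bounding-box step described in the abstract (reduce, close vertically and horizontally, then take connected-component boxes) can be illustrated with a minimal pure-Python sketch. This is not the patented implementation. The function names, the OR-style 2x reduction, the 8-connectivity, and the parameters h_gap and v_gap (line structuring-element lengths in reduced pixels) are all assumptions made for illustration. The closing relies on a known property: for a one-pixel-wide line structuring element of length k, closing a binary row fills exactly those background runs shorter than k that have foreground on both sides, treating pixels outside the image as background.

```python
from collections import deque


def reduce_2x(img):
    """Halve both dimensions; an output pixel is on if any pixel in its 2x2 block is on.
    (The OR rule is an assumption; the abstract only says the image is reduced.)"""
    h, w = len(img), len(img[0])
    return [[int(any(img[y][x]
                     for y in range(2 * r, min(2 * r + 2, h))
                     for x in range(2 * c, min(2 * c + 2, w))))
             for c in range((w + 1) // 2)]
            for r in range((h + 1) // 2)]


def close_rows(img, k):
    """Horizontal closing with a 1 x k line element: fill interior gaps shorter than k."""
    out = [row[:] for row in img]
    for row in out:
        last_on = None
        for x, v in enumerate(row):
            if v:
                if last_on is not None and 0 < x - last_on - 1 < k:
                    for g in range(last_on + 1, x):
                        row[g] = 1
                last_on = x
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def component_boxes(img):
    """Bounding boxes (x0, y0, x1, y1), inclusive, of 8-connected components."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny in range(max(cy - 1, 0), min(cy + 2, h)):
                    for nx in range(max(cx - 1, 0), min(cx + 2, w)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


def word_boxes(img, h_gap, v_gap):
    """Hypothesize word boxes in original-image coordinates (hypothetical helper)."""
    h, w = len(img), len(img[0])
    small = reduce_2x(img)
    small = transpose(close_rows(transpose(small), v_gap))  # vertical closing
    small = close_rows(small, h_gap)                        # horizontal closing
    return sorted((2 * x0, 2 * y0, min(2 * x1 + 1, w - 1), min(2 * y1 + 1, h - 1))
                  for x0, y0, x1, y1 in component_boxes(small))
```

Calling word_boxes with several h_gap values and collecting the union of the resulting boxes would be one way to mimic the "range of structuring elements" embodiment: small values keep tightly set words apart, while larger values merge the characters of widely spaced fonts.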

REFERENCES:
patent: 3969698 (1976-07-01), Bollinger et al.
patent: 4155072 (1979-05-01), Kawa
patent: 4754489 (1988-06-01), Bokser
patent: 5048109 (1991-09-01), Bloomberg et al.
patent: 5060277 (1991-10-01), Bokser
patent: 5075896 (1991-12-01), Wilcox et al.
patent: 5081690 (1992-01-01), Tan
patent: 5151951 (1992-09-01), Ueda et al.
patent: 5199077 (1993-03-01), Wilcox et al.
patent: 5201011 (1993-04-01), Bloomberg et al.
patent: 5237627 (1993-08-01), Johnson et al.
patent: 5321770 (1994-06-01), Huttenlocher
patent: 5325444 (1994-06-01), Cass et al.
patent: 5438630 (1995-08-01), Chen et al.
11th IAPR International Conference On Pattern Recognition, vol. II, Conference B: Pattern Recognition Methodology and Systems, "Connected and Degraded Text Recognition Using Hidden Markov Model", Chinmoy B. Bose and Shyh-shiaw Kuo of AT&T Bell Labs, 1992, pp. 116-119.
Computer Vision, Graphics And Image Processing, vol. 35, 1986, pp. 111-127, T. Pavlidis "A vectorizer and feature extractor for document recognition" p. 112, line 9-14; figure 1.
ICASSP-92 IEEE Int. Conf. On Acoustics, Speech And Signal Processing, vol. 2, 23 Mar. 1992, San Francisco, CA, pp. 97-100, XP356946, L. D. Wilcox and M. A. Bush, "Training and search algorithms for an interactive wordspotting system"; figure 1.
IBM Journal Of Research And Development, vol. 26, No. 6, Nov. 1982, New York, NY, pp. 681-686, N. F. Brickman, "Word autocorrelation redundancy match (WARM) technology" Section 2. System overview; figure 2.
United States Postal Service Advanced Technology Conference, vol. One, Nov. 5-7, 1990, "A Word Shape Analysis Approach to Recognition of Degraded Word Images"; Tin Kam Ho, Jonathan J. Hull, Sargur N. Srihari of Department of Computer Science, State University of New York at Buffalo, pp. 217-231.
IEEE Trans. On Acoustics, Speech And Signal Processing, vol. 38, No. 11, Nov. 1990, pp. 1870-1878, J. G. Wilpon, et al., "Automatic recognition of keywords in unconstrained speech using hidden markov models", p. 1871, right column, line 18-line 46; figure 3.
Systems & Computers In Japan, vol. 21, No. 4, 1990, New York, NY, pp. 26-35, XP159200, T. Nakano, et al., "A new recognition method for stamped and painted alphanumerals" Section 2.1 Principle; figures 3-7.
Dan S. Bloomberg, "Multiresolution Morphological Approach to Document Image Analysis", Proceedings of the Int. Conf. on Document Analysis and Recognition, Saint-Malo, France, Sep. 1991, pp. 963-971.
Simon Kahan et al., "On the Recognition of Printed Characters of Any Font and Size", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-9, No. 2, Mar. 1987, pp. 274-288.
Chinmoy B. Bose et al., "Connected and Degraded Text Recognition Using Hidden Markov Model", Proceedings of the Int. Conf. on Pattern Recognition, Netherlands, Sep. 1992, pp. 116-119.
Tin Kam Ho et al., "A Word Shape Analysis Approach to Recognition of Degraded Word Images", Proceedings of the USPS Advanced Technology Conference, Nov. 1990, pp. 217-231.
Yang He et al., "Handwritten Word Recognition Using HMM with Adaptive Length Viterbi Algorithm", Proceedings of the Int. Conf. on Acoustics, Speech and Signal Processing, San Francisco, California, Mar. 1992, vol. 3, pp. 153-156.
Douglas B. Paul et al., "Speaker Stress-Resistant Continuous Speech Recognition", Proceedings of the Int. Conf. on Acoustics, Speech and Signal Processing, 1988, pp. 283-286.
Lawrence R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, vol. 77, No. 2, Feb. 1989, pp. 257-285.
Lawrence R. Rabiner et al., "An Introduction to Hidden Markov Models", IEEE ASSP Magazine, Jan. 1986, pp. 4-16.
