Using shape suppression to identify areas of images that...

Image analysis – Image segmentation – Distinguishing text from other regions

Reexamination Certificate

Details

U.S. classifications: C382S200000, C382S203000
Type: Reexamination Certificate
Status: active
Patent number: 06738512

ABSTRACT:

TECHNICAL FIELD
This invention relates to optical character recognition generally, and more particularly to using shape suppression to identify areas in images that include particular shapes.
BACKGROUND OF THE INVENTION
Computer technology is continually advancing, providing computers with continually increasing capabilities. General-purpose multimedia personal computers (PCs) have now become commonplace, providing a broad range of functionality to their users, including the ability to manipulate visual images. Further advances are being made in more specialized computing devices, such as set-top boxes (STBs) that operate in conjunction with a traditional television and make specialized computer-type functionality available to the users (such as accessing the Internet).
Many such general-purpose and specialized computing devices allow for the display of visual images with text. Given the additional abilities of such devices in comparison to conventional televisions, many situations arise where it would be beneficial to be able to identify the text within a particular visual image. For example, a video image may include a Uniform Resource Locator (URL) identifying a particular web page that can be accessed via the Internet. If the URL text could be identified, then the text could be input to a web browser and the corresponding web page accessed without requiring manual input of the URL by the user.
Identification of text or characters is typically referred to as Optical Character Recognition (OCR), and various techniques are known for performing it. However, many OCR techniques require, or are made more accurate by, identifying the specific areas within a visual image that contain text before the OCR technique is applied (that is, only the areas that might contain text are input to the OCR process). The accuracy of current techniques for identifying such areas is poor, often due to the nature of the underlying video images: text can be “on top” of a wide range of different backgrounds and textures in the underlying image, and distinguishing such backgrounds from the text can be very difficult.
The invention described below addresses these disadvantages, providing text frame detection in video images using shape suppression.
SUMMARY OF THE INVENTION
The use of shape suppression to identify areas of images that include particular shapes is described herein. Such shapes can be, for example, letters, numbers, punctuation marks, or other symbols in any of a wide variety of languages.
According to one embodiment, a set of shape characteristics that identify the vertical edges of a set of shapes (e.g., English letters and numbers) is maintained. The vertical edges in the image are analyzed and compared to the set of shape characteristics using a Vector Quantization (VQ)-based shape classifier. Areas in which these edges match any of the shape characteristics are identified as potential areas of the image that include one or more of the set of shapes.
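For illustration, the following Python sketch (a hypothetical toy example, not the patent's actual classifier or codebook) shows the general idea of a VQ-based shape classifier: each vertical edge is summarized as a feature vector, the nearest codeword in a codebook of shape characteristics is found, and the edge is treated as part of a shape only if that nearest codeword is close enough.

import numpy as np

def classify_edge(feature, codebook, threshold):
    """Return True if the edge feature matches a shape codeword.

    feature   -- 1-D feature vector describing one vertical edge
    codebook  -- 2-D array with one shape-characteristic codeword per row
    threshold -- maximum Euclidean distance that still counts as a match
    """
    distances = np.linalg.norm(codebook - feature, axis=1)
    return distances.min() <= threshold

# Toy usage: a two-codeword codebook and one edge feature vector.
codebook = np.array([[1.0, 0.2, 0.0],
                     [0.8, 0.9, 0.1]])
edge_feature = np.array([0.9, 0.25, 0.05])
print(classify_edge(edge_feature, codebook, threshold=0.3))  # True: the edge is kept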
According to another embodiment, a vertical differential filter is applied to a received image to generate a horizontal edge map, and a non-maxima suppression filter is applied to the horizontal edge map to generate a thinned horizontal edge map. Similarly, a horizontal differential filter is applied to the received image to generate a vertical edge map, and the non-maxima suppression filter is applied to the vertical edge map to generate a thinned vertical edge map. A segmentation process then determines a set of areas that are candidates for including particular shapes (e.g., text) based on the density of edges in the areas of the vertical edge map. The portions of the horizontal edge map corresponding to the candidate areas are then analyzed to determine whether there are a sufficient number of horizontal edges in verification windows of the candidate areas. If there are a sufficient number of horizontal edges in the verification window of a candidate area, then that candidate area is output to a shape suppression filter. The vertical edges in the candidate areas are then compared to a set of shape characteristics using a VQ-based shape classifier. For each vertical edge, if it is classified as a shape, the edge is kept; otherwise it is removed. Based on the remaining edges, the probable areas are then selected and output as shape (e.g., text) areas.
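The edge-map stages of this embodiment can be pictured with the short Python sketch below. It is only a minimal illustration under simple assumptions (first-difference filters, a 3-pixel suppression neighborhood, fixed 16x16 candidate blocks, and an arbitrary density threshold); the function names and parameter values are illustrative and are not taken from the patent.

import numpy as np

def vertical_differential(image):
    # Differencing in the vertical direction responds to horizontal edges,
    # producing the horizontal edge map.
    grad = np.zeros_like(image, dtype=float)
    grad[1:, :] = np.abs(image[1:, :].astype(float) - image[:-1, :].astype(float))
    return grad

def horizontal_differential(image):
    # Differencing in the horizontal direction responds to vertical edges,
    # producing the vertical edge map.
    grad = np.zeros_like(image, dtype=float)
    grad[:, 1:] = np.abs(image[:, 1:].astype(float) - image[:, :-1].astype(float))
    return grad

def non_maxima_suppression(edge_map, axis):
    # Keep a pixel only if it is at least as large as its two neighbours
    # along the given axis (border wrap-around ignored for brevity).
    prev_px = np.roll(edge_map, 1, axis=axis)
    next_px = np.roll(edge_map, -1, axis=axis)
    keep = (edge_map >= prev_px) & (edge_map >= next_px)
    return np.where(keep, edge_map, 0.0)

def dense_edge_windows(thinned_vertical, win=16, min_density=0.1):
    # Mark win-by-win blocks whose fraction of nonzero edge pixels exceeds
    # min_density; these become the candidate areas handed to the later
    # verification and shape-suppression stages.
    height, width = thinned_vertical.shape
    candidates = []
    for y in range(0, height - win + 1, win):
        for x in range(0, width - win + 1, win):
            block = thinned_vertical[y:y + win, x:x + win]
            if np.count_nonzero(block) / block.size >= min_density:
                candidates.append((y, x, win, win))
    return candidates

# Toy usage on a random grayscale frame.
frame = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
horizontal_edges = non_maxima_suppression(vertical_differential(frame), axis=0)
vertical_edges = non_maxima_suppression(horizontal_differential(frame), axis=1)
print(len(dense_edge_windows(vertical_edges)))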


REFERENCES:
patent: 4193056 (1980-03-01), Morita et al.
patent: 4817166 (1989-03-01), Gonzalez et al.
patent: 4972499 (1990-11-01), Kurosawa
patent: 6101274 (2000-08-01), Pizano et al.
patent: 6181806 (2001-01-01), Kado et al.
patent: 6529635 (2003-03-01), Corwin et al.
patent: 516576 (1992-12-01)
Chun et al. “Automatic Text Extraction in Digital Videos using FFT and Neural Network.” Proc. FUZZ-IEEE '99, IEEE Int. Fuzzy Systems Conference, vol. 2, Aug. 1999, pp. 1112-1115.*
Hori. “A Video Text Extraction Method for Character Recognition.” Proc. ICDAR '99, 5th Int. Conf. on Document Analysis and Recognition, Sep. 1999, pp. 25-28.*
Quweider et al. “Efficient Classification and Codebook Design for CVQ.” IEE Proceedings—Vision, Image and Signal Processing, vol. 143, No. 6, Dec. 1996, pp. 344-352.*
Wu et al. “Finding Text in Images.” ACM, DL 97, 1997, pp. 3-12.*
Gargi et al. “Indexing Text Events in Digital Video Databases.” Proc. of the 14th Int. Conf. on Pattern Recognition, vol. 1, Aug. 1998, pp. 916-918.*
Shim et al. “Automatic Text Extraction from Video for Content-Based Annotation and Retrieval.” Proc. of the 14th Int. Conf. on Pattern Recognition, vol. 1, Aug. 1998, pp. 618-620.*
Jaisimha et al. “Fast Facet Edge Detection in Image Sequences Using Vector Quantization.” ICASSP-92, IEEE Int. Conf. on Acoustics, Speech and Signal Processing, vol. 3, Mar. 1992, pp. 441-444.*
Paglieroni et al. “The Position-Orientation Masking Approach to Parametric Search for Template Matching.” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 16, No. 7, Jul. 1994, pp. 740-747.*
Lienhart. “Automatic Text Recognition for Video Indexing.” ACM Multimedia 96, 1996, pp. 11-20.*
Yang et al. “New Classified Vector Quantization with Quadtree Segmentation for Image Coding.” 3rd Int. Conf. on Signal Processing, vol. 2, Oct. 1996, pp. 1051-1054.*
Zhong et al. “Automatic Caption Localization in Compressed Video.” Proc. International Conference on Image Processing (ICIP '99), vol. 2, Oct. 24-28, 1999, pp. 96-100.
Zhong et al. “Automatic Caption Localization in Compressed Video.” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 4, Apr. 2000, pp. 385-392.
Huiping Li et al. “Automatic Text Detection and Tracking in Digital Videos.” 40 pages, 1998.
Toshio Sato et al. “Video OCR for Digital News Archives.” IEEE International Workshop on Content-Based Access of Image and Video Databases (CAIVD 1998), 9 pages.
Toshio Sato et al. “Video OCR: Indexing Digital News Libraries by Recognition of Superimposed Captions.” ACM Multimedia Systems Special Issue on Video Libraries, Feb. 1998, 12 pages.
Kah-Kay Sung et al. “Example-based Learning for View-based Human Face Detection.” Massachusetts Institute of Technology, 21 pages, 1994.
