Image analysis – Image segmentation – Distinguishing text from other regions
Reexamination Certificate
1998-06-29
2001-05-15
Johns, Andrew W. (Department: 2621)
active
06233353
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to optical character recognition (OCR) systems, and in particular relates to a method for distinguishing, in an image being analyzed, regions containing lines of alphanumeric text characters from regions containing line drawings.
BACKGROUND OF THE INVENTION
A document recognition system is a system that takes as input a digitized image of a document and outputs a text-based digital representation of the document. The output representation captures, at least, the text in the document as recognized by the system, but may also capture the graphics and layout of that document as recognized by the system. The output produced is in a format suitable as input to some target text processing application. That target application may, for example, be a word processor, text editor, or a spreadsheet program.
A hypothetical document recognition system might typically be logically and structurally decomposed into four subsystems: a segmentation subsystem, a character recognition subsystem, a format/layout analysis subsystem, and an output subsystem.
The segmentation subsystem is usually at the front end of the system and serves to segment the image into distinct text regions and graphics regions. This invention would be integrated as part of this subsystem. This invention extends the capability of the segmentation subsystem to segregate line drawing regions from text regions.
The character recognition subsystem analyzes the imaged text rendered in a particular region of the image in order to produce as output the underlying text corresponding to that imaged text. It converts the image of text to character codes for that text. The character recognition subsystem should be restricted to processing purely text regions; it should ignore any graphical (non-text) regions in the document. If the character recognition subsystem were to encounter a graphics region, it would assume that the region is text and may generate spurious characters. Also, the character recognition subsystem may waste CPU time trying to interpret graphical elements in such a region as text. By improving the ability of the segmentation subsystem to efficiently distinguish graphics regions from text regions, this invention reduces the processing load of the character recognition subsystem. It also has the potential to increase the accuracy of the output of this subsystem by eliminating the generation of any spurious characters resulting from processing a graphics region.
The format/layout analysis subsystem analyzes the output of both the segmentation and character recognition subsystems in an attempt to capture the format and layout of the document in a representation internal to the system. The layout analysis subsystem may need information on the position of all graphical regions so as to properly construct a representation of the document. For example, such knowledge is critical in determining which text regions are captions (which would be treated specially because they do not form part of the main text flow of the document).
The document output subsystem converts this internal representation to produce an output that is suitable for a particular target document processing application. It may be desirable to “cut” graphics regions out of the document image so that they can be “pasted” as image into the text document representation that is output by the document recognition system if the output representation accommodates this. Since this invention extends the class of regions that are accurately classified as having dominantly graphical content, the invention will enhance this aspect of the output subsystem.
A document recognition system may retain graphics regions as digital image throughout the system. These regions can be embedded as image in the output that the target application receives or can be entirely dropped from the output representation.
The graphical regions of the image may be broadly classified into two types, those that are representations of continuous-tone images, such as photographs, and those that are not intrinsically continuous tone in origin, such as cartoons, drawings, maps, flowcharts, and diagrams. This disclosure refers to the graphic elements that are not continuous tone in nature as line drawings.
In this disclosure it is assumed that the image being processed is binary, meaning it has a depth of 2 and pixels can only take on values from the set {0,1} representing white and black respectively. In a binary image, continuous tone regions from a grayscale image may be rendered as halftones where the ratio of white to black pixels in a small area reflects the gray level in the grayscale rendering over that area. This technique is called dithering. Line drawings and text are intrinsically not continuous in tone; such regions are ideally rendered from a grayscale image by thresholding.
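The thresholding step mentioned above can be illustrated with a minimal sketch. It assumes a grayscale image stored as rows of integers in the range 0 to 255, with 0 as darkest; the function name `threshold` and the cutoff of 128 are hypothetical choices for illustration and are not taken from this disclosure. The output follows the convention stated above, with 0 for white and 1 for black.

```python
def threshold(gray, cutoff=128):
    """Map a grayscale image (rows of 0..255 ints, 0 = darkest) to a binary
    image using the convention 0 = white, 1 = black.

    The cutoff value is an assumed, illustrative parameter.
    """
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


# A tiny 2x4 grayscale sample: light background with a few dark pixels.
gray = [
    [255, 250, 30, 20],
    [240, 10, 15, 245],
]
binary = threshold(gray)
```

A fixed cutoff suits text and line drawings, which have no intermediate tones to preserve; a halftoned photograph would instead be produced by dithering, which this sketch does not attempt.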
DESCRIPTION OF THE PRIOR ART
In the prior art, U.S. Pat. No. 5,202,933 discloses a system for segmenting text regions from graphic regions in an image. The method uses morphological operations, preferably at a reduced scale, to eliminate vertical rules and lines from an image, followed by the elimination of horizontal rules and lines. The remaining portion of the image data is considered a text region. The morphology used in this patent exploits the notion that text images are generally packed tightly and that line drawings have sparse pixel distributions with thin strokes.
U.S. Pat. No. 5,335,290 discloses another system for segmenting text regions, picture regions and line drawings in a document image. Run lengths of on-pixels for each scanline from the image are used to find characteristics of text. This is a “bottom up” method of identifying text regions, in that all of the data in a particular image is tested to be text, and those portions of data which are not determined to be text are then classified as picture or line drawings.
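As an illustration of what "run lengths of on-pixels for each scanline" means, the following sketch extracts the runs of black pixels from one row of a binary image. It is not the method of the cited patent; the function name `on_pixel_runs` and the (start, length) representation are assumptions made for this example.

```python
from itertools import groupby


def on_pixel_runs(scanline):
    """Return a list of (start, length) pairs, one per run of 1s in the row.

    `scanline` is a sequence of 0/1 values (0 = white, 1 = black).
    """
    runs = []
    pos = 0
    for value, group in groupby(scanline):
        length = len(list(group))
        if value == 1:
            runs.append((pos, length))
        pos += length  # advance past this run, black or white
    return runs


runs = on_pixel_runs([0, 1, 1, 0, 1, 1, 1, 0])
```

Statistics over such runs (for example, their typical lengths and spacing) are the kind of feature a run-length-based classifier could examine.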
U.S. Pat. No. 5,434,953 discloses another system for distinguishing text from line drawings in an image. First, a low-resolution consideration of the image, such as 16 pixel×16 pixel tiles, is used for sub-sampling of the various regions in the image.
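One simple way to obtain a low-resolution view from tiles is sketched below. The rule used here, marking a tile black if it contains any black pixel, is an assumption chosen for illustration; the cited patent may reduce tiles differently. The function name `tile_reduce` is likewise hypothetical.

```python
def tile_reduce(image, tile=16):
    """Reduce a binary image (rows of 0/1) to one value per tile.

    Assumed rule: a tile becomes 1 (black) if any pixel inside it is black.
    Partial tiles at the right and bottom edges are handled by clamping.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    reduced = []
    for ty in range(0, height, tile):
        row = []
        for tx in range(0, width, tile):
            has_black = any(
                image[y][x] == 1
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            )
            row.append(1 if has_black else 0)
        reduced.append(row)
    return reduced


# A 4x4 sample reduced with 2x2 tiles to keep the example small.
sample = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
small = tile_reduce(sample, tile=2)
```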
U.S. Pat. No. 5,592,574 discloses a method of processing an original image, in particular an image including text, which permits manipulation of the image, such as changing its aspect ratio, while adjusting only white space between lines of text, so that the text itself is not distorted.
U.S. Pat. No. 5,680,479 discloses another system for distinguishing text regions from non-text regions in an image. Connected components are identified in the pixel image data, and the text/non-text determination is made based on the size of the connected components and whether the text units are connectable to horizontally-adjacent connected components which may be other characters.
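To make the notion of a connected component and its size concrete, the sketch below labels black regions in a binary image and reports their bounding boxes. It is a generic breadth-first labeling, not the procedure of the cited patent; 8-connectivity, the function names, and the `max_side` limit of 40 pixels are all assumptions for illustration.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    regions of black (1) pixels, in raster-scan order of first pixel."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                # Visit all eight neighbours (assumed connectivity).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def is_character_sized(box, max_side=40):
    """Hypothetical size test: both box dimensions at most max_side pixels."""
    top, left, bottom, right = box
    return (bottom - top + 1) <= max_side and (right - left + 1) <= max_side


sample = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
boxes = connected_components(sample)
```

A size test of this kind would treat small, compact components as candidate characters and very large ones as candidate graphics; the horizontal-adjacency check described above is not shown.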
SUMMARY OF THE INVENTION
According to one aspect of the present invention, there is provided a method of analyzing data forming a two-dimensional image, comprising the step of identifying a subset of black pixels in the data as likely to belong to a line drawing region if there is not a predetermined arrangement of horizontal runs of white pixels above and below the subset of black pixels.
According to another aspect of the present invention, this step of identifying pixels as likely to belong to a line drawing region includes identifying the subset of data as likely to belong to a line drawing region if W/H is not within a predetermined range, wherein the data is characterized by a first run of white pixels overlapping a second run of white pixels by a minimum length W, the first run of white pixels being spaced from the second run of white pixels by a vertical separation H.
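The W/H criterion can be sketched as follows, under stated assumptions: each white run is given as a (start, end) column pair with `end` exclusive, W is taken to be the horizontal overlap of the two runs, and H is the difference between their row indices. The range `(2.0, 50.0)` is a placeholder, since the predetermined range is not given in this passage, and the function names are hypothetical.

```python
def overlap_length(run_a, run_b):
    """Horizontal overlap of two runs given as (start, end), end exclusive."""
    return max(0, min(run_a[1], run_b[1]) - max(run_a[0], run_b[0]))


def likely_line_drawing(run_above, row_above, run_below, row_below,
                        ratio_range=(2.0, 50.0)):
    """Return True if W/H falls outside the (assumed) predetermined range.

    run_above, run_below: white runs above and below the black pixels.
    row_above, row_below: row indices of those runs, row_above < row_below.
    ratio_range: placeholder bounds; the real range is not stated here.
    """
    w = overlap_length(run_above, run_below)
    h = row_below - row_above
    if h <= 0:
        raise ValueError("row_below must be greater than row_above")
    low, high = ratio_range
    return not (low <= w / h <= high)


# Wide white runs close together, as above and below a line of text.
text_like = likely_line_drawing((0, 100), 10, (20, 120), 20)
# Short overlap with a large vertical gap.
drawing_like = likely_line_drawing((0, 30), 0, (10, 40), 40)
```

In the first call W = 80 and H = 10, so W/H = 8 lies inside the placeholder range; in the second W = 20 and H = 40, so W/H = 0.5 lies outside it.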
According to another aspect of the present invention, there is provided a method of analyzing full-resolution data forming a two-dimensional image. Low-resolution data is derived from the f
Azaria Seyed H.
Hutter R.
Johns Andrew W.
Xerox Corporation