Multiple size reductions for image segmentation

Image analysis – Image segmentation

Reexamination Certificate

Details

C382S298000

active

06532302

ABSTRACT:

TECHNICAL FIELD
The present invention relates generally to image segmentation, and more particularly, to a method and system for image segmentation through multiple reductions of the size of an image.
BACKGROUND ART
In general, segmentation is the first step in the process of image recognition. Segmentation may be defined as the identification and separation of clusters of mutually close objects, that is, objects that are closer to each other than to any external object. The goal of segmentation is to extract target objects from the separated clusters that are characterized by such parameters as size, shape, granularity, texture, intensity of color, and location.
An aerial photograph, for example, may be segmented by identifying various target objects, i.e. landmarks, with different shapes and textures, such as fields, roads, buildings, bodies of water, and the like. Thereafter, the segmented objects may be extracted and compared with a database of such objects in order to identify the geographical location of the scene in the photograph.
Similarly, the process of segmentation is generally the first step in optical character recognition (OCR), in which a document is electronically scanned and converted into a form that can be easily manipulated by, for example, a word processor. Many documents, however, are complex, including two or more columns of text, as well as photographs, diagrams, charts, and other objects. Therefore, such documents are initially segmented in order to extract blocks of text for analysis.
In the OCR context, segmentation is often referred to as “line extraction” because it typically involves segmenting the document into a plurality of lines. Generally, lines are the basic unit of extraction because they indicate the flow of the text. In a multi-column document, for example, knowledge of the line layout is essential to correctly interpreting the meaning of the text. Moreover, in recognizing a word or character, knowledge of the surrounding words and characters in a line permits the use of contextual and geometric analysis in resolving ambiguities.
Conventionally, segmentation is performed using a “bottom up” or “connected component” approach. This method involves decomposing the image into basic entities (connected components) and aggregating those entities according to some rule. For example, in a page of text, a single character is generally the most basic connected component. During segmentation, a character is identified and assigned a minimum bounding rectangle (MBR), which is defined as the smallest rectangle that completely contains a discrete pattern of a connected component. Thereafter, all of the MBRs within a certain distance from each other are aggregated. If the correct distance is chosen, the aggregated MBRs will form horizontal connected components representing lines of text, which may then be extracted for analysis.
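The bottom-up approach described above can be sketched as follows. This is a minimal illustration, not the method of any cited system: the function names (connected_components, gap, aggregate), the use of 8-connectivity, the gap measure, and the max_gap parameter are all assumptions made for the example.

```python
from collections import deque

def connected_components(img):
    """Return the MBR (top, left, bottom, right) of each 8-connected foreground blob."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                top, left, bottom, right = r, c, r, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

def gap(a, b):
    """Blank pixels separating two rectangles on the worse axis (0 if touching)."""
    dy = max(a[0] - b[2] - 1, b[0] - a[2] - 1, 0)
    dx = max(a[1] - b[3] - 1, b[1] - a[3] - 1, 0)
    return max(dy, dx)

def aggregate(boxes, max_gap):
    """Merge MBRs pairwise until no two remaining rectangles are within max_gap."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if gap(boxes[i], boxes[j]) <= max_gap:
                    a, b = boxes[i], boxes[j]
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes

# Hypothetical page: three "characters" on the first line, two on the second.
page = [
    "##.##.##",
    "##.##.##",
    "........",
    "........",
    "........",
    "##.##...",
    "##.##...",
]
img = [[ch == "#" for ch in row] for row in page]
chars = connected_components(img)    # one MBR per character
lines = aggregate(chars, max_gap=1)  # characters merged into text lines
```

With max_gap=1 the one-pixel gaps between characters are bridged while the three-pixel gap between the two lines is not, so each line ends up as one rectangle. The repeated pairwise distance checks in aggregate are the kind of per-component work that makes this approach costly on complex pages.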
Segmentation is performed automatically and almost instantly by the human brain. For example, when a person looks at a document, he or she can easily identify the text portions among a variety of other objects. However, as currently implemented, conventional methods and systems for image segmentation are slow and inefficient. This is particularly true with respect to segmenting complex documents including, for example, more than one column of text, halftone regions, graphics, and handwritten annotations.
Conventional approaches are time consuming because they must decompose the sample image, identify each of the individual connected components, calculate the distances between the components, and aggregate those components within a certain distance from each other. For complex documents, this process can result in a large number of calculations, and accounts for a significant portion of the overall processing time in image recognition. What is needed, then, is a segmentation method and system that is significantly faster than conventional approaches.
DISCLOSURE OF INVENTION
The present invention offers a more efficient, holistic approach to image segmentation. Briefly, the present invention recognizes the fact that components of a document, when viewed from a distance, tend to solidify and aggregate. For instance, if a person stands at a distance from a printed page, the lines of text appear to blur and, for practical purposes, become solid lines. This effect can be simulated on a computer by reducing the size or resolution of a scanned image. For example, as shown in FIG. 1, several characters on a line become a single connected component at a reduction of 1:4.
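The solidifying effect of a 1:4 reduction can be illustrated with a small sketch. The text above does not say how the reduction is computed; the example assumes a block-wise OR, in which an output pixel is foreground if any pixel in the corresponding 4x4 block is foreground. The names reduce_or and count_runs are hypothetical.

```python
def reduce_or(img, factor):
    """Shrink a binary image by `factor`, marking an output pixel as
    foreground if ANY pixel in the corresponding block is foreground."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            block = (img[y][x]
                     for y in range(r, min(r + factor, h))
                     for x in range(c, min(c + factor, w)))
            row.append(1 if any(block) else 0)
        out.append(row)
    return out

def count_runs(row):
    """Count maximal horizontal runs of foreground pixels in one row."""
    return sum(1 for i, v in enumerate(row) if v and (i == 0 or not row[i - 1]))

# Hypothetical text line: four 3-pixel-wide characters with 1-pixel gaps.
line = ["###.###.###.###."] * 4
img = [[1 if ch == "#" else 0 for ch in row] for row in line]

before = count_runs(img[0])      # four separate characters
reduced = reduce_or(img, 4)      # 4x16 image becomes 1x4
after = count_runs(reduced[0])   # one solid run
```

Under this assumption, a gap narrower than the block size disappears, while a gap that covers an entire block survives the reduction. That is what lets closely spaced characters merge while widely separated regions stay apart.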
By exploiting this effect, a more efficient and substantially faster method for image segmentation is realized. According to the present invention, a size reduction unit (134) reduces the size of a sample image (144), and, at the same time, fills small gaps between foreground pixels. As noted above, size reduction tends to solidify clusters of connected components separated by narrow gaps. Thereafter, a connected component analyzer (136) identifies connected components and their associated minimum bounding rectangles in the reduced image (145). Next, a target object filter (138) searches the connected components for target objects, making use of a target object library (146) to identify target objects characterized by such parameters as size, shape, and texture. Finally, an inverse mapper (140) locates the bounding rectangles of the target objects in the original sample image (144), and extracts the associated portions of the image (144) for analysis in a conventional image classifier (142).
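The four stages described above (size reduction, connected component analysis, target filtering, inverse mapping) can be sketched end to end. This is an illustrative sketch under stated assumptions, not the patented implementation: the block-wise OR reduction, 4-connectivity, the width-only filter standing in for the target object library, and all function and parameter names are hypothetical.

```python
def reduce_or(img, k):
    """Size reduction: output pixel is 1 if any pixel in its k x k block is 1."""
    h, w = len(img), len(img[0])
    return [[1 if any(img[y][x]
                      for y in range(r, min(r + k, h))
                      for x in range(c, min(c + k, w))) else 0
             for c in range(0, w, k)]
            for r in range(0, h, k)]

def components(img):
    """Connected component analysis: MBRs (top, left, bottom, right), 4-connected."""
    h, w = len(img), len(img[0])
    seen, boxes = set(), []
    for r in range(h):
        for c in range(w):
            if img[r][c] and (r, c) not in seen:
                seen.add((r, c))
                stack = [(r, c)]
                top, left, bottom, right = r, c, r, c
                while stack:
                    y, x = stack.pop()
                    top, left = min(top, y), min(left, x)
                    bottom, right = max(bottom, y), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

def is_target(box, min_width):
    """Stand-in target filter: keep components at least min_width reduced pixels wide."""
    return box[3] - box[1] + 1 >= min_width

def inverse_map(box, k, h, w):
    """Map an MBR in the reduced image back to original-image coordinates."""
    top, left, bottom, right = box
    return (top * k, left * k,
            min((bottom + 1) * k, h) - 1, min((right + 1) * k, w) - 1)

def segment(img, k, min_width):
    """Run all four stages; return target rectangles in the original image."""
    h, w = len(img), len(img[0])
    reduced = reduce_or(img, k)
    return [inverse_map(b, k, h, w)
            for b in components(reduced) if is_target(b, min_width)]

# Hypothetical sample: one text line (rows 0-3) and an isolated speck (row 9).
rows = (["###.###.###.###."] * 4 + ["." * 16] * 4
        + ["." * 16, "." * 13 + "#" + "..", "." * 16, "." * 16])
img = [[1 if ch == "#" else 0 for ch in row] for row in rows]
targets = segment(img, k=4, min_width=2)
crops = [[row[l:r + 1] for row in img[t:b + 1]] for t, l, b, r in targets]
```

In this sketch the component search runs on a 3x4 reduced image rather than the 12x16 original, and no pairwise distances between characters are computed. The rectangles returned by inverse_map are aligned to block boundaries, so each one may include a few background pixels around the target.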


REFERENCES:
patent: 5202933 (1993-04-01), Bloomberg
patent: 5434953 (1995-07-01), Bloomberg
patent: 5680479 (1997-10-01), Wang et al.
patent: 5778092 (1998-07-01), MacLeod et al.
patent: 5809167 (1998-09-01), Al-Hussein
patent: 5848185 (1998-12-01), Koga et al.
patent: 5903904 (1999-05-01), Peairs
patent: 0 657 838 (1995-06-01), None
Deforges, O., et al., “A Fast Multiresolution Text-Line and Non Text-Line Structures Extraction and Discrimination Scheme for Document Image Analysis”, Proceedings of the International Conference on Image Processing (ICIP), Nov. 13, 1994, Los Alamitos, California, U.S.A.
Fletcher, L. A., et al., “A Robust Algorithm For Text String Separation From Mixed Text/Graphics Images”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 10, No. 6, Nov. 1988, New York, New York, U.S.A.
