Reexamination Certificate
2000-04-25
2003-11-25
Wu, Jingge (Department: 2623)
Image analysis
Image segmentation
Segmenting individual characters or words
C382S257000
active
06654495
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to a method and apparatus for removing ruled lines from an image containing ruled lines and characters at high speed.
BACKGROUND OF THE INVENTION
Recently, in optical character recognition (OCR), restrictions on the characters and forms that can be input to an OCR apparatus have been increasingly relaxed. For example, there is a growing need to cut out handwritten characters written inside black frames, or to process characters written on forms that were not originally designed for processing by an OCR apparatus. To this end, methods for removing ruled lines from an image containing ruled lines and other image components such as characters have been proposed. By way of example, “forms processing” is performed as described below.
1. Forms Processing:
There is strong demand for processing existing business forms by OCR without making any changes to them. Because the borders of character entry fields on such a form are printed as thin black lines, and there is no clear area between those lines, characters often touch or intersect the borders. To address this problem, a method is known in which a blank form is registered in advance, a filled form is superimposed on it, and the image patterns that match between both forms are removed from the image as background. This method, however, has two problems: it requires time and effort for pre-registration of blank forms, and character components deleted along with the ruled line portions cannot be interpolated afterward.
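The blank-form method described above can be illustrated with a minimal sketch. It assumes binary images stored as lists of rows (1 = black, 0 = white) and assumes the two forms are already perfectly aligned; the function name `subtract_blank_form` is hypothetical and is not taken from any cited method.

```python
from typing import List

Image = List[List[int]]  # assumed representation: 1 = black pixel, 0 = white pixel


def subtract_blank_form(filled: Image, blank: Image) -> Image:
    """Remove every pixel of the filled form that is also black in the blank form.

    Assumes both images have the same size and are perfectly registered.
    """
    if len(filled) != len(blank) or any(
        len(f_row) != len(b_row) for f_row, b_row in zip(filled, blank)
    ):
        raise ValueError("filled and blank forms must have the same dimensions")
    # filled AND (NOT blank), evaluated pixel by pixel
    return [
        [f & (1 - b) for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(filled, blank)
    ]


# A blank form with one horizontal ruled line, and a filled form in which
# a vertical character stroke crosses that line.
blank_form = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
filled_form = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
result = subtract_blank_form(filled_form, blank_form)
```

In this toy case the stroke pixel that lies on the ruled line is deleted together with the line, leaving the stroke broken in two, which is the second problem noted above.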
As methods for solving the above-mentioned problems, the Global Interpolation Method (GIM) and a method for removing ruled lines which is disclosed in Published Unexamined Patent Application No. 9-185726 are known.
2. High-quality Segmentation of Characters Overlapping a Border by the GIM:
Naoi et al. propose the GIM, which interpolates a missing pattern by a global evaluation of geometrical information and topological structure, in “Segmentation for Handwritten Characters Overlapped a Border by Global Interpolation Method” (by Naoi, Yabuki et al., IEICE Technical Report, NLC92-22, PRU93-25, pp. 33-40, July, 1993). In the GIM, missing patterns are interpolated to connect the discontinuous patterns. The interpolated patterns are provided to a recognition and post-processing system as candidate patterns, and the position and size of these candidate patterns are globally evaluated to correct the interpolated region. The topological structure, such as connectivity, in the corrected interpolated region is then evaluated, and re-interpolation of the missing pattern is repeated to produce further candidate patterns. These candidate patterns are provided to the recognition and post-processing system in sequence until a predetermined topological structure is obtained.
3. Method for Removing Ruled Lines Disclosed in Published Unexamined Patent Application No. 9-185726:
In this method, information about the position of ruled lines in a form image held in image memory is first stored in ruled line position storing memory. A ruled line deleting means uses this stored position information to remove the ruled lines from the form image in the image memory, and stores in disconnection position storing memory the coordinate values indicating where a character component is disconnected because the character interfered with a ruled line that was removed. Then, a character restoring means analyzes the graphics structure of the form image in the vicinity of each disconnection based on the ruled line position information in the ruled line position storing memory, the coordinate value indicating the disconnection position in the disconnection position storing memory, and a reference to the form image in the image memory. Finally, the interference pattern between the character and the ruled line at the disconnection position is estimated based on this analysis, and the missing portion of the character is restored based on the estimation, which completes the ruled line removal.
The GIM described above provides a good interpolated image because it simulates the mechanism of human pattern interpolation, taking the findings of psychological studies into consideration. However, the method uses many processes that impose a heavy load when implemented in software, such as labeling and thinning of black pixel components and detecting outline vector directions. The method therefore cannot, at high speed, remove a ruled line and restore the portion of a character that is lost where the character intersects the ruled line. The method disclosed in Published Unexamined Patent Application No. 9-185726, like the GIM, cannot remove ruled lines at high speed, because it stores ruled line position information in the ruled line position memory beforehand, detects during ruled line removal each disconnection in a character component at which the character and the ruled line intersect, and applies heuristic evaluation criteria to restore the connection at the disconnected positions. In addition, this method is restricted to characters constructed with a small number of strokes, because the number of combinations of heuristic evaluation determinations increases as the number of disconnections increases.
SUMMARY OF THE INVENTION
It is an objective of the present invention to provide a method and apparatus that remove ruled lines at high speed and, also at high speed, restore any portion of a character that intersects a ruled line and is deleted by the removal of the ruled line, simply by performing AND/OR operations on the entire image without specially storing disconnected position information.
The present invention relates to a method for removing ruled lines from a bit-map image containing character portions and ruled line portions, which may comprise the following steps. First, horizontal black runs in the bit-map image are detected and the positions of the detected black runs are stored as a run-length table comprising values of the horizontal start point and the length from the start point of each black run at each vertical position. A “black run” in the present invention means a continuous sequence of black pixels running in a scanning direction. The length of a black run from its start point may also be obtained by storing the start point and the end point, instead of the length itself. Then, based on the run-length table, black runs longer than a predetermined threshold value are removed from the image to remove the ruled lines. Then, in the image after the ruled lines are removed, vertically disconnected components of a character are connected. Then, the portion of the character that was deleted by the ruled line removal is extracted from the image after the connection of vertically disconnected components of the character. Finally, the extracted portion of the character is interpolated into the image from which ruled lines have been removed, restoring the portion of the character removed by the ruled line removal. In the present invention, the concept of an “image after ruled line removal” includes not only the image from which ruled lines have been removed, but also an image to which some transformation has been applied after the removal of the ruled lines. The same applies to an “image after vertically disconnected components are connected.”
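The steps above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. It assumes a binary image stored as a list of rows (1 = black, 0 = white). All function names and the `max_gap` parameter are hypothetical, and the rule used here to connect vertically disconnected components (filling short white gaps within a column) is an assumption, since the text above does not specify how the connection is made.

```python
from typing import List, Tuple

Image = List[List[int]]     # assumed representation: 1 = black, 0 = white
Run = Tuple[int, int, int]  # (vertical position, horizontal start point, length)


def build_run_length_table(img: Image) -> List[Run]:
    """Detect horizontal black runs and record their start point and length."""
    table: List[Run] = []
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                table.append((y, start, x - start))
            else:
                x += 1
    return table


def remove_long_runs(img: Image, table: List[Run], threshold: int) -> Image:
    """Erase every black run longer than the threshold (treated as ruled line)."""
    out = [row[:] for row in img]
    for y, start, length in table:
        if length > threshold:
            for x in range(start, start + length):
                out[y][x] = 0
    return out


def connect_vertical_gaps(img: Image, max_gap: int) -> Image:
    """Assumed connection rule: fill white gaps of at most max_gap rows
    that lie between two black pixels in the same column."""
    out = [row[:] for row in img]
    width = len(img[0]) if img else 0
    for x in range(width):
        last_black = None
        for y in range(len(img)):
            if img[y][x]:
                if last_black is not None and 1 <= y - last_black - 1 <= max_gap:
                    for fill_y in range(last_black + 1, y):
                        out[fill_y][x] = 1
                last_black = y
    return out


def remove_and_restore(img: Image, threshold: int, max_gap: int) -> Image:
    """Remove ruled lines, then restore character pixels using AND/OR only."""
    removed = remove_long_runs(img, build_run_length_table(img), threshold)
    connected = connect_vertical_gaps(removed, max_gap)
    restored: Image = []
    for orig_row, rem_row, con_row in zip(img, removed, connected):
        new_row = []
        for o, r, c in zip(orig_row, rem_row, con_row):
            line_pixel = o & (1 - r)    # pixel erased as part of a ruled line
            extracted = c & line_pixel  # deleted character portion (AND)
            new_row.append(r | extracted)  # interpolate it back (OR)
        restored.append(new_row)
    return restored


# A vertical stroke crossing a one-pixel-thick horizontal ruled line.
form = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]
table = build_run_length_table(form)
restored = remove_and_restore(form, threshold=4, max_gap=1)
```

In this small example the seven-pixel run is removed as a ruled line, and the single stroke pixel that lay on it is put back, so only the vertical stroke remains. No disconnection coordinates are stored; the restoration uses only pixel-wise AND and OR between whole images.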
The present invention also relates to an apparatus for removing ruled lines from an image containing ruled line portions and character portions at high speed. A ruled line removing apparatus according to the invention may comprise a black run detecting unit for detecting black runs of horizontal ruled lines in an image and storing the detected black runs in a run-length table comprising values of the horizontal start point and the length from the start point of each of the detected black runs at each vertical position at which the black run is detected, a line segment writing unit for removing black runs long
Katoh Shin
Takahashi Hiroyasu
Dang Thu Ann
International Business Machines Corporation
Wu Jingge