Patent
1993-10-08
1996-10-01
Couso, Jose L.
Image analysis
Image segmentation
Segmenting individual characters or words
382187, G06K 900
active
055617204
DESCRIPTION:
BRIEF SUMMARY
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to a method for extracting individual characters from raster images of a read-in character sequence, and in particular to handwritten or typed character sequences having a free pitch.
2. Description of the Related Art
In automatic character recognition, raster image conditioning requires, inter alia, isolating the segments of the read-in character sequence that are each associated with an individual character. As long as the raster images of individual characters are intrinsically cohesive and are bounded on both sides by white regions, such as white columns or white paths, extracting the individual characters presents no particular difficulty. However, this "ideal case" does not exist in closely written handwriting and typing, since there the individual characters often overlap and/or are in contact with one another. This considerably complicates separation of the characters, or even makes it completely impossible, because there are no longer any white columns or white paths between the letters. If the writing is merely set closer together but has a fixed pitch (i.e. normal typing, or handwriting in small pre-printed raster boxes), so-called comb segmenting methods (as disclosed in Wissenschaftliche Berichte [Scientific Reports] AEG-TELEFUNKEN, Volume 47, Number 3/4, March 1974, pages 90-99, Berlin, Dr. J. Schurman: "Bildvorbereitung für die automatische Zeichenerkennung" [Image processing for automatic character recognition]) can often be used successfully to estimate the pitch and to find the segmenting columns. In principle, however, this is not possible for printed documents produced on typewriters with proportional spacing or on composing machines, nor for free handwriting, for which reason previous character recognition methods cannot process such character strings.
The statistical method disclosed in European Patent Application 0 047 512 does in principle allow such character strings to be separated, but its results are not of sufficient quality for free handwriting.
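The "ideal case" described above, in which every character is bounded by white columns, can be illustrated with a minimal Python sketch. The binary image layout (1 = ink, 0 = white) and the function names `white_columns` and `split_at_white_columns` are assumptions made for this illustration and do not come from the patent. The second example shows how a single touching pixel removes the white column and merges two characters into one segment.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels; 1 = ink (black), 0 = white (assumed layout)


def white_columns(img: Image) -> List[bool]:
    """Return True for every column that contains no ink pixel."""
    return [all(row[c] == 0 for row in img) for c in range(len(img[0]))]


def split_at_white_columns(img: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of runs of ink columns."""
    segments: List[Tuple[int, int]] = []
    start = None
    for c, is_white in enumerate(white_columns(img)):
        if not is_white and start is None:
            start = c                      # a new segment begins
        elif is_white and start is not None:
            segments.append((start, c))    # a white column closes the segment
            start = None
    if start is not None:                  # segment running to the right edge
        segments.append((start, len(img[0])))
    return segments


# Three characters separated by white columns 2 and 4.
spaced = [
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 1],
]
# Same image, but the first two characters touch in column 2.
touching = [
    [1, 1, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 1],
]

spaced_segments = split_at_white_columns(spaced)      # [(0, 2), (3, 4), (5, 7)]
touching_segments = split_at_white_columns(touching)  # [(0, 4), (5, 7)]
```

In the touching case the white-column test alone cannot recover the boundary between the first two characters, which is the situation the related-art discussion identifies as the difficulty.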
SUMMARY OF THE INVENTION
The present invention is now based on the object of ensuring extraction of individual characters from raster images of read-in handwriting and typing, even in the case of individual characters which overlap, interlock or are in contact with one another.
The object is achieved according to the invention as follows. Starting from a left-hand separation column, an image line region is extracted from the raster image line by presetting the next reliable right-hand separation column. The image line region is converted using two-dimensional normalization into a normalized image having a fixed predetermined height and a correspondingly matched width. A separation image having a standardized width is produced from the normalized image in such a manner that inversely proportional components of the normalized image are transferred into the separation image in accordance with the ratio of the matched width to the standardized width. With the aid of a separation classifier, image-pattern-specific separation values are calculated per column for the separation image, and the column having the maximum separation value is defined as the right-hand separation column. On the basis of the separation column predetermined by the separation classifier, an attempt is made, starting from the upper pixel of the separation column, to find a separation path. In the absence of a white column, this path is formed within a separation region located on both sides of the separation column, partially by contour tracing and, if no white path can be found by contour tracing, partially by forced separation along the separation column. In the method according to the invention, an optimum separation path is defined between each two characters. Such a separation path is characterized by global as
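The column-selection steps described above (normalization to a fixed height, production of a separation image of standardized width, per-column separation values, choice of the maximum) can be sketched in Python under several loudly stated assumptions. Nearest-neighbour resampling stands in for both the two-dimensional normalization and the transfer into the separation image; the source text does not specify either operation in this detail. The function `ink_count_scores` is a hypothetical placeholder for the separation classifier, which the text does not define here, and all names are invented for the illustration. The search for a separation path by contour tracing or forced separation is not covered.

```python
from typing import Callable, List

Image = List[List[int]]  # rows of pixels; 1 = ink (black), 0 = white (assumed layout)


def resize_nearest(img: Image, new_h: int, new_w: int) -> Image:
    """Nearest-neighbour resampling (assumed stand-in for the patent's scaling)."""
    old_h, old_w = len(img), len(img[0])
    return [
        [img[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def normalize_height(region: Image, fixed_h: int) -> Image:
    """Scale to a fixed height; the width is matched with the same scale factor."""
    old_h, old_w = len(region), len(region[0])
    matched_w = max(1, round(old_w * fixed_h / old_h))
    return resize_nearest(region, fixed_h, matched_w)


def ink_count_scores(sep_img: Image) -> List[float]:
    """Hypothetical classifier stand-in: fewer ink pixels, higher separation value."""
    h = len(sep_img)
    return [1.0 - sum(row[c] for row in sep_img) / h for c in range(len(sep_img[0]))]


def right_separation_column(
    region: Image,
    fixed_h: int,
    std_w: int,
    classifier: Callable[[Image], List[float]] = ink_count_scores,
) -> int:
    """Return the column of `region` with the maximum separation value."""
    normalized = normalize_height(region, fixed_h)
    sep_img = resize_nearest(normalized, fixed_h, std_w)  # standardized width
    scores = classifier(sep_img)
    best = max(range(std_w), key=scores.__getitem__)      # first maximum wins ties
    # Map the winning column back to the coordinates of the extracted region.
    return best * len(region[0]) // std_w


# Two blocks of ink that touch through a single pixel in column 3,
# so the region contains no white column at all.
region = [
    [1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1],
]

same_scale = right_separation_column(region, fixed_h=4, std_w=8)
upscaled = right_separation_column(region, fixed_h=8, std_w=16)
```

Both calls select column 3, the thinnest connection between the two blocks. A trained classifier would replace the ink-count heuristic, which is used here only so that the sketch runs end to end.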
REFERENCES:
patent: 4365234 (1982-12-01), Henrichon, Jr.
patent: 4449239 (1984-05-01), Bernhardt et al.
patent: 4680803 (1987-07-01), Dilella
patent: 5040229 (1991-08-01), Lee et al.
patent: 5253303 (1993-10-01), Nishijama et al.
patent: 5321768 (1994-06-01), Fenrich et al.
Pattern Recognition, "Recognition of Isolated and Simply Connected Handwritten Numerals", 19 (1986) No. 1, pp. 1-12.
Wiss. Ber. AEG-Telefunken 47 (1974) 3/4, "Bildvorbereitung für die automatische Zeichenerkennung", pp. 90-99 (not translated).
Lellmann Wolfgang
Muller Xaver
CGK Computer Gesellschaft Konstanz mbH