Image analysis – Image segmentation
Reexamination Certificate
2002-01-25
2003-12-09
Johns, Andrew W. (Department: 2621)
C382S217000, C382S177000, C382S180000, C707S793000
active
06661919
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to the display of digitally stored and/or processed images, and more particularly to a method and apparatus for displaying images on raster display devices such as laser printers and computer monitors.
Digital images can be efficiently stored, edited, printed, reproduced, and otherwise manipulated. It is therefore often desirable to convert an image, such as one on a piece of paper, into a digital representation of the image by a process known as digitization. Digital representations of an image can be primitive and non-coded (e.g., an array of picture elements or “pixels”) or may contain higher level descriptive coded information (e.g., ASCII character codes) from which a primitive representation may be generated. Generally, high level coded digital representations are more compact than primitive non-coded ones.
Optical character recognition (OCR) encompasses digitization and a method for transforming text in bitmap representation to a high level coded representation, such as ASCII character codes. In OCR digitization, text characters on a printed surface such as a sheet of paper are typically scanned by an optical scanner, which creates a bitmap of the pixels of the image. A pixel is a fundamental picture element of an image, and a bitmap is a data structure including information concerning each pixel of the image. Bitmaps, if they contain more than on/off information, are often referred to as “pixel maps.”
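As a small illustration of the distinction drawn above, the following Python sketch contrasts an on/off bitmap with a pixel map that carries more information per pixel. The 8-bit gray levels, the cutoff of 128, and the `threshold` helper are assumptions made for the example and do not come from the patent.

```python
# Hypothetical illustration: a 1-bit bitmap versus a multi-bit pixel map.

# A 3x4 bitmap: each pixel carries only on/off information.
bitmap = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

# A pixel map of the same size: each pixel carries more than on/off
# information, here an assumed 8-bit gray level (0 = black, 255 = white).
pixel_map = [
    [255, 40, 35, 250],
    [30, 245, 240, 45],
    [250, 38, 42, 255],
]


def threshold(pixels, cutoff=128):
    """Reduce a gray-level pixel map to an on/off bitmap (dark pixel -> 1)."""
    return [[1 if value < cutoff else 0 for value in row] for row in pixels]


# With these sample values the thresholded pixel map matches the bitmap.
print(threshold(pixel_map) == bitmap)
```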
Other types of processes can also digitize real-world images. Devices such as digital cameras can be used to directly create bitmaps corresponding to a captured image. A computer system can recreate the image from the bitmap and display it on a computer display or send the bitmap to a printer to be printed. Bitmap generators can be used to convert other types of image-related inputs into bitmaps which can be manipulated and displayed. Incoming facsimile (fax) data includes low-resolution bitmaps that can be manipulated, recognized, printed, etc.
Once a bitmap is input to a computer, the computer can perform recognition on the bitmap so that each portion or object of the input bitmap, such as a character or other lexical unit of text, is recognized and converted into a code in a desired format. The recognized characters or other objects can then be displayed, edited, or otherwise manipulated using an application software program running on the computer.
There are several ways to display a recognized, coded object. A raster output device, such as a laser printer or computer monitor, typically requires a bitmap of the coded object which can be inserted into a pixel map for display on a printer or display screen. A raster output device creates an image by displaying an array of pixels arranged in rows and columns from the pixel map. One way to provide the bitmap of the coded object is to store an output bitmap in memory for each possible code. For example, for codes that represent characters in fonts, a bitmap can be associated with each character in the font and for each size of the font that might be needed. The character codes and font size are used to access the bitmaps. However, this method is very inefficient in that it tends to require a large amount of peripheral and main storage. Another method is to use a “character outline” associated with each character code and to render a bitmap of a character from the character outline and other character information, such as size. The character outline can specify the shape of the character and requires much less memory storage space than the multitude of bitmaps representing many sizes. A commonly-used language to render bitmaps from character outlines is the PostScript® language by Adobe Systems, Inc. of Mountain View, Calif. Character outlines can be described in standard formats, such as the Type 1® format by Adobe Systems, Inc.
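The storage cost of keeping one bitmap per character per size can be estimated with a rough calculation. The sketch below is only an illustration under stated assumptions: 1 bit per pixel, square glyph cells, rows padded to whole bytes, 95 printable ASCII characters, and a 300 dpi device. None of these figures appear in the patent.

```python
def bitmap_font_bytes(num_chars, pixel_sizes):
    """Bytes needed to pre-store a 1-bit bitmap per character per size.

    Assumes each glyph occupies a square cell of size x size pixels,
    with each row padded to a whole number of bytes.
    """
    total = 0
    for size in pixel_sizes:
        bytes_per_row = (size + 7) // 8
        total += num_chars * bytes_per_row * size
    return total


# Assumed figures: 8-, 10-, 12-, and 24-point text on a 300 dpi device,
# converted from points to pixels (1 point = 1/72 inch).
sizes_px = [round(points * 300 / 72) for points in (8, 10, 12, 24)]

print(sizes_px)
print(bitmap_font_bytes(95, sizes_px))
```

Each additional size or typeface adds another full set of bitmaps, whereas an outline is stored once per character and rendered at whatever size is requested.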
OCR processes are limited by, among other things, the accuracy of the digitized image provided to the computer system. The digitizing device (such as a scanner) may distort or add noise to the bitmap that it creates. In addition, OCR processes do not perfectly recognize bitmap images, particularly if they are of low resolution or are otherwise of low quality. For example, a recognizer might misread ambiguous characters, characters that are spaced too closely together, or characters of a font for which it had no information.
Imperfect recognition can present problems both at the time of editing a recognized image and when printing or displaying the image. Misrecognized images may be printed incorrectly, and images that are not recognized at all may not be printed at all, or may be printed as some arbitrary error image. This reduces the value of the OCR process, since the recognized document may require substantial editing.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus for creating a hybrid data structure describing recognized and unrecognized objects. The invention is applicable to recognizing text or other objects from a bitmap provided by an optical scanner or other bitmap generator. Objects that are not recognized by the recognizer are stored and displayed using a portion of the original bitmap so that an apparently perfect recognized document is displayed.
The apparatus of the present invention includes a system for producing a raster image derived from a hybrid data structure including coded and non-coded portions from an input bitmap. The system includes a data processing apparatus and a recognizer for performing recognition on an input bitmap to detect identifiable objects within the bitmap. The system creates a hybrid data structure including coded portions derived from the identifiable objects. The hybrid data structure also includes non-coded portions derived from portions of the bitmap which do not correspond to the identifiable objects (non-identifiable objects). Finally, an output device, such as a printer, a plotter, or a computer display, develops a visually perceptible raster image derived from the hybrid data structure. The raster image includes newly-rendered raster images of the identifiable objects and scaled raster images of the non-identifiable objects. An input device, such as an optical scanner, a digital camera, or a bitmap generator, can be included to provide the input bitmap to the data processing apparatus.
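One way to picture such a hybrid data structure is as a list mixing coded entries with cropped pieces of the input bitmap. The Python sketch below is a minimal illustration, not the patented implementation: the class names, the `(x, y, width, height)` bounding box, the nearest-neighbor scaling, and the `describe_output` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Bitmap = List[List[int]]          # rows of 0/1 pixels
BBox = Tuple[int, int, int, int]  # hypothetical (x, y, width, height)


@dataclass
class CodedObject:
    """An identifiable object, stored as a code (here, plain text)."""
    text: str
    bbox: BBox


@dataclass
class NonCodedObject:
    """A non-identifiable object, stored as a piece of the input bitmap."""
    pixels: Bitmap
    bbox: BBox


HybridItem = Union[CodedObject, NonCodedObject]


def crop(bitmap: Bitmap, bbox: BBox) -> Bitmap:
    """Copy the region of the input bitmap covered by bbox."""
    x, y, w, h = bbox
    return [row[x:x + w] for row in bitmap[y:y + h]]


def scale_nearest(pixels: Bitmap, factor: int) -> Bitmap:
    """Scale a bitmap by an integer factor using pixel replication."""
    out = []
    for row in pixels:
        scaled_row = [p for p in row for _ in range(factor)]
        out.extend(list(scaled_row) for _ in range(factor))
    return out


def describe_output(items: List[HybridItem], factor: int):
    """List what would be sent to the raster device for each item."""
    plan = []
    for item in items:
        if isinstance(item, CodedObject):
            # Coded data is rendered anew (e.g. from a character outline).
            plan.append(("render", item.text))
        else:
            # Non-coded data is shown as a scaled copy of the original pixels.
            plan.append(("scale", scale_nearest(item.pixels, factor)))
    return plan


page = [[0, 1, 1, 0],
        [1, 0, 0, 1]]
items = [
    CodedObject("hybrid", (0, 0, 2, 2)),
    NonCodedObject(crop(page, (2, 0, 2, 2)), (2, 0, 2, 2)),
]
plan = describe_output(items, 2)
```

Keeping the bounding box on both kinds of entry lets the renderer place re-rendered text and scaled bitmap fragments at consistent positions on the page.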
The system preferably performs geometric correction to the input bitmap, which includes creating a distortion map of the bitmap and creating a layout correction transform from the distortion map and the bitmap. The identifiable objects of the hybrid data structure preferably include codes for recognized lexical units such as characters and words comprising the characters. The non-identifiable objects preferably correspond to unrecognized words which fall below a recognition threshold confidence level. Non-coded data is added to the hybrid data structure for the non-identifiable objects. The recognizer compares each of the identifiable objects with the portion of the input bitmap corresponding to the identifiable object to make size adjustments to the identifiable object if appropriate. The system preferably measures font attributes of the lexical units and assigns a typeface to each of the lexical units.
The present invention further includes a method for producing a hybrid data structure from a bitmap of an image. The bitmap includes identifiable objects and non-identifiable objects. The method, implemented on a digital processor, inputs a signal including a bitmap of an image and partitions the bitmap into a hierarchical structure of lexical units. Labels are assigned to a label list for each lexical unit of a predetermined hierarchical level, where each label in the label list has an associated confidence level. If a label in the label list for a lexical unit has a confidence level greater than a threshold confidence level, then that lexical unit is considered identifiable and is stored in a hybrid data structure as coded data. If no label in the lexical unit's l
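The confidence test on a label list can be sketched in Python as follows. This is an illustrative reading of the summary, not the patented method: the threshold value of 0.9, the choice of the highest-confidence label when several exceed the threshold, and the names `best_label` and `classify_units` are assumptions. Storing a unit as non-coded data when no label qualifies follows the earlier statement that non-coded data is added for words falling below the recognition threshold.

```python
from typing import List, Optional, Tuple

Label = Tuple[str, float]  # (candidate label, confidence in [0, 1])


def best_label(label_list: List[Label], threshold: float) -> Optional[str]:
    """Return the highest-confidence label above the threshold, else None."""
    if not label_list:
        return None
    text, confidence = max(label_list, key=lambda label: label[1])
    return text if confidence > threshold else None


def classify_units(units, threshold: float = 0.9):
    """Split lexical units into coded and non-coded entries.

    Each unit is a (region, label_list) pair, where region stands in for
    the part of the input bitmap that the lexical unit came from.
    """
    hybrid = []
    for region, label_list in units:
        text = best_label(label_list, threshold)
        if text is not None:
            hybrid.append(("coded", text))        # identifiable
        else:
            hybrid.append(("non-coded", region))  # keep original pixels
    return hybrid


units = [
    ("region-0", [("hello", 0.97), ("hallo", 0.40)]),
    ("region-1", [("wor1d", 0.55), ("world", 0.60)]),
]
print(classify_units(units))
```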
King James C.
Nicholson Dennis G.
Adobe Systems Incorporated
Alavi Amir
Johns Andrew W.