Image analysis – Image segmentation – Distinguishing text from other regions
Reexamination Certificate
2000-01-19
2004-08-03
Wu, Jingge (Department: 2623)
C382S164000
active
06771816
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to computer-implemented methods and apparatus for displaying text images.
A text document, such as one on a piece of paper, can be converted into a digital representation by digitization. A digital representation of a document can be divided into lexical units such as characters or words and each unit can be represented in a coded or noncoded representation.
A coded representation of text is character based; that is, it is a representation in which the text is represented as recognized characters. The characters are typically represented by character codes, such as codes defined by the ASCII or Unicode character encoding standards, but may also be represented by character names. The universe of characters in any particular context can include, for example, letters, numerals, phonetic symbols, ideographs, punctuation marks, diacritics, mathematical symbols, technical symbols, arrows, dingbats, and so on. A character is an abstract entity that has no inherent appearance. How a character is represented visually (e.g., as a glyph on a screen or a piece of paper) is generally defined by a font defining a particular typeface. In digital or computer-based typography applications, a digital font, such as any of the PostScript™ fonts available from Adobe Systems Incorporated of San Jose, Calif., generally includes instructions (commonly read and interpreted by rendering programs executing on computer processors) for rendering characters in a particular typeface. A coded representation can also be referred to as a character-based representation.
A noncoded representation of text is a primitive representation in which the text is represented as an image, not as recognized characters. A noncoded representation of text may include an array of picture elements (“pixels”), such as a bitmap. In a bitmap, each pixel is represented by one binary digit or bit in a raster. A pixel map (or “pixmap”) is a raster representation in which each pixel is represented by more than one bit.
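The difference between a bitmap and a pixmap can be illustrated with a minimal Python sketch. The function name pack_bitmap_row, the most-significant-bit-first packing order, and the sample pixel values are assumptions made for illustration only; they are not taken from the specification.

```python
def pack_bitmap_row(bits):
    """Pack a row of 0/1 pixel values into bytes, 8 pixels per byte, MSB first."""
    packed = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)

# Bitmap: each pixel is a single binary digit (1 = ink, 0 = no ink, by assumption).
bitmap_row = [0, 1, 1, 0, 0, 0, 0, 1, 1]
packed_row = pack_bitmap_row(bitmap_row)

# Pixmap: each pixel carries more than one bit, here a 24-bit (R, G, B) triple.
pixmap_row = [(255, 255, 255), (12, 10, 9), (14, 11, 8)]
bits_per_pixmap_pixel = 8 * len(pixmap_row[0])
```

Nine bitmap pixels fit in two bytes, whereas each pixmap pixel in this sketch needs three bytes on its own.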
Digitization of an image generally results in a primitive representation, typically a bitmap or pixel map. If the image contains text, the primitive representation can be interpreted and converted to a higher-level coded format such as ASCII through use of an optical character recognition (OCR) program. A confidence-based recognition system—such as the one described in commonly-owned U.S. Pat. No. 5,729,637 (the '637 patent), which is incorporated by reference herein—processes an image of text, recognizes portions of the image as characters, words and/or other units, and generates coded representations for any recognized units in the image. Some units may be recognized only with a low level of confidence or not recognized at all. When the image is displayed, low-confidence units may be displayed in their original image form, while those recognized with sufficiently high confidence are displayed as rendered bitmaps derived from their coded representations.
A digital representation of an image including both coded and noncoded units can be displayed on a raster output device such as a computer display or printer. This type of display, i.e., one containing both portions of the original bitmap or pixel map and rendered bitmaps, will be referred to as a hybrid display. The coded units are rendered (i.e., rasterized), which may be accomplished in a variety of ways, such as by retrieving an output bitmap stored in memory for a code or by computing an output bitmap according to a vector description associated with a code. The result will be referred to as rasterized text. The noncoded units are displayed in their original image form, which will be referred to as a text pixmap. Typically, whole words are represented either as rasterized text or as a text pixmap for display on raster output devices.
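As a rough illustration of how a hybrid display might choose between the two forms, the following Python sketch selects a display form for each lexical unit from its recognition confidence. The LexicalUnit class, the display_form function, and the 0.9 threshold are hypothetical; they are not drawn from the '637 patent or from this specification.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LexicalUnit:
    pixels: List[List[int]]   # original image form of the unit
    code: Optional[str]       # recognized text, or None if nothing was recognized
    confidence: float         # recognition confidence in [0, 1]

def display_form(unit: LexicalUnit, threshold: float = 0.9) -> str:
    """Pick how a unit is shown: rendered from its code, or as its original image."""
    if unit.code is not None and unit.confidence >= threshold:
        return "rasterized text"
    return "text pixmap"

units = [
    LexicalUnit([[0]], "hybrid", 0.98),    # recognized with high confidence
    LexicalUnit([[0]], "disp1ay", 0.41),   # low-confidence recognition
    LexicalUnit([[0]], None, 0.0),         # not recognized at all
]
plan = [display_form(u) for u in units]
```

In this sketch only the first unit would be rendered from its coded representation; the other two would be displayed in their original image form.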
Text pixmaps often exhibit color variation effects that result from improper color registration during digitization. This color variation may appear as “edge effects” (fringes or halos around the edges of characters when a text pixmap is displayed), as shown in FIG. 3A, and may not be aesthetically pleasing. Such color variation effects may be especially noticeable when a text pixmap is displayed with rasterized text, which typically does not exhibit such effects.
Text pixmaps may also exhibit a “ghosting” effect when displayed, resulting from the local background on which a text pixmap is typically displayed, as shown in FIG. 4A. When a text pixmap is to be displayed against a global background, such as where the pixmap is to be displayed with rasterized text in a hybrid display, this local background may not match the global background of the hybrid display on which the text pixmap and rasterized text are to be rendered. For example, the global background of a hybrid display may be assigned a color that is uniform over the entire global background. By contrast, the color of the local background of the text pixmap may vary over the pixmap. When the text pixmap is displayed against the global background of the hybrid display, ghosting may appear as a result of the color mismatch between the local background of the pixmap and the global background. As shown in FIG. 4A, this ghosting effect can be quite noticeable and can be aesthetically unpleasant.
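The mismatch behind ghosting can be reproduced with a small Python sketch using grayscale values. The paste helper, the canvas size, and all pixel values are illustrative assumptions, not part of the specification.

```python
GLOBAL_BACKGROUND = 255  # uniform background of the hybrid display (assumed value)
TEXT_VALUE = 20          # dark text pixel (assumed value)

def paste(canvas, pixmap, top, left):
    """Copy every pixel of the pixmap onto the canvas, local background included."""
    for dy, row in enumerate(pixmap):
        for dx, value in enumerate(row):
            canvas[top + dy][left + dx] = value

canvas = [[GLOBAL_BACKGROUND] * 5 for _ in range(3)]
text_pixmap = [[236, TEXT_VALUE, 241]]  # text pixel flanked by uneven off-white background
paste(canvas, text_pixmap, top=1, left=1)

# Local-background pixels that no longer match the global background.
mismatched = [
    (y, x)
    for y, row in enumerate(canvas)
    for x, value in enumerate(row)
    if value not in (GLOBAL_BACKGROUND, TEXT_VALUE)
]
```

The two mismatched positions sit on either side of the text pixel; at display scale, such pixels would form the visible box or halo around the text.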
SUMMARY OF THE INVENTION
In general, in one aspect, the invention provides a method and apparatus, including computer program apparatus, implementing techniques for processing an image including one or more lexical units each representing a unit of text, in which each lexical unit is defined by a number of image pixels and a number of background pixels. The techniques include generating at least one text mask distinguishing between at least the text pixels and the background pixels of at least one lexical unit in the image and storing the text mask in an electronic document representing the image.
Implementations of the invention include one or more of the following features. The text mask can include a raster representation of the pixels of at least one lexical unit, including a first set of pixels corresponding to the text pixels of the lexical unit and a second set of pixels corresponding to at least the background pixels of the lexical unit. Generating the text mask can include assigning the first set of pixels in the raster representation a first pixel value and assigning the second set of pixels in the raster representation a second pixel value. Generating the text mask can include reversing the pixel value of each pixel in the raster representation. Generating the text mask can include assigning an intermediate pixel value to pixels in the raster representation at a boundary between the first and second set of pixels in the raster representation. The text mask can include a vector representation of the text. If the image includes more than one page, a separate text mask may be stored for each page of the image. If the image includes lexical units representing units of text in more than one color, a separate text mask may be stored for the lexical units representing units of text in each color. If the image includes more than one lexical unit, a separate text mask may be stored for each lexical unit. The text mask can be stored in a hybrid data structure. The electronic document can include a page description language representation of at least one lexical unit in the image. Generating the text mask can include identifying the text pixels in an OCR process.
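Several of the raster features listed above can be sketched in Python as follows. This is a minimal illustration, not the claimed method: the fixed grayscale threshold stands in for whatever process (such as OCR) identifies the text pixels, and the function names, the pixel values 1.0, 0.0 and 0.5, and the 4-neighbor boundary rule are all assumptions.

```python
TEXT, BACKGROUND, EDGE = 1.0, 0.0, 0.5  # illustrative first, second, intermediate values

def generate_text_mask(gray, threshold=128):
    """Assign TEXT to pixels darker than the threshold and BACKGROUND to the rest."""
    return [[TEXT if p < threshold else BACKGROUND for p in row] for row in gray]

def reverse_mask(mask):
    """Reverse the pixel value of each pixel in the raster mask."""
    return [[1.0 - v for v in row] for row in mask]

def soften_boundary(mask):
    """Give EDGE to background pixels that touch a text pixel (4-neighborhood)."""
    height, width = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != BACKGROUND:
                continue
            neighbors = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if any(0 <= ny < height and 0 <= nx < width and mask[ny][nx] == TEXT
                   for ny, nx in neighbors):
                out[y][x] = EDGE
    return out

gray_unit = [[250, 240, 30, 245, 250]]  # one dark text pixel on a light background
mask = generate_text_mask(gray_unit)
reversed_mask = reverse_mask(mask)
soft_mask = soften_boundary(mask)
```

Here the boundary rule marks background pixels adjacent to text; an implementation could equally mark the outermost text pixels or use a wider neighborhood.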
In general, in another aspect, the invention provides a method and apparatus, including computer program apparatus, implementing techniques for processing an image including one or more lexical units each representing a unit of text, each lexical unit being defined by a plurality of image pixels and a plurality of background pixels. The techniques include generating at least one text mask distinguishing between at least the text pixels and the background pixels of at least one lexical unit in the image and using the text mask to generate a representation of th
Adobe Systems Incorporated
Fish & Richardson P.C.
Wu Jingge