Relabelling of tokenized symbols in fontless structured...

Facsimile and static presentation processing – Static presentation processing – Detail of medium positioning

Reexamination Certificate


active

06529285

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to structured document representations and, more particularly, relates to structured document representations suitable for rendering into printable or displayable document raster images, such as bit-mapped binary images or other binary pixel or raster images. The invention further relates to data compression techniques suitable for document image rendering and transmission.
BACKGROUND OF THE INVENTION
Structured Document Representations
Structured document representations provide digital representations for documents that are organized at a higher, more abstract level than merely an array of pixels. As a simple example, if this page of text is represented in the memory of a computer or in a persistent storage medium such as a hard disk, CD-ROM, or the like as a bitmap, that is, as an array of 1s and 0s indicating black and white pixels, such a representation is considered to be an unstructured representation of the page. In contrast, if the page of text is represented by an ordered set of numeric codes, each code representing one character of text, such a representation is considered to have a modest degree of structure. If the page of text is represented by a set of expressions expressed in a page description language, so as to include information about the appropriate font for the text characters, the positions of the characters on the page, the sizes of the page margins, and so forth, such a representation is a structured representation with a great deal of structure.
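The three levels of structure described above can be sketched in a few lines of Python. This is only an illustration: the 5x5 glyph shapes, the dictionary layout, and the field names (`margins_pt`, `runs`, and so on) are assumptions made for this example and do not come from any actual page description language.

```python
# The same two-character text "HI" at three levels of structure.

# 1. Unstructured: a bitmap, an array of 1s (black) and 0s (white).
#    The glyph shapes are invented for this sketch.
rows = [
    "10001001110",
    "10001000100",
    "11111000100",
    "10001000100",
    "10001001110",
]
bitmap = [[int(c) for c in row] for row in rows]

# 2. Modest structure: an ordered set of numeric codes, one per character.
codes = [ord(ch) for ch in "HI"]

# 3. Highly structured: a hypothetical page description carrying font,
#    position, and margin information alongside the text.
page = {
    "margins_pt": {"left": 72, "right": 72, "top": 72, "bottom": 72},
    "runs": [
        {"font": "Times New Roman", "size_pt": 12,
         "x_pt": 72, "y_pt": 720, "text": "HI"},
    ],
}

for row in rows:
    print(row.replace("1", "#").replace("0", " "))
print(codes)
```

Only the third form says how the text should look; the first says nothing except how it looks.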
Known structured document representation techniques pose a tradeoff between the speed with which a document can be rendered and the expressiveness or subtlety with which it can be represented. This is shown schematically in FIG. 1 (PRIOR ART). As one looks from left to right along the continuum 1 illustrated in FIG. 1, the expressiveness of the representations increases, but the rendering speed decreases. Thus, ASCII (American Standard Code for Information Interchange), a purely textual representation without formatting information, renders quickly but lacks formatting information or other information about document structure, and is shown to the left of FIG. 1. Page description languages (PDLs), such as PostScript® (Adobe Systems, Inc., Mountain View, Calif.; Internet: http://www.adobe.com) and Interpress (Xerox Corporation, Stamford, Conn.; Internet: http://www.xerox.com), include a great deal of information about document structure, but require significantly more time to render than purely textual representations, and are shown to the right of continuum 1.
Continuum 1 can be seen as one of document representations having increasing degrees of document structure:
At the left end of continuum 1 are purely textual representations, such as ASCII. These convey only the characters of a textual document, with no information as to font, layout, or other page description information, much less any graphical, pictorial (e.g., photographic) or other information beyond text.
Also near the left end of continuum 1 is HTML (HyperText Markup Language), which is used to represent documents for the Internet's World Wide Web. HTML provides somewhat more flexibility than ASCII, in that it supports embedded graphics, images, audio and video recordings, and hypertext linking capabilities. However, HTML, too, lacks font and layout (i.e., actual document appearance) information. That is, an HTML document can be rendered (converted to a displayable or printable output) in different yet equally “correct” ways by different Web client (“browser”) programs or different computers, or even by the same Web client program running on the same computer at different times. For example, in many Web client programs, the line width of the rendered HTML document varies with the dimensions of the display window that the user has selected. Increase the window size, and line width increases accordingly. The HTML document does not, and cannot, specify the line width. HTML, then, does allow markup of the structure of the document, but not markup of the layout of the document. One can specify, for example, that a block of text is to be a first-level heading, but one cannot specify exactly the font, justification, or other attributes with which that first-level heading will be rendered. (Information on HTML is available on the Internet from the World Wide Web Consortium at http://www.w3.org/pub/WWW/MarkUp/.)
At the right end of continuum 1 are page description languages, such as PostScript and Interpress. These PDLs are full-featured programming languages that permit arbitrarily complex constructs for page layout, graphics, and other document attributes to be expressed in symbolic form.
In the middle of continuum 1 are printer control languages, such as PCL5 (Hewlett-Packard, Palo Alto, Calif.; Internet: http://www.hp.com/), which includes primitives for curve and character drawing.
Also in the middle of continuum 1, but somewhat closer to the PDLs, are cross-platform document exchange formats. These include Portable Document Format (Adobe Systems, Inc.) and Common Ground (Common Ground Software, Belmont, Calif.; Internet: http://www.commonground.com/). Portable Document Format, or PDF, can be used in conjunction with a software program called Adobe Acrobat™. PDF includes a rich set of drawing and rendering operations invocable by any given primitive (available primitives include “draw,” “fill,” “clip,” “text,” etc.), but does not include programming language constructs that would, for example, allow the specification of compositions of primitives.
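The point made above about HTML, that the same document can be rendered in different yet equally correct ways depending on the display window, can be imitated with Python's standard `textwrap` module. This is a loose analogy rather than a model of any real Web client: the `render` function and the column widths are assumptions chosen for the example.

```python
import textwrap

paragraph = ("HTML marks up the structure of a document, "
             "but the line width is chosen by the client at display time.")

def render(text, window_columns):
    """Break text into lines no wider than the given window width."""
    return textwrap.wrap(text, width=window_columns)

narrow = render(paragraph, 30)   # a small display window
wide = render(paragraph, 60)     # a larger display window

for label, lines in (("narrow", narrow), ("wide", wide)):
    print(f"--- {label}: {len(lines)} lines ---")
    for line in lines:
        print(line)
```

Both outputs contain exactly the same words in the same order; only the line breaks differ, and nothing in the source text determines where they fall.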
Known structured document representation techniques assume that the rendering engine (e.g., display driver software, printer PDL decomposition software, or other software or hardware for generating a pixel image from the structured document representation) has access to a set of character fonts. Thus a document represented in a PDL can, for example, have text that is to be printed in 12-point Times New Roman font with 18-point Arial Bold headers and footnotes in 10-point Courier. The rendering engine is presumed to have the requisite fonts already stored and available for use. That is, the document itself typically does not supply the font information. Therefore, if the rendering engine is called upon to render a document for which it does not have the necessary font or fonts available, the rendering engine will be unable to produce an authentic rendering of the document. For example, the rendering engine may substitute alternate fonts in lieu of those specified in the structured document representation, or, worse yet, may fail to render anything at all for those passages of the document for which fonts are unavailable.
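The three outcomes just described (the requested font is available, an alternate font is substituted, or nothing can be rendered) might be sketched as follows. The installed font set, the substitution table, and the `resolve_font` function are all hypothetical; real rendering engines apply their own, more elaborate rules.

```python
# Hypothetical set of fonts stored on the rendering engine.
INSTALLED_FONTS = {"Times New Roman", "Courier"}

# Hypothetical substitution table: requested font -> alternate font.
SUBSTITUTES = {"Arial Bold": "Courier"}

def resolve_font(requested):
    """Return (font_to_use, outcome) for a font named in the document."""
    if requested in INSTALLED_FONTS:
        return requested, "exact"
    fallback = SUBSTITUTES.get(requested)
    if fallback in INSTALLED_FONTS:
        # The passage is rendered, but not authentically.
        return fallback, "substituted"
    # No usable font: the passage cannot be rendered at all.
    return None, "unrenderable"

for name in ("Times New Roman", "Arial Bold", "Zapf Chancery"):
    print(name, "->", resolve_font(name))
```

In this sketch only the first case yields an authentic rendering, because the document names its fonts but does not carry them.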
The fundamental importance of fonts to PDLs is illustrated, for example, by the extensive discussion of fonts in the Adobe Systems, Inc. PostScript Language Reference Manual (2d ed. 1990) (hereinafter PostScript Manual). At page 266, the PostScript Manual says that a required entry in all base fonts, encoding, is an “[a]rray of names that maps character codes (integers) to character names—the values in the array.” Later, in Appendix E (pages 591-606), the PostScript Manual gives several examples of fonts and encoding vectors.
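An encoding of the kind quoted above, an array indexed by character code whose values are character names, can be approximated in Python as a 256-entry list. The sketch below is an illustration only and is not a copy of any encoding vector from the PostScript Manual; it fills in a handful of entries and leaves the rest as `.notdef`, the name PostScript uses for an undefined character. The helper `character_names` is a hypothetical name introduced here.

```python
# A 256-entry encoding vector: index = character code, value = character name.
encoding = [".notdef"] * 256

# Upper-case letters are named by the letters themselves.
for code in range(ord("A"), ord("Z") + 1):
    encoding[code] = chr(code)

# A couple of characters whose names differ from their printed form.
encoding[ord(" ")] = "space"
encoding[ord("1")] = "one"

def character_names(text):
    """Map each character code in text to its character name.

    Assumes every character code is below 256.
    """
    return [encoding[ord(ch)] for ch in text]

print(character_names("E 1"))
```

The code 69 maps to the name `E` regardless of which font supplies the glyph drawn for it; codes left unassigned in this sketch, such as the lower-case letters, map to `.notdef`.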
A notion basic to a font is that of labeling, or the semantic significance given to a particular character or symbol. Each character or symbol of a font has a unique associated semantic label. Labeling makes font substitution possible: Characters from different fonts having the same semantic label can be substituted for one another. For example, each of the characters 21, 22, 23, 24, 25, 26 in FIG. 2 (PRIOR ART) has the same semantic significance: Each represents the upper-case form of “E,” the fifth letter of the alphabet commonly used in English. However, each appears in a different font. It is apparent from the example of FIG. 2 that font substitution, even if performed for only a single character, can dramatically alter the appearance of the rendered image o
