Incremental algorithms for optimal linebreaking in text layout

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C715S252000, C715S252000

Reexamination Certificate

active

06647533

ABSTRACT:

This invention relates to document processing systems, and in particular, to computer-implemented methods for updating breakpoints in a paragraph following a change to the paragraph.
BACKGROUND
When we type text into a word processor, we take it for granted that at some point, when the insertion point becomes too close to the right margin, the line will break and any additional text we type will appear on the following line. This typically results in one or more paragraphs, each of which is broken into one or more lines defined by breakpoints marking the beginning and ending of each line. One common problem that arises in word processors and other document processing systems is determining where in the paragraph to place these breakpoints.
Conventional line breaking algorithms, often referred to as “greedy algorithms,” seek to pack as much text as possible into a line without having the line exceed the maximum line length. Because of their relative simplicity, these algorithms have the advantage of being easy to implement and quick to execute. They are therefore particularly adapted to real-time editing with conventional interactive word processors, in which prompt response to user input is essential.
A disadvantage of such greedy algorithms, however, is that they optimize only one line at a time. These algorithms are generally incapable of considering how the position of a breakpoint on a particular line may affect the lengths of all other lines in the document (hence the term “greedy”). For example, it may be the case that by placing slightly less text on one line, many other lines can be made to more closely approach the desired line length. A greedy algorithm, because it optimizes only the current line, does not recognize this. As a result, the breakpoints selected by a greedy algorithm can result in paragraphs having a decidedly unattractive appearance, particularly when the paragraphs are to be set in narrow columns. For example, the paragraph may have lines that deviate significantly from the desired line lengths. In some cases, the last line of a paragraph may have only a single word. In the case of justified paragraphs, certain lines may have excessively large gaps between words.
Because of its ability to store the entire paragraph in memory, a computer need not be confined to considering only one line at a time. The presence of the entire paragraph in memory can, in principle, enable the computer to examine the entire paragraph before committing to a particular set of linebreaks.
As a result, the computer should, in principle, be able to consider the effect of a line break on the overall appearance of the paragraph. In some cases, a computer might refrain from packing as many words as possible into a line in order to generate a more aesthetically pleasing paragraph. For example, a short preposition might easily fit on the first line of a paragraph. However, this might result in the last line of the paragraph having only one word.
This ability to consider the entire paragraph at once before committing to any linebreaks gives rise to a second class of algorithms in which breakpoints are selected on the basis of a global parameter that measures the effect of the entire set of breakpoints on the paragraph as a whole. The leading algorithm of this type, which is referred to hereafter as the KP algorithm, is described in Knuth and Plass, “Breaking Paragraphs into Lines,” Software-Practice and Experience, 11:1119-1184 (1981), the teachings of which are herein incorporated by reference.
Throughout the specification, it will be necessary to refer to directions and locations within a paragraph. In keeping with the preferred direction for reading and writing English text, a paragraph is considered to begin at its left-most end and to end at its right-most end. The upstream direction is the direction towards the beginning of the paragraph; and the downstream direction is the direction towards the end of the paragraph. The adjectives “earlier” and “later” are used in the specification to refer to the locations that are upstream or downstream from other locations respectively. The method of the invention, however, does not depend on these definitions and can be applied to a paragraph that begins at its rightmost end and ends at its leftmost end.
Referring to
FIG. 1
, the KP algorithm considers a paragraph
10
to have a set of legal breakpoints
12
, each of which has a cost
14
associated with it. Some of these legal breakpoints are “feasible-breakpoints.” A feasible breakpoint is a legal breakpoint that results in a line having a length that is within a predefined tolerance of a target line length. From this set of legal breakpoints
12
, the KP algorithm selects a first set of feasible breakpoints
16
for the first line, as shown in FIG.
2
.
Each feasible breakpoint in this set of feasible breakpoints for the first line generates a set of feasible breakpoints for the second line. For example, if the first line breaks at b
1
, the second line can break at either b
3
or b
4
. A break earlier than b
3
will result in a second line that is too short relative to a desired line length, whereas a break later than b
4
will result in a second line that is too long relative to the desired line length. If, on the other hand, the first line breaks at b
2
instead of at b
1
, breaking the second line at b
3
results in a second line that is too short. A second line break following b
4
results in a second line that is too long. Hence, the only feasible break for the second line is at b
4
.
It is readily apparent that the above procedure continues until the end of the paragraph, with the feasible breakpoints for any line being determined by the selected feasible breakpoint for the immediately preceding line. Each feasible breakpoint for the first line thus generates a finite set of feasible breakpoint sequences for all subsequent lines. By adding the costs associated with each feasible breakpoint in a sequence of breakpoints, one can obtain a cumulative cost, for each of these feasible breakpoint sequences. The KP algorithm is an efficient way to select the feasible breakpoint sequence having the lowest such cost.
Although the KP algorithm is efficient, it is apparent that if one were to change a paragraph, for example by inserting or deleting text, the algorithm would be forced to regenerate all the feasible breakpoint sequences in the resulting changed paragraph. As a result, application of the KP algorithm to real-time editing, as it is commonly performed in modern word processors, results in undesirable delays caused by the need to re-evaluate all possible breaks in the changed paragraph following each insertion or deletion.
It is thus desirable in the art to provide a globally optimizing linebreaking algorithm that meets the stringent performance standards associated with real-time editing in a modern word processor.
SUMMARY
The method of the invention presupposes that a particular paragraph has been operated upon by a dynamic programming line breaking algorithm such as the KP algorithm described above. As shown in
FIGS. 3 and 4
, the operation of the KP algorithm generates a network of connecting arcs
18
from one feasible breakpoint
16
to the next, as well as costs associated with each arc. This network of connecting arcs, and their associated costs, will be collectively referred to as “auxiliary information.”
Although the ordinary meaning of a “paragraph” is that of a distinct division of a written work, usually marked by beginning on a new and indented line, as used herein, a “paragraph” refers to an ordered sequence of items which is to be divided into an ordered set of subsequences called “lines.” Hence, by this definition, an entire book can be considered as a single paragraph. The items themselves need not be alphanumeric characters. For example, the items might be frames in a comic strip or musical symbols in a composition. The method of the invention is sufficiently general to apply to any ordered sequence of items that is to be divided into a similarly ordered set

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Incremental algorithms for optimal linebreaking in text layout does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Incremental algorithms for optimal linebreaking in text layout, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Incremental algorithms for optimal linebreaking in text layout will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3171531

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.