Dynamic Programming - Line Breaking
The Problem
Suppose you have an article which contains words , where consists of characters. The line breaking problem asks you to partition the words into lines, with each two adjacent words in the same line separated by a space, and there is no space after the last word of each line. In addition, we have a max line length , so if in your partition a line consists of , the following inequality must be satisfied:
We use to denot the left part of this inequality. The difference between the left-hand side and the right-hand side is called slack. Now we further assume you are using a fixed-width font and ignore punctucation prolems. Give an algorithm to partition the words into lines with the sum of squares of slacks of each line as small as possible.
My Solution
Remember that when solving a problem using dynamic programming, a good start is to define a set of subproblems of the original problem. We first observe that for some , must belong to the last line. This give us the subproblem . Let denote the minimum value we can obtain by optimally arranging the first words into lines, with the length of each line not greater than . For each , we try all the possible cases that may form a line for . If we use to denote the square of the slack when using words to to form a line, that is
we have the following recurrence:
The initial case is trivial, . The pesudo-code would be
1 opt[1] = S(1, 1)
2 for j = 2, 3, ..., n
3 opt[j] = INFINITE
4 for i = j, j-1, ..., 1
5 if Len(i, j) <= L
6 opt[j] = min(opt[j], S(i, j) + opt[i])
7 else
8 break
9
10 return opt[n]
A further note, since is defined in terms of , we could calculate for each and in advance in time and use it as needed. You may wonder how could we know which word belongs to which line? That’s not a problem, we can trace back the opt[i] array and calculate the line boundaries along the way using the same logic as above:
1 print-solution(n)
2 minSquareOfSlack = INIFINITE
3 for i = n, n-1, ..., 1
4 if Len(i, n) <= L
5 minSquareOfSlack = min(minSquareOfSlack, S(i, n) + opt[i])
6 else
7 break
8
9 print "word i through n form a line"
10 print-solution(i-1)