Python, Straightforward DP with Explanation


  • 10

    Let dp(i, j) be the number of turns needed to print S[i:j+1].

    Note that whichever turn creates the final print of S[i], might as well be the first turn, and also there might as well only be one print, since any later prints on interval [i, k] could just be on [i+1, k].

    So suppose our first print was on [i, k]. We only need to consider prints where S[i] == S[k], because we could instead take our first turn by printing up to the last printed index where S[k] == S[i] to get the same result.

    Then, when trying to complete the printing of interval [i, k] (with S[i] == S[k]), the job will take the same number of turns as painting [i, k-1]. This is because it is always at least as good to print [i, k] first in one turn rather than separately.

    Also, we would need to complete [k+1, j]. So in total, our candidate answer is dp(i, k-1) + dp(k+1, j). Of course, when k == i, our candidate is 1 + dp(i+1, j): we paint S[i] in one turn, then paint the rest in dp(i+1, j) turns.

    def strangePrinter(self, S):
        memo = {}
        def dp(i, j):
            if i > j: return 0
            if (i, j) not in memo:
                ans = dp(i+1, j) + 1
                for k in xrange(i+1, j+1):
                    if S[k] == S[i]:
                        ans = min(ans, dp(i, k-1) + dp(k+1, j))
                memo[i, j] = ans
            return memo[i, j]
    
        return dp(0, len(S) - 1)
    

  • 0

    @awice
    I wonder why it's not correct for ans = min(ans, 1 + dp(i+1, k-1) + dp(k+1, j))


  • 1
    A

    Because you can have d[i, k-1] = d[i+1, k-1], but you assume that d[i, k-1] = 1+d[i+1, k-1]. Check this @awice 's argument as well:

    "Then, when trying to complete the printing of interval [i, k] (with S[i] == S[k]), the job will take the same number of turns as painting [i, k-1]. This is because it is always at least as good to print [i, k] first in one turn rather than separately."


  • 0
    B
    def strangePrinter(self, S):
        memo = {}
        def dp(i, j):
            if i > j: return 0
            if i == j: return 1
            if (i, j) not in memo:
                if S[i] == S[j]:
                    memo[i, j] = dp(i, j-1)
                else:
                    ans = j-i+1;
                    for k in xrange(i, j+1):
                        if S[k] == S[i]:
                            ans = min(ans, dp(i, k) + dp(k+1, j))
                    memo[i, j] = ans
            return memo[i, j]
    
        return dp(0, len(S) - 1)
    

    Here is a modified version of awice's solution with a proof that is hopefully easier to understand.

    algorithm idea:
    • do recursion and memoization.
    • dp[i][j] = min #prints to solve for range i to j inclusive.
    • base cases
    i > j -> 0
    i == j -> 1
    • recursive cases
    a[i] == a[j] -> dp[i][j-1]
    a[i] != a[j] -> min(dp[i][k] + dp[k+1][j]) for all k where i<=k<=j && a[i] == a[k]

    proof of
    a[i] == a[j] -> dp[i][j-1]:
    An optimal schedule exists where we first print a[i] onto the entire range [i,j].
    This is because it is always fine to print a[i] onto cell i first, but we might as well extend this print onto all other cells, since it might be more useful for those cells to start out with a[i] instead of a blank value.
    Therefore, it is fine to assume that we always first print a[i] onto the entire range [i,j].

    Because we can make this assumption, solve range [i,j] costs same as problem range[i,j-1], since satisfying cell j takes no additional prints.

    proof of
    a[i] != a[j] -> min(dp[i][k] + dp[k+1][j]) for all k where i<=k<=j && a[i] == a[k]:
    Lets pivot on the rightmost character of the first print that did not get overwritten.
    Denote the index of this character as k.

    Obviously, if cell k did not get overwritten, then a[k] == a[i]

    For given k, the structure of the schedule is to first print on the entire range [i,j], and then finish satisfying all constraints without printing on or across cell k.
    Note that the first print can also be on the range [i,k], since any character printed right of cell k will be overwritten (by definition).
    Therefore, the structure of this schedule is the same as solving range[i,k] and range[k+1,j] separately.

    Trying all candidate indexes k, and finding the minimum #prints given certain k, will lead us to the optimal answer.


  • 0
    Y

    @awice said in Python, Straightforward DP with Explanation:

    Let dp(i, j) be the number of turns needed to print S[i:j+1].

    why not Let dp(i, j) be the number of turns needed to print S[i:j]?


  • 0

    @yezilife It is more natural to work in exact intervals like [i, j]. So letting dp(i, j) = answer for S[i], S[i+1], ..., S[j] is easier for me than letting dp(i, j) = answer for S[i], S[i+1], ..., S[j-1].


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.