DP O(n^2) + MIT OCW solution explanation


  • 26
    K

    The idea is that this is a minimax game, and if you went to MIT and took 6.046 then you would have seen something similar to this problem in class. And thanks to MIT OCW everyone can see the explanation

    The DP solution

    class Solution {
    public:
        bool PredictTheWinner(vector<int>& nums) {
            if(nums.size()% 2 == 0) return true;
            
            int n = nums.size();
            vector<vector<int>> dp(n, vector<int>(n, -1));
            
            int myBest = utill(nums, dp, 0, n-1);
            return 2*myBest >= accumulate(nums.begin(), nums.end(), 0);
        }
        
        int utill(vector<int>& v, vector<vector<int>> &dp, int i, int j){
            if(i > j) return 0;
            if(dp[i][j] != -1) return dp[i][j];
            
            int a = v[i] + min(utill(v,dp, i+1, j-1), utill(v, dp, i+2, j));
            int b = v[j] + min(utill(v,dp,i, j-2), utill(v,dp, i+1, j-1));
            dp[i][j] = max(a, b);
                            
            return dp[i][j];
        }
    };

  • 0
    S
    This post is deleted!

  • 32
    Z

    Thanks for sharing your solution, and the video in MIT OCW is really helpful.

    I try to summarize what the video tell us.

    The goal here is that we want to maximize the amount of money we can get assuming we move first.

    Here your dp[i][j] means the max value we can get if it it's our turn and only coins between i and j remain.(Inclusively)

    So dp[i][i] means there is only one coin left, we just pick it, dp[i][i+1] means there are only two left, we then pick the max one.

    Now we want to derive the more general case. dp[i][j] = max( something + v[i], something + v[j]), since we either will pick the i or j coin. The problem now turns to what "something" here will be.

    A quick idea may bedp[i][j] = max( dp[i + 1][j] + v[i], dp[i][j - 1] + v[j]), but here dp[i + 1][j] and dp[i][j - 1] are not the values directly available for us, it depends on the move that our opponent make.

    Then we assume our opponent is as good as we are and always make optimize move. The worse case is that we will get the minimal value out of all possible situation after our opponent make its move.

    so the correct dp formula would be dp[i][j] = max( min (dp[i + 1][j - 1], dp[i + 2][ j]) + v[i], min (dp[i][j - 2], dp[i + 1][ j - 1]) + v[j]}) .
    If we pick i, then our opponent need to deal with subproblem dp[i + 1][j], it either pick from i + 2 or j - 1. Similarly, If we pick j, then our opponent need to deal with subproblem dp[i][j - 1] , it either pick from i + 1 or j - 2. We take the worse case into consideration so use min() here.

    Hope this is helpful, and point out if I make any mistake :-)


  • 0
    K

    @zzwcsong
    Thanks!!! Such a great response.
    Thanks for taking the time to summarize the video for everyone


  • 2
    B

    Your MIT OCW O(n) solution does not work.

    The professor in the video only described a O(n) strategy for solving the problem when the size of the problem was even. And for good reason. If the size of the problem is odd, then the question is harder.

    An odd-size problem could be decomposed this way. Either the first player picks a[0] and second player must gain at least a[0] more than the first player in the subproblem s[1...n], or , first player picks a[n] and second player must gain at least a[n] more than the first player in the subproblem s[0..n-1]. If in either case the second player cannot achieve this then first player wins.

    Important to note is that second player, in both cases, must solve the problem of whether or not it is possible for him to earn at least k points more than the first player. This O(n) strategy to meet/break even is not sufficient. This new problem isn't quite as strict as the optimization problem that the dp algorithm solves, but I do not know of any strategy to solve this otherwise.

    [0,0,7,6,5,6,1] is an example of a test case that could fail.


  • 0
    K

    @brendon4565
    Thanks for pointing that out, have made appropriate changes


  • 3
    R

    The best explanation by far is from the video. Thanks for sharing! Here's a Python translation:

    class Solution(object):
        def PredictTheWinner(self, nums):
            def helper(v, dp, i, j):
                if i > j: 
                    return 0
                if dp[i][j] != -1:
                    return dp[i][j]
                
                a = v[i] + min(helper(v,dp,i+1, j-1), helper(v, dp, i+2,j))
                b = v[j] + min(helper(v,dp,i,j-2), helper(v,dp,i+1, j-1))
                dp[i][j] = max(a,b)
                return dp[i][j]
            
            if len(nums) % 2 == 0:
                return True
            
            n = len(nums)
            dp = [[-1] * n for _ in xrange(n)]
            myBest = helper(nums, dp, 0, n-1)
            return 2*myBest >= sum(nums)
    

    Java version:

    import java.util.stream.*;
    
    public class Solution {
        public boolean PredictTheWinner(int[] nums) {
    
            if(nums.length % 2 == 0) return true;
            
            int n = nums.length;
            int[][] dp = new int[n][n];
            for(int i = 0; i < dp.length; i++)
                Arrays.fill(dp[i], -1);
            
            int myBest = utill(nums, dp, 0, n-1);
            
            return 2*myBest >= IntStream.of(nums).sum();
        }
        
        public int utill(int[] v, int[][] dp, int i, int j){
            if(i > j) return 0;
            if(dp[i][j] != -1) return dp[i][j];
            
            int a = v[i] + Math.min(utill(v,dp, i+1, j-1), utill(v, dp, i+2, j));
            int b = v[j] + Math.min(utill(v,dp,i, j-2), utill(v,dp, i+1, j-1));
            dp[i][j] = Math.max(a, b);
                            
            return dp[i][j];
        }
    
    }
    

    Javascript:

    var PredictTheWinner = function(nums) {
        if(nums.length % 2 == 0) return true;
            
            var n = nums.length;
            
            var dp = new Array(n);
            for (var i = 0; i < n; i++) {
              dp[i] = new Array(n);
              dp[i].fill(-1)
            }
            
            
            var myBest = utill(nums, dp, 0, n-1);
            
            return 2*myBest >= nums.reduce(function(a, b){return a+b;});
    };
    
    var utill = function(v, dp, i, j){
            if(i > j) return 0;
            if(dp[i][j] != -1) return dp[i][j];
            
            var a = v[i] + Math.min(utill(v,dp, i+1, j-1), utill(v, dp, i+2, j));
            var b = v[j] + Math.min(utill(v,dp,i, j-2), utill(v,dp, i+1, j-1));
            dp[i][j] = Math.max(a, b);
                            
            return dp[i][j];
    };
    

    Golang, a little messy:

    func PredictTheWinner(nums []int) bool {
        if len(nums) % 2 == 0 { return true}
            
            n := len(nums)
            dp := make([][]int, n)
            for i, _ := range dp {
              dp[i] = make([]int, n)
              for a, _:= range dp[i] {
                  dp[i][a] = -1
              }
            }
    
            myBest := helper(nums, dp, 0, n-1)
            
            sums := 0
            for k,_ := range nums {
                sums += nums[k]
            }
            
            return 2*myBest >= sums
    }
    
    func helper (v []int, dp [][]int, i int, j int) int {
            if i > j { return 0 }
            if dp[i][j] != -1 { return dp[i][j] }
            
            a:= v[i] + helper(v,dp, i+1, j-1)
            if a > v[i] + helper(v, dp, i+2, j) {
                a = v[i] + helper(v, dp, i+2, j)
            }
            
            b:= v[j] + helper(v,dp,i, j-2)
            if b > v[j] + helper(v,dp, i+1, j-1) {
                b = v[j] + helper(v,dp, i+1, j-1)
            }
            
            dp[i][j] = a
            if b > a {
                dp[i][j] = b
            }
    
            return dp[i][j];
    }
    

  • 0
    S

    Sorry I'm confused. It is true that the total number of subproblems are O(N^2) here, but in order to make sure the complexity is right we have to make sure each subproblem is solved only constant times. How do we prove that's the case here?


  • 0
    K

    @seekerwu70
    Yes there are at most n^2 sub-problems.

    Since solving each sub-problem consists of a max of min's of four look ups it takes constant time.

    Since we cache the solution to each of the sub-problems in 2D-array dp we only solve each sub-problem once, hence the total run time is O(n^2).


  • 1
    S

    We're solving each sub problem once but we might still access each (i,j) more than constant times right?


  • 0
    K

    @seekerwu70
    But we only access each (i,j) at most 4 times.


  • 0
    S

    I think if you start with (0,10) you will access (5,5) more than 4 times. Why do you think the bound is 4 times?

    Okay maybe this is a bad example since some cases can be cached without going down the tree further. But how do we prove this constant bound?


  • 0
    K

    @seekerwu70
    just look at how we solve each sub-problem. Remove two elements, there are three ways to do this(one is used twice), we cache these answers so we never need to go down the full recursion tree again.

    Just think about DP in general if you have O( f(n) ) sub-problems and each sub-problem take O(g(n)) to solve then the total running time is O( f(n)*g(n) )

    In our case we have O(n^2) sub problems and each one take O(1) time to solve hence the total running time is O(n^2)


  • 0
    F

    Thanks a lot for the video. It helped more than any explanation I read anywhere. Cheers.


  • 0
    W

    why when nums.size() is even you directly returns True?


  • 7
    K

    @whglamrock
    Because if you look at the sum of all the even element S_even and the sum of all the odd elements S_odd, then compare the two sums to decide which elements to take, i.e you can force your opponent to either take all of the even elements or all of the odd element. Hence you can always beet your opponent.


  • 2
    K

    I wrote an iterative version of your solution. More elegant dp although doesn't change the runtime.

    class Solution {
    public:
        bool PredictTheWinner(vector<int>& nums) {
            if(nums.size()% 2 == 0) return true;
            int n = nums.size();
            vector<vector<int>> dp(n, vector<int>(n, 0));
            for(int j = 0; j < n; ++j) {
                dp[j][j] = nums[j];
                for(int i = j-1; i >= 0; --i) {
                    int a = i+1 < n && j-1 >= 0 ? dp[i+1][j-1] : 0;
                    int b = i+2 < n ? dp[i+2][j] : 0;
                    int c = j-2 >= 0 ? dp[i][j-2] : 0;
                    dp[i][j] = max(nums[i]+min(a,b), nums[j]+min(a,c));
                }
            }
            return 2*dp[0][n-1] >= accumulate(nums.begin(), nums.end(), 0);
        }
    };```

  • 0
    I

    why when it's even number of integers, always return True?

    I'm guessing, since player1 makes the 1st move, he will leave the worst case (either v(1, n - 1) or v(0, n - 2)), to player2 ,so player 2 will never win? And if there's odd number of integers, player2 will have two choices at last, so player2 will have a chance to come back?


  • 0
    L

    @zzwcsong that is helpful thx very much


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.