Why DP is slower than DFS even with no memoization?


  • 2

    My first accepted solution was this:

    class Solution {
    public:
        bool canPartition(vector<int>& nums) {
            int all = accumulate(nums.begin(), nums.end(), 0);
            if (all % 2 != 0)
                return false;
            unordered_set<int> sums;
            sums.insert(0);
            for (int num : nums) {
                unordered_set<int> nextSums = sums;
                for (int sum : sums) {
                    if (sum + num == all - (sum + num))
                        return true;
                    else if (sum + num < all - (sum + num))
                        nextSums.insert(sum + num);
                }
                sums = nextSums;
            }
            return false;
        }
    };
    

    It barely gets past TL: 1836 ms.
    Next, a DP solution based on this one:

    class Solution {
    public:
        bool canPartition(vector<int>& nums) {
            int sum = accumulate(nums.begin(), nums.end(), 0);
            if (sum % 2 != 0)
                return false;
            int target = sum / 2;
            vector<bool> hasSum(target + 1, false);
            hasSum[0] = true;
            int len = 1;
            for (int num : nums) {
                if (num > target) // A single number greater than half the sum?
                    return false; // No way we can partition this!
                for (int i = len - 1; i >= 0; --i) {
                    if (hasSum[i] && i + num < hasSum.size()) {
                        if (i + num == target)
                            return true;
                        hasSum[i + num] = true;
                        len = max(len, i + sum + 1);
                    }
                }
            }
            return hasSum[target];
        }
    };
    

    43 ms! Much better. But simple naive DFS based on this solution

    class Solution {
    public:
        bool canPartition(vector<int>& nums) {
            int sum = accumulate(nums.begin(), nums.end(), 0);
            if (sum % 2 != 0)
                return false;
            return hasSubsetSum(nums, 0, sum / 2);
        }
    private:
        bool hasSubsetSum(const vector<int> &nums, size_t start, int target) {
            if (target < 0)
                return false;
            if (target == 0)
                return true;
            for (size_t i = start; i < nums.size(); ++i)
                if (hasSubsetSum(nums, i + 1, target - nums[i]))
                    return true;
            return false;
        }
    };
    

    is 3 ms! How come? Isn't it exponential? Is it just the test cases are picked or something more?


  • 1
    R

    The N in DFS is the length of the input array, while in DP it is the half of the summation of all elements in the input array. It might be possible that the length of the input array is short, however, the summation is pretty large, in which case DP performs worse than DFS.


  • 0

    @rivulet.zhang Could be. But then again: both size and elements have the same bound in the problem description (100). That means the maximum possible sum is O(N^2). That makes DP O(N^3), and backtracking is still O(2^N). Even when N is around 20 they should be on about the same level, for larger N backtracking should be prohibitively slow! And there are test cases as large as 50 elements.


  • 0
    M

    because of stupid test case
    dp should be much better than dfs


  • 1

    @SergeyTachenov said in Why DP is slower than DFS even with no memoization?:

    is 3 ms! How come? Isn't it exponential? Is it just the test cases are picked or something more?

    We've added more test cases. Now your naive DFS approach should get TLE now.


  • 0

    Your DP solution can not pass the OJ. = =
    I think you might need correct it.


  • 0

    @zyoppy008 It passes for me. Just resubmitted it. It could be improved a little bit by using sums.swap(nextSus) instead of sums = nextSums, but it passes either way. It's just very slow.


  • 0

    @SergeyTachenov
    I mean the DP solution, your second solution which does not have sums = nextSums
    I just tried it. It failed.


  • 0

    @zyoppy008 Oh, that DP. You're right! It fails one of the new tests. Fixed that now.


  • 0
    R
    int cmpfunc(void *a, void *b) {
        return ((*(int *)a) - (*(int *)b));
    }
    
    bool checkSum(int *nums, int numsSize, int target) {
        if (target == 0) {
            return (true);
        } else if (target < 0) {
            return (false);
        }
    
        if (numsSize == 0) {
            return (false);
        }
    
        return (checkSum(nums, numsSize-1, target-nums[numsSize-1]) ||
                checkSum(nums, numsSize-1, target));
    }
    
    bool checkValid(int *nums, int numsSize, int target) {
        for (int i = numsSize-1; i >= 0; i--) {
            if (nums[i] > target) {
                return (false);
            }
        }
        
        return (true);
    }
    
    bool canPartition(int* nums, int numsSize) {
        int targetSum = 0;
        for (int i = 0; i < numsSize; i++) {
            targetSum += nums[i];
        }
        
        if (targetSum%2) {
            return (false);
        } else {
            qsort(nums, numsSize, sizeof(int), cmpfunc);
            if (checkValid(nums, numsSize, targetSum/2)) {
                return (checkSum(nums, numsSize, targetSum/2));
            } else {
                return (false);
            }
        }
    }
    

    This should have been slower than 3ms! This is backtracking and not DP


  • 0
    Z

    @1337c0d3r I am not sure whether this has been fixed. My backtracking or DFS solution is currently 3 ms, fastest..... I have one optimization over the previous DFS solution, which is sorting the numbers and one of the two sets has to contain the largest number.

    class Solution {
    public:
        bool canPartition(vector<int>& nums) {
            int sum = 0;
            for (int i:nums) sum += i;
            if (sum&1) return false;
            sum /= 2;
            sort(nums.rbegin(), nums.rend());
            return helper(nums, 1, nums[0], sum);
        }
    private:
        bool helper(vector<int>& nums, int cur, int tot, int tar) {
            if (cur == nums.size()) return tot == tar;
            for (int i = cur; i < nums.size(); i++) {
                if (tot+nums[i] <= tar && helper(nums, i+1, tot+nums[i], tar))
                    return true;
            }
            return tot == tar;
        }
    };
    

  • 1
    F

    Because you only need to find 1 instance then your DFS can call it a day and return true, you don't actually have to go through very deep. Whereas DP you have to loop every situation. If the problem changes to find all solutions, then DP will be faster.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.