c++ DP solution with comments


  • 61
    int findMaxForm(vector<string>& strs, int m, int n) {
      vector<vector<int>> memo(m+1, vector<int>(n+1, 0));
      int numZeroes, numOnes;
    
      for (auto &s : strs) {
        numZeroes = numOnes = 0;
        // count number of zeroes and ones in current string
        for (auto c : s) {
          if (c == '0')
    	numZeroes++;
          else if (c == '1')
    	numOnes++;
        }
    
        // memo[i][j] = the max number of strings that can be formed with i 0's and j 1's
        // from the first few strings up to the current string s
        // Catch: have to go from bottom right to top left
        // Why? If a cell in the memo is updated(because s is selected),
        // we should be adding 1 to memo[i][j] from the previous iteration (when we were not considering s)
        // If we go from top left to bottom right, we would be using results from this iteration => overcounting
        for (int i = m; i >= numZeroes; i--) {
    	for (int j = n; j >= numOnes; j--) {
              memo[i][j] = max(memo[i][j], memo[i - numZeroes][j - numOnes] + 1);
    	}
        }
      }
      return memo[m][n];
    }
    

  • 2
    F

    is memo[a][b] initialized somehow ? I don't get how it could be updated from the largest to the smallest while the larger one is calculated from smaller one ?

    this sentence I don't understand:
    memo[i][j] = max(memo[i][j], memo[i - numZeroes][j - numOnes] + 1);

    Can you give an example ? Thanks.


  • 17

    @fanslunary Yes, all entries of memo are initialized to 0 by vector<vector<int>> memo(m+1, vector<int>(n+1, 0));

    The reason we have to the update from bottom right to top left is stated in the comments: we need to calculate values for this iteration given values from the last iteration. If we do it the other way around, memo[i][j] is calculated with values from this iteration, which results in overcounting.
    memo[i][j] = max(memo[i][j], memo[i - numZeroes][j - numOnes] + 1); This line says:

    1. There are two possible ways to form the max number of strings with i 0's and j 1's regarding s: we either form s or skip it.
    2. If we skip s, memo[i][j] shouldn't change.
    3. Otherwise, we form s with numZeroes 0's and numOnes 1's, which leaves us i - numZeroes 0's and j - numOnes 1's to work with for all previous strings. How many strings can we form with i - numZeroes 0's and j - numOnes 1's? It's memo[i - numZeroes][j - numOnes] which was calculated in previous rounds, so just add 1 to that.
    4. We choose to form s or skip it based on which of 2 and 3 gives us a larger memo[i][j]

  • 0
    S

    Nice solution! I just re-wrote the code following your idea. Thanks!

    int findMaxForm(vector<string>& strs, int m, int n) {
        vector<vector<int>> dp(m+1,vector<int>(n+1,0));
        
        for (auto &s: strs) {
            int ones = count(s.begin(), s.end(), '1');
            int zeros= s.size()-ones;
            for (int i=m; i>=zeros; i--) 
                for (int j=n; j>=ones; j--)
                    dp[i][j] = max(dp[i][j], dp[i-zeros][j-ones]+1);
        }
        return dp[m][n];
    }

  • 0
    S

    Can you share with us how you came up with this solution? I am finding it difficult to understand even with the working solution that your provided. What was your thought process?

    Thank you for sharing.


  • 16

    @sid77 The problem is really a knapsack-ish problem, so I knew DP was the way to go. In this problem, we have two knapsacks, one with capacity m and another with capacity n. The 'weight' of each item is the number of ones and zeroes in it. And the 'value' of each item is exactly one, since we just want to maximize the number of items. So for each item, i.e. s, we calculate its weight first and use DP for the rest. If you are unfamiliar with knapsack, maybe reading this will help a little bit.


  • 0
    D

    @yangluphil Hi Phil, nice solution. Is there a general pattern on which kind of problem needs to consider over-counting issue or almost all knapsack problems need to consider it?


  • 0

    @dukeforever No the overcounting issue is there because we update the DP for each string sequentially, and in our recurrence relation, cells on the bottom right generally depend on cells on top left. You can think about this DP table as a space-optimized 3D table. To solve vanilla knapsack, you don't need to do this, i.e., one scan of DP is enough.


  • 0

    memo[i][j] = the max number of strings that can be formed with exactly i 0's and j 1's
    Or the max number of strings that can be formed with at most i 0s and j 1s.
    Thanks.


  • 0
    Z

    I explain the 0-1 knapsack problem in https://discuss.leetcode.com/topic/76103/0-1-knapsack-detailed-explanation, hope it would help


  • 0

    @0x0101 said in c++ DP solution with comments:

    the max number of strings that can be formed with at most i 0s and j 1s.

    This.


  • 0
    Y

    @yangluphil Hi, This is a fantastic solution. I wonder if there is a way to save the 3D recursive DP from "Memory Limit Exceeded" (MLE).
    It seems the bottom-up and top-down are mostly like convertible to each other.

    The dp[m][n][k] is the number of terms can be included given m-zeros, n-ones budget and choices after k. The solution will be dp[m][n][0].

    The following solution passed all tests except the last one (test 60), which throws MLE.

    Thank you.

    class Solution {
      private:
        vector<pair<int,int>> data; // 0 and 1
        pair<int,int> cnt_zero_and_one( const string & str ){
          int x = 0, y = 0;
          for( const auto & c : str ){
            if( c == '0' ) x ++;
            else y ++;
          }
          return make_pair(x,y);
        }
        int find_max( 
            int m, int n, int idx,
            vector<vector<vector<int16_t>>> & dp,
            const vector<pair<int,int>> & data) const {
          if( (n < 0 && m >= 0) || (n >= 0 && m < 0) || idx >= data.size() ) return 0;
          if( dp[m][n][idx] != -1 ) return dp[m][n][idx];
          int p = 0, q = 0;
          p = find_max( m, n, idx+1, dp, data );
          if( m >= data[idx].first && n >= data[idx].second ){
              q = 1 + find_max( m - data[idx].first, 
              n - data[idx].second, idx+1, dp, data );
          }
          dp[m][n][idx] = max( p, q );
          return dp[m][n][idx];
        }
    
    
      public:
        int findMaxForm(vector<string>& strs, int m, int n) {
          if( strs.empty() || m < 0 || n < 0 ) return 0;
          data.reserve( strs.size() );
          for( const auto & s : strs ){
            data.emplace_back( cnt_zero_and_one( s ) );
          }
          int k = strs.size()-1;
          strs.clear();
          vector<vector<vector<int16_t>>> dp(m+1, 
              vector<vector<int16_t>>( n+1, 
                vector<int16_t>( k+1, -1 ) ) );
          dp[0][0][k] = 0;
          return find_max( m, n, 0, dp, data );
        }
    };
    

  • 0

    @yren2 Hi, I don't think there is a way to avoid MLE if you use O(n^3) space. The only parameter we can avoid is the index parameter for the DP table. But in a top-down recursive approach, keeping track of the index seems necessary to me :(. The bottom-up approach is able to save space because we solve all the subproblems for any particular index at once.


  • 0
    A

    @yangluphil very nice solution, seems that down->top(m,n -> 0,0) can be passsed. But top->down(0,0 -> m,n) solution fails with MLE. Top->down(0,0 -> m,n) solution has to use O(n^3) space to record the "resource allocation", which also makes the solution very complex, while down->top(m,n -> 0,0) is able to ignore it.


  • 0
    I
    This post is deleted!

  • 0
    J

    Nice solution and explanation. Helps a lot. Many thanks.


  • 0
    L

    I think there is something wrong with this method. What if there doesn't have a solution? Like "0", "1", 2 2. But your solution will give answer "1". You didn't judge whether the condition can be reached before translation.


  • 0

    @litrain said in c++ DP solution with comments:

    I think there is something wrong with this method. What if there doesn't have a solution? Like "0", "1", 2 2. But your solution will give answer "1". You didn't judge whether the condition can be reached before translation.

    I am not sure what you mean. ["0", "1"], 2, 2 should yield 2 per problem description. And my code returns just that.


  • 0
    L

    @yangluphil Thanks for reply. I misunderstood the problem. I thought every '0' or '1' must of selected string must counts, but the problem says "at most once".


  • 0
    C

    @yangluphil I think memo[i][j] = the max number of strings that can be formed with exactly i 0's and j 1's.
    For example, if m=5, n=3 and there's only one string, "0011", then the result would be memo[2][2] = 1 and all others being 0. And your program will return 0, which is wrong. So I think the test cases need to be improved.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.