C++ Dynamic Programming simple and fast solution (4ms) with optimization


  • 90
    P

    We use a boolean vector dp[]. dp[i] is set to true if a valid word (word sequence) ends there. The optimization is to look from current position i back and only substring and do dictionary look up in case the preceding position j with dp[j] == true is found.

    bool wordBreak(string s, unordered_set<string> &dict) {
            if(dict.size()==0) return false;
            
            vector<bool> dp(s.size()+1,false);
            dp[0]=true;
            
            for(int i=1;i<=s.size();i++)
            {
                for(int j=i-1;j>=0;j--)
                {
                    if(dp[j])
                    {
                        string word = s.substr(j,i-j);
                        if(dict.find(word)!= dict.end())
                        {
                            dp[i]=true;
                            break; //next i
                        }
                    }
                }
            }
            
            return dp[s.size()];
        }

  • 0
    P

    Great job! My code is similar to yours, but I got a time cost of 15ms. It turns out that the push_back operation is time_consuming, and to create a memory in advance is better.

     class Solution {
    public:
        bool wordBreak(string s, unordered_set<string> &dict) {
            if(dict.find(s)!=dict.end())
                return true;
            else{}
            vector<int> pos(1,0);
            for(int i=1;i<=s.size();i++)
            {
                for(vector<int>::const_iterator it=pos.begin();it!=pos.end();it++)
                {
                    if(dict.find(s.substr(*it,i-*it))!=dict.end())
                        {
                            pos.push_back(i);
                            break;
                        }
                        else{}
                }
            }
            if(*(pos.end()-1)==s.size())
                return true;
            else
                return false;
        }
    };
    

    I have changed my code, and also get an 4 ms time cost.

    class Solution {
    public:
        bool wordBreak(string s, unordered_set<string> &dict) {
            if(dict.find(s)!=dict.end())
                return true;
            else{}
            vector<int> pos(1,0);
            for(int i=1;i<=s.size();i++)
            {
     
                for(int j=pos.size()-1;j>=0;j--)
                {
                    if(dict.find(s.substr(pos[j],i-pos[j]))!=dict.end())
                        {
                            pos.push_back(i);
                            if(i==s.size())
                              return true;
                            break;
                        }
                }
            }
                return false;
        }
    };

  • 0

    The code is great, I wish I could have the right mindset to think this way. Great code, thanks!


  • 0
    Z

    your optimiziation is wonderful.


  • 11
    R

    Inspired by paul7, if we find valid length of dict strings, we will reduce the time to 0 ms

    class Solution {
    public:
        bool wordBreak(string s, unordered_set<string> &dict) {
            if(dict.size()==0) return false;
    
            int longestWord = 0;
            for(string word : dict){
                longestWord = max(longestWord, (int)word.size());
            }
    
            vector<bool> dp(s.size()+1,false);
            dp[0]=true;
    
            for(int i=1;i<=s.size();i++)
            {
                for(int j=i-1;j>=max(i-longestWord, 0);j--)
                {
                    if(dp[j])
                    {
                        string word = s.substr(j,i-j);
                        if(dict.find(word)!= dict.end())
                        {
                            dp[i]=true;
                            break; //next i
                        }
                    }
                }
            }
    
            return dp[s.size()];
        }
    };

  • 0
    A

    Great solution, my algorithm needs three layers of circulation, and it is similarity with minimum number of multiple of matrix. And my solution takes 40ms. So bad...


  • 0
    S

    the time consume is shorter, because of you reversed the second for loop.(for(int j=pos.size()-1;j>=0;j--))。I think it is relative to the test case


  • 0
    G

    if the size of dict is very large, it might not be an optimization


  • 0
    F

    can someone explain why decrementing j from i-1 to 0 is faster than incrementing j from 0 to i-1?


  • 0
    P

    Starting lookup with shorter strings vs longer ones.


  • 4
    F

    If you run your code in http://www.lintcode.com/en/problem/word-break/, you will get TLE. Actually, we just need to consider the max length of a word in the dictionary in the second loop in DP. Maybe change for(int j = i-1; j >= 0; j--) to for(int j = i-1; j >= max(0, i - maxLen); j--) is better, where maxLen can be pre-calculated. Although for some case it may not be a good job (since we need to consider the complexity of the computation of maxLen), but if the size of the dictionary is not quite that large, it will be a good optimization.


  • 0
    G

    a good solution! very easy to understand!


  • 1
    A

    nice code.

    maybe some would find it easier to understand when traversing backwards:

        bool wordBreak(string s, unordered_set<string>& wordDict) {
            vector<bool> dp(s.length()+1, false);
            dp.back() = true;
            for(int ii=s.length() - 1; ii>= 0; ii--){
                for(int jj=ii; jj < s.length(); jj++){
                    if(dp[jj+1] == true && wordDict.find(s.substr(ii, jj-ii+1)) != wordDict.end()){
                        dp[ii] = true;
                        break;
                    }
                }
            }
            return dp[0];
        }
    

  • 0
    F

    @atruecubsfan Awesome! Thanks :)


  • 0
    P

    A minor observation. When I first saw this explanation "dp[i] is set to true if a valid word (word sequence) ends there" I assumed i meant index into string. But I believe this actually means length of the substring being matched. thats why you can set dp[0] = true since any zero length string can be considered valid.


  • 0

    Good job. I will remember this method forever.


  • 0

    @chapter Why do you say you'll remember it forever? Do you think this method is something that can used in a generic way and applied to a multitude of questions?


  • 0
    X

    why we set dp[0]=true?


  • 0
    L

    A few micro-optimizations cut it down to 3ms:

    class Solution {
    public:
        bool wordBreak(string s, vector<string>& wordDict) {
            set<string> word_set(wordDict.begin(), wordDict.end());
            int len = s.size(); // cache s.size()
            vector<bool> dp(len + 1, false);
            dp[0] = true;
            for (int i = 1; i <= len; ++i) { // use ++i instead of i++ to avoid extra copy
                for (int j = i - 1; j >= 0; --j) {
                    if (dp[j] && word_set.find(s.substr(j, i - j)) != word_set.end()) { // avoid unnecessary definition of a string copy
                        dp[i] = true;
                        break;
                    }
                }
            }
            return dp[len];
        }
    };
    

    Happy coding!


  • 0
    R

    Nice solution. One thing I notice is that a valid word ends at i - 1, instead of i. So next time when dp[ j ] == true, that j is the beginning of a new word. Looking back, that j is marked as true in previous loops via dp[ i ] = true.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.