Actually, if we consider the max length of a word in the dictionary, we can obtain a large optimization.


  • 0

    Most of us know that this problem can be solved with DP. For example, most people write code like the following (C++):

    class Solution {
    public:
        /**
         * @param s: A string s
         * @param dict: A dictionary of words dict
         */
        bool wordBreak(string s, unordered_set<string> &dict) {
            if (s.empty()) {
                return true;
            }
            
            int n = s.size();
            // vector<bool> instead of a variable-length array (VLAs are a
            // non-standard C++ extension)
            vector<bool> f(n + 1, false);
            
            f[0] = true;
            for (int i = 1; i <= n; ++i) {
                f[i] = false;
                for (int j = i - 1; j >= 0; --j) {
                    if (!f[j]) {
                        continue;
                    }
                    string word = s.substr(j, i - j);
                    if (dict.find(word) != dict.end()) {
                        f[i] = true;
                        break;
                    }
                }
            }
            
            return f[n];
        }
    };
    

    It got Accepted on LeetCode, taking 8 ms and beating 51% of submissions. Yes, we can do some small optimizations and reduce the time to 4 ms, but the worst-case time complexity will not actually be improved.

    If you run your code on http://www.lintcode.com/en/problem/word-break/, many of you will probably get TLE. For example, the solutions with the most and second-most votes shown in "Hot questions" under this problem still get TLE on LintCode.

    Actually, in the second DP loop we only need to scan back as far as the max length of a word in the dictionary. That is, changing for (int j = i - 1; j >= 0; j--) to for (int j = i - 1; j >= max(0, i - maxLen); j--) is better, where maxLen can be pre-calculated. Have you ever seen a word longer than 50 characters?

    If we apply this idea, the complexity drops from O(N^2) to O(N*k), where N is the length of the given string and k is the max length of a word in the dictionary.
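    To make the savings concrete, here is a back-of-the-envelope count of inner-loop iterations, using hypothetical sizes N = 10000 and k = 10:

    ```cpp
    #include <iostream>
    using namespace std;

    // Inner-loop iteration counts for the two variants.
    long long naiveSteps(long long n) { return n * (n + 1) / 2; }      // O(N^2): up to i steps at position i
    long long boundedSteps(long long n, long long k) { return n * k; } // O(N*k): at most k steps at each position

    int main() {
        // For N = 10000 and k = 10: about 50 million vs 100 thousand steps.
        cout << naiveSteps(10000) << " vs " << boundedSteps(10000, 10) << endl;  // 50005000 vs 100000
        return 0;
    }
    ```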

    The code is longer, but it is more efficient, although this is not obvious on LeetCode. I suggest strengthening the data set by adding some big test cases; the current data set is too weak.

    class Solution {
    public:
        /**
         * @param s: A string s
         * @param dict: A dictionary of words dict
         */
        bool wordBreak(string s, unordered_set<string> &dict) {
            if (s.empty()) {
                return true;
            }
            
            int maxLen = getMaxlen(dict);
            int n = s.size();
            // vector<bool> instead of a variable-length array (VLAs are a
            // non-standard C++ extension)
            vector<bool> f(n + 1, false);
            
            f[0] = true;
            for (int i = 1; i <= n; ++i) {
                f[i] = false;
                for (int j = i - 1; j >= max(0, i - maxLen); --j) {
                    if (!f[j]) {
                        continue;
                    }
                    string word = s.substr(j, i - j);
                    if (dict.find(word) != dict.end()) {
                        f[i] = true;
                        break;
                    }
                }
            }
            
            return f[n];
        }
    
    private:
        int getMaxlen(unordered_set<string> &dict) {
            int maxLen = 0;
            for (const string &word : dict) {
                maxLen = max(maxLen, (int)word.length());
            }
            return maxLen;
        }
    };
    

  • 0

    The complexity will change from O(N^2) to O(NK + L), where L is the size of the dictionary. If it is very large (as real dictionaries are), this could get much slower than O(N^2) for short strings. For the LeetCode test cases it seems to work well enough, though. And I do think it is a good optimization overall.
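    One way to blunt that O(L) cost (a sketch, with a hypothetical helper name): words longer than s can never match a substring of s, so they can be skipped while pre-computing maxLen. The one-time pass over the dictionary remains, but the DP inner loop is then bounded by min(k, N):

    ```cpp
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <unordered_set>
    using namespace std;

    // Hypothetical helper: ignore dictionary words longer than s itself.
    // The O(L) scan over the dictionary still happens once, but the DP
    // inner loop is then bounded by min(k, N) rather than k.
    int getEffectiveMaxLen(const string &s, const unordered_set<string> &dict) {
        int maxLen = 0;
        for (const string &word : dict) {
            if (word.length() <= s.length()) {
                maxLen = max(maxLen, (int)word.length());
            }
        }
        return maxLen;
    }

    int main() {
        unordered_set<string> dict = {"leet", "code", "supercalifragilistic"};
        cout << getEffectiveMaxLen("leetcode", dict) << endl;  // 4 ("supercalifragilistic" is ignored)
        return 0;
    }
    ```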



  • 0

    You are right:)

