C++ DP solution with unique char and length check to cover extreme case as ["a*599998b", "a"] (detailed explanation)


  • 0

    As many follow coders have posted, this problem is a step further of 139. Word Break. However, as questioned by @nyunyunyunyu in post Challenge doesn't work, for test cases like ["a*599998b", "a"] is really the worst case and killer for regular DFS if not pre-check some conditions, i.e., such a waste of time to only find out finally nobody could match the last unique letter 'b' after going all the way through the lengthy 599998 letters.

    Inspired by this test case, if a word w can be concatenated by a subset from words w[i=0:k], then obviously we have

    1. Letter condition: each distinct letter of w has to be in the union of letters from w[i=0:k].
    2. Length condition: the shortest word length of w[i=0:k] has to be at most w.length()/2.

    Each of the two conditions is necessary for concatenation and they are cheap to check as long as we have some pre-process done for the given word list, not to mention we pre-process the word list by length sorting anyway for DFS.

    This problem has more than linear complexity with size constraints, so we should probably not heavily rely on total run time as a standard of performance as it highly depends on attributes of testing data.

       vector<string> findAllConcatenatedWordsInADict(vector<string>& words) {
          if (words.empty()) return {};
          sort(words.begin(), words.end(), [](string& s1, string& s2) { return s1.size() < s2.size(); }); 
          for (auto& w:words) {
            if (wordBreak(w, words[0].size())) res.push_back(w); 
            dict.insert(w); 
          }
          return res;
        }
        // if word w is concatenatable by words from dictionary "dict"
        bool wordBreak(string& w, int minL) {
            bool newChar = false; // w has a new char not in dict
            if (charset.size() < 26) for(char c:w) if(charset.insert(c).second) newChar = true;
            if (newChar) return false;
            vector<bool> dp(w.size()+1, false); // dp[L]: if w.substr(0,L) is concatenatable by dict 
            for (int L = minL; L <= w.size(); ++L) {
                if (dp[L] = dict.count(w.substr(0,L))) continue;
                for (int j = minL; j <= L-minL; ++j) if (dp[L] = dp[L-j] && dict.count(w.substr(L-j, j))) break;
            }
            return dp.back();
        }
        
        vector<string> res;          // all concatenatable words
        unordered_set<string> dict;  // all previous words 
        unordered_set<char> charset; // char set of all previous words
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.