Java DP Solution


  • 39

    Do you still remember how did you solve this problem? https://leetcode.com/problems/word-break/

    If you do know one optimized solution for above question is using DP, this problem is just one more step further. We iterate through each word and see if it can be formed by using other words.

    Of course it is also obvious that a word can only be formed by words shorter than it. So we can first sort the input by length of each word, and only try to form one word by using words in front of it.

    public class Solution {
        public static List<String> findAllConcatenatedWordsInADict(String[] words) {
            List<String> result = new ArrayList<>();
            Set<String> preWords = new HashSet<>();
            Arrays.sort(words, new Comparator<String>() {
                public int compare (String s1, String s2) {
                    return s1.length() - s2.length();
                }
            });
            
            for (int i = 0; i < words.length; i++) {
                if (canForm(words[i], preWords)) {
                    result.add(words[i]);
                }
                preWords.add(words[i]);
            }
            
            return result;
        }
    	
        private static boolean canForm(String word, Set<String> dict) {
            if (dict.isEmpty()) return false;
    	boolean[] dp = new boolean[word.length() + 1];
    	dp[0] = true;
    	for (int i = 1; i <= word.length(); i++) {
    	    for (int j = 0; j < i; j++) {
    		if (!dp[j]) continue;
    		if (dict.contains(word.substring(j, i))) {
    		    dp[i] = true;
    		    break;
    		}
    	    }
    	}
    	return dp[word.length()];
        }
    }
    

  • 0
    Q

    Suppose the word has a average length of k, total n words, we still have a TC : n^2* k^2. I copy your data, it's ETL.


  • 0

    @Qinger Your TC analysis is correct. I just submitted again, and still accepted...

    43 / 43 test cases passed.
    Status: Accepted
    Runtime: 466 ms
    Submitted: 0 minute ago


  • 1

    My first reaction to this question was also relating it to Word Break II. Your code is very clear and concise. Thumbs up! But employing Trie data structure is another valid option to accelerate it.


  • 0
    Q

    @shawngao Thanks a lot. It's my fault. I wrote it in C++ and missed the "break" after "dp[i] = true". Now, my code is accepted.


  • -1
    Y

    Hi shawngao

    Really clean solution, thanks for sharing.
    Just a small question, what if the test case is
    ["cat","dog","catdog", "cat", "catcat"]


  • 1
    Y

    @shawngao

    Hi shawngao

    I add two lines to make it pass ["cat","dog","catdog", "cat", "catcat"].
    Don't know if there is a better one.
    '''

    public static List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> result = new ArrayList<>();
        Set<String> preWords = new HashSet<>();
        Arrays.sort(words, new Comparator<String>() {
            public int compare (String s1, String s2) {
                if(s1.length() == s2.length()) {                    //added
                    return s1.compareTo(s2);
                }
                return s1.length() - s2.length();
            }
        });
        
        for (int i = 0; i < words.length; i++) {
            if(i > 0 && words[i].equals(words[i - 1])) continue;     //added
            if (canForm(words[i], preWords)) {
                result.add(words[i]);
            }
            preWords.add(words[i]);
        }
        
        return result;
    }
    
    private static boolean canForm(String word, Set<String> dict) {
        if (dict.isEmpty()) return false;
        boolean[] dp = new boolean[word.length() + 1];
        dp[0] = true;
        for (int i = 1; i <= word.length(); i++) {
            for (int j = 0; j < i; j++) {
    	        if (!dp[j]) continue;
    	        if (dict.contains(word.substring(j, i))) {
    	            dp[i] = true;
    	            break;
    	        }
            }
       }
       return dp[word.length()];
    }
    

    '''


  • 0

    @youjiao said in Java DP Solution:

    ["cat","dog","catdog", "cat", "catcat"]

    I think your way to avoid duplicated words is correct. Seems there's no duplicates in all test cases right now. Thanks for pointing out anyway!


  • 0

    This is a good solution, but it's just close to TLE, if you don't have this line if (!dp[i]) continue;. Because String.substring() is in O(n) time now, this operation could cost much time in the process, and it could be 压死骆驼的稻草. :)

    But can't deny this solution is very good.


  • 0
    C

    @shawngao
    It is clean and correct solution. However, I still have a question for you. The requirement says "A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array". If I have an array of words as below. "catsdogmouse" concatenates cats and dog, but it isn't considered the correct one according to your code.

    String[] words = new String[] { "cat", "cats", "catsdogmouse", "dog", "dogcatsdog", "hippopotamuses", "rat", "ratcatdogcat" };


  • 0

    @connlie

    A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array

    mouse is not there :)


  • 0
    C

    @shawngao
    Thanks for reply.

    I confused about the sentence; thought about at least 2 short words. :)

    Thanks


  • 4

    150 ms - Using HashSet and dfs

    public List<String> findAllConcatenatedWordsInADict(String[] words) {
        List<String> list = new ArrayList<String>();
        Set<String> set = new HashSet(Arrays.asList(words));
    
        for(String word : words) {
            set.remove(word);
            if(dfs(word, set, "")) list.add(word);
            set.add(word);
        }
        return list;
    }
    
    private boolean dfs(String word, Set<String> set, String previous) {
        if(!previous.equals("")) set.add(previous);
        if(set.contains(word)) return true;
        
        for(int i = 1; i < word.length(); i++) {
            String prefix = word.substring(0,i);
            if(set.contains(prefix) && 
                dfs(word.substring(i,word.length()), set, previous+prefix)) {
                return true;
            }
        }
        return false;
    }

  • 0
    A

    @shawngao Hi, could you give more details why the TC is n^2* k^2? Except the sorting process, I think the TC for the main process is n*k^2. There are only 3 for loops. Thank you.


  • 1

    @ainiyouyou I didn't think about this carefully before. If we consider TC of word.substring(j, i) as k, then overall runtime complexity is n*k^3. Anyway it's not n^2*k^2.


  • 0
    A

    @shawngao Got it. Thanks.


  • -1
    M

    My python code is TLE by using the same DP method. I don't know why, Anyone could help me?

    class Solution(object):
        def Detect_wordBreak(self, s, wordDict):
            s_len=len(s);
            if s_len==0:
                return False;
    
            word_set=set(wordDict);
            dp=[False for _ in range(s_len+1)];
            dp[0]=True;
            for i in range(1, s_len+1):
                for j in range(i):
                    if dp[j] and s[j:i] in word_set:
                        dp[i]=True;
                        break;
                        
            return dp[s_len];
        
        def findAllConcatenatedWordsInADict(self, words):
            words_len=len(words);
            if words_len<3:
                return [];
    
            words.sort(key=lambda x:len(x), reverse=True);
            len_dict=dict();
            for i in range(words_len):
                len_dict[len(words[i])]=i+1;
    
            result_list=[];
            for i in range(words_len):
                if self.Detect_wordBreak(words[i], words[len_dict[len(words[i])]:]):
                    result_list.append(words[i]);
    
            return result_list;
    

  • 2
    B

    I wrote a python version, beats 98%
    as @livelearn suggested ,changed a little bit

    class Solution(object):
        def findAllConcatenatedWordsInADict(self, words):
            words = sorted(words,key=lambda t:len(t))
            word_dict = set()
            def isQualify(w):
                if w in word_dict:return True
                for i in range(1,len(w)):
                    if w[:i] in word_dict and isQualify(w[i:]):return True
                return False
            res = []
            for w in words:
                if isQualify(w):res.append(w)
                word_dict.add(w)
            return res
    

    old version:

    class Solution(object):
        def findAllConcatenatedWordsInADict(self, words):
            words = sorted(words,key=lambda t:len(t))
            len_word_dict = collections.defaultdict(set)
            def isQualify(w):
                if w in len_word_dict[len(w)]:return True
                for i in range(1,len(w)):
                    if w[:i]in len_word_dict[i] and isQualify(w[i:]):return True
                return False
            res = []
            for w in words:
                if isQualify(w): res.append(w)
                len_word_dict[len(w)].add(w)
            return res
    

  • 0
    L

    @beautifeng You could just use set instead of defaultdict and set. I guess it's a slight optimization, but technically set lookups are O(1)


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.