[FB phone interview] Word Break problem - Given a string s and a dictionary, check whether the string can be broken down to valid words.


  • 5
    R

    Word Break problem - Given a string s and a dictionary, check whether the string can be broken down to valid words. Initially I was asked to return true or false based on whether string can be broken down or not. Later I was asked to modify my code to return the number of possible ways to break the string. Like for eg : "takebathandcome" can be broken down in 2 ways

    1. "take bath and come"
    2. "take bat hand come"
      So the modified function should return 2

  • 5
    B

    @ratnanireshma Just curious, was this for full-time or internship?


  • 2
    R

    Simple java solution (without dp)

    public static int find(String text,Set<String> words){
    	int count = 0;
    	if(text.length() == 0){
               return 1;
    	}
    	for(int i=1;i<=text.length();i++){		
    	    String subStr = text.substring(0,i);
    	    String remainingText = text.substring(i,text.length());
    	   if(words.contains(subStr) && find(remainingText, words)  > 0){
    			count =count+find(remainingText, words);
    	   }
    	}
    return count;
    }
    

  • 0
    A

    Use the Trie dictionary as a finite state machine. Store the running result (already formed words) in each pointer. Then at the end return all the pointers that end in an accept state. As you move each pointer if it reaches an accept state then move a copy of that pointer to the start with the new word to the running result.


  • 1
    S

    My Normal OC Solution

    - (BOOL)checkWord:(NSString *)s{
         ...
    }
    - (void)breakWords:(NSMutableArray *)breakWords tmp:(NSMutableString *)tmp sentence:(NSString *)sentence{
        for (int i = 0; i < sentence.length; i++){
            
            if ([self checkWord:[sentence substringWithRange:NSMakeRange(0,i+1)]]){
                
                if (nil == tmp) tmp = [[NSMutableString alloc] init];
                
                [tmp appendFormat:@"%@ ",[sentence substringWithRange:NSMakeRange(0,i+1)]];
                
                if (i == sentence.length - 1){
                    
                    [breakWords addObject:tmp];
                    return;
                }
                NSMutableString *newTmp = [tmp mutableCopy];
                [self breakWords:breakWords tmp:newTmp sentence:[sentence substringFromIndex:i+1]];
                
                tmp = [[NSMutableString alloc] initWithString:[tmp substringToIndex:tmp.length-i-2]];
                
            }
        }
    }
    

  • 4
    B

    My code is based on this solution of LC 139.

    int wordBreak(string s, unordered_set<string>& wordDict) {
    	int m = s.size();
    	vector<int> dp(m + 1, 0); //table of # of breaking down ways instead of true/false
    	dp[0] = 1;
    	for (int i = 1; i <= m; i++){
    		for (int j = i - 1; j >= 0; j--){
    			if (dp[j] && wordDict.find(s.substr(j, i - j)) != wordDict.end()){
    				dp[i] += dp[j];    
    			}
    		}
    	}
    	return dp[m];
    }
    

  • 0
    A

    C++ recursive

    int wordBreak(const string& s, int pos, const unordered_set<string>& wordDict) {
        if (pos == s.size()) return 1;
        int cases = 0;
        for(int i = 1; i <= s.size() - pos; ++i) {
            if (wordDict.find(s.substr(pos, i)) != wordDict.end()) {
              cases += wordBreak(s, pos+i, wordDict);
            }
        }
        return cases;
    }
    
    int wordBreak(const string& s, const unordered_set<string>& wordDict) {
        return wordBreak(s, 0, wordDict);
    }
    

  • 0
    S

    @ratnanireshma
    I guess this problem can be solved using Aho Corasick and Dynamic programming on the resulting trie.
    Dp[node][pos_in_string]
    We can go down in tree, or if it is a terminal node, we can split.


  • 0
    B
    This post is deleted!

  • 0
    B

    Time complexity O(mn) n = len(s), m=len(wordDict)

    class Solution(object):
        def wordBreak(self, s, wordDict):
            """
            :type s: str
            :type wordDict: List[str]
            :rtype: bool
            """
            if len(s) < 1 or len(wordDict) < 1: return None
            self.memo = {}
            self.cnt = 0
            self.helper(s, wordDict)
            return self.cnt
    
        def helper(self, s, wordDict):
            for word in wordDict:
                    if s == "" : 
                            self.cnt += 1
                            return True
                    if not s.startswith(word):
                            continue
                    else: 
                            if word+'-'+s in self.memo:
                                    if self.memo[word+'-'+s]: self.cnt += 1
                                    return self.memo[word+'-'+s]
                            retVal = self.helper(s[len(word):],wordDict)
                            if retVal: 
                                    self.memo[word+'-'+s] = True
                            else:
                                    self.memo[word+'-'+s] = False
            return False
    

  • 0
    E

    2 different versions:

    1. multiple uses of the same word is allowed:
    def word_break_duplicate(s, words):    
        if not s or not words: return s
        W = set(words)
        lengths = set(len(w) for w in words)
        return wb_duplicate(0, s, W, {}, lengths)
    
    def wb_duplicate(piv, s, words, cache, lengths):
        if piv in cache: return cache[piv]
        if piv == len(s): return 1
        cnt = 0
        for i in lengths:
            nxt = piv + i
            if nxt <= len(s):
                word = s[piv:nxt]
                if word in words:
                    cnt += wb_duplicate(nxt, s, words, cache, lengths)
        cache[piv] = cnt
        return cnt
    
    1. Can only use each word once. (same word might be given multiple times though)
    
    def word_break(s, words):    
        if not s or not words: return s
        W = collections.Counter(words)
        lengths = set(len(w) for w in W.keys())
        return wb(0, s, W, {}, lengths)
    
    def wb(piv, s, words, cache, lengths):
        if piv in cache: return cache[piv]
        if piv == len(s): return 1
        cnt = 0
        for i in lengths:
            nxt = piv + i
            if nxt <= len(s):
                word = s[piv:piv+i]
                if word in words and words[word] > 0:
                    words[word] -= 1
                    cnt += wb(nxt, s, words, cache, lengths)
                    words[word] += 1
        cache[piv] = cnt
        return cnt
    

  • 0
    D
    package newQuestions;
    
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;
    
    public class WordBreakProblem {
    
        public static void main(String[] args) {
    
            Set<String> objects = new HashSet<>();
    
            objects.add("this");
            objects.add("is");
            objects.add("a");
            objects.add("string");
            objects.add("which");
            objects.add("can");
            objects.add("canneed");
            objects.add("need");
            objects.add("be");
    
            String input = "ThisIsCanNeedBe".toLowerCase();
            System.out.println(findWords(0, input.length(), input.toLowerCase(), objects, new HashMap<>()));
        }
    
    
        public static int findWords(int start, int end, String input, Set<String> dict, Map<String, Integer> memorization) {
    
            String key = start + "_" + end;
            if (memorization.containsKey(key)) {
                return memorization.get(key);
            }
    
            System.out.println("start:"+ start + ",end:"+ end);
            if (start >= end) {
                return 0;
            }
    
            int countWords = 0;
            for (int index = start; index < end; index++) {
                String firstPart = input.substring(start, index);
                String secondPart = input.substring(index, end);
                if (isWord(firstPart, dict) && isWord(secondPart, dict)) {
                    countWords +=1;
                }
                int words = findWords(start, index, input, dict, memorization);
                int words1 = findWords(index + 1, end, input, dict, memorization);
                countWords += words * words1;
            }
    
            memorization.put(key, countWords);
            return countWords;
        }
    
    
        public static boolean isWord(String testWord, Set<String> dict) {
            return (testWord.length() == 0 || dict.contains(testWord));
        }
    }
    
    

  • 0
    P
    This post is deleted!

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.