Naive C++ Solution using two unordered_map (about 20 lines)


  • 78

    I think the following code is self-explanatory enough. We use an unordered_map<string, int> counts to record the expected times of each word and another unordered_map<string, int> seen to record the times we have seen. Then we check for every possible position of i. Once we meet an unexpected word or the times of some word is larger than its expected times, we stop the check. If we finish the check successfully, push i to the result indexes.

    class Solution {
    public:
        vector<int> findSubstring(string s, vector<string>& words) {
            unordered_map<string, int> counts;
            for (string word : words)
                counts[word]++;
            int n = s.length(), num = words.size(), len = words[0].length();
            vector<int> indexes;
            for (int i = 0; i < n - num * len + 1; i++) {
                unordered_map<string, int> seen;
                int j = 0;
                for (; j < num; j++) {
                    string word = s.substr(i + j * len, len);
                    if (counts.find(word) != counts.end()) {
                        seen[word]++;
                        if (seen[word] > counts[word])
                            break;
                    } 
                    else break;
                }
                if (j == num) indexes.push_back(i);
            }
            return indexes;
        }
    };

  • 2
    H

    Your answer are alwasy be one of the best.


  • 0

    Hi, huzhp. Thank you. Well, this solution is easy but slow. You may want to refer to the highest rank solutions for more efficient ones.


  • 0
    J

    Your answer is always one of the best!!!!!!!!!!!!!!!!!


  • 1

    Hi, jiangcheng. Thank you for your remarks. Well, this solution is easy but slow. You may want to refer to the highest rank solutions for more efficient ones.


  • 0
    R

    thanks for sharing, your code is clean and easy to read


  • 0
    R

    what is the time and space complexity?


  • 2

    Hi, @rabeeh. The space complexity is easy to conclude: the major additional space usage is counts and seen, both of which consumes O(words.size()) space.

    For the time complexity, it depends on the two for loops. The outer loop will loop for O(n) times and the inner loop will loop for O(num) times. So the time complexity is at most O(n * num).

    Correct me if you have different ideas.


  • 0

    Thank you for your solution-sharing, I have learned a lot from your solution.


  • 1
    S

    Time complexity of this solution is O(n * num * len), which is sub-optimal.


  • -5
    S

    Python version with the same idea but no nested loop. The only problem is that my code couldn't pass largest test case.

    def findSubstring(self, s, words):
        """
        :type s: str
        :type words: List[str]
        :rtype: List[int]
        """
        if not words: return []
        table = {}
        length = len(words[0])
        for word in words:
            table[word] = table.get(word, 0) + 1
        start = 0
        end = length - 1
        result = []
        appeared = {}
        while start <= len(s) - length * len(words) and end < len(s):
    
            word = s[end-length+1:end+1]
                
            # when current word valid
            if word in table and appeared.get(word, 0) < table.get(word):
                appeared[word] = appeared.get(word, 0) + 1
                # finished a concatennate
                if len(appeared) == len(table) and appeared == table:
                    result.append(start)
                    start = start + length
                    end = start + length - 1
                    appeared = {}
                # not yet find all
                else:
                    end = end + length
            else:  # not a valid, simply
                start += 1
                end = start + length - 1
                appeared = {}
    
        return result

  • 1

    It makes no sense to post an unaccepted code...


  • 0

    The approach is easy to understand. I had a similar code, the run time is not the optimal. n - total * len + 1 does the improvement especially when m and n are large.

    O(mn) time complexity. n is size of string s, and m is length of word len. Thanks for sharing, seems Leetcode is OK with O(mn) solution here.

    class Solution {
    public:
        vector<int> findSubstring(string s, vector<string>& words) {
            int n = s.size(), total = words.size(), len = words[0].size(); 
            vector<int> answer;
    
            unordered_map<string, int> freq;
            for(auto w: words){
                freq[w]++;
            }        
    
            for(int i = 0; i < n - total * len + 1; i++){
                int j = i;
                unordered_map<string, int> stats = freq;
                int covered = 0;
                while(covered < total && j + len - 1 < n) {
                    string word = s.substr(j, len);
                    if(stats.find(word) != stats.end() && --stats[word] >= 0) { 
                        covered++;
                        j += len;
                    }else break;
                }
                if(covered == total) answer.push_back(i);
            }
    
            return answer;
        }
    };
    

  • 0
    T

    Similar solution but only use one hash map

    class Solution {
    public:
        vector<int> findSubstring(string s, vector<string>& words) {
            vector<int> res;
    	if(s == "" || words.size() == 0)
    		return res;
    		
    	//build dictionary, int value is # occurrences of string w
    	unordered_map<string,int> map;
    	for(auto w : words)
    		++map[w];
    		
    	int n = words.size();
    	int sl = s.size();
    	int wl = words[0].size();
    			
    	for(int start=0; start < sl - wl*n + 1; start++){ //iterate all starting indices
    		string r;
    		int cur = start;
    		string w = s.substr(cur, wl);
    		while(map[w]> 0){
    			--map[w];
    			r += w;
    			if(r.size() == wl*n){ //satisfy condition
    				res.push_back(start);
    				break;
    			}
    			cur+=wl;
    			w = s.substr(cur,wl);
    		}
    			
    		for(int i=0; i<r.size(); i+=wl){ //restore dictionary
    			string wt = r.substr(i,wl);
    			++map[wt];
    		}
    	}
    	return res;
        }
    };
    

  • 0
    J

    wonderful solution!


  • 0
    J

    @jianchao.li.fighter Easy to understand for beginners! thanks :)


  • 0
    B

    @jianchao.li.fighter Thanks for sharing. Something I cannot understand here in your code is in the

    if (seen[word] > counts[word]) 
        break;
    

    Shouldn't we account for all word instead of seeing single word the exceeds the count?
    For example,
    s="woowoofoo", dict=["woo", "foo"]
    when we start in your loop from i = 0, when we meet the second "woo" we will break and save index 0, is that wrong?


  • 0
    W

    @bpinspin Based on my understanding of the code, when meeting the 2nd "woo", it will break but not save the index 0.

    To answer your first question, if one word exceeds its count, it means current index i cannot be valid, so it is safe to break.


  • 2
    W

    Great solution! Python solution based on your idea.

    from collections import Counter, defaultdict
    
    class Solution(object):
        def findSubstring(self, s, words):
            """
            :type s: str
            :type words: List[str]
            :rtype: List[int]
            """
            wordBag = Counter(words)   # count the freq of each word
            wordLen, numWords = len(words[0]), len(words)
            totalLen, res = wordLen*numWords, []
            for i in range(len(s)-totalLen+1):   # scan through s
                # For each i, determine if s[i:i+totalLen] is valid
                seen = defaultdict(int)   # reset for each i
                for j in range(i, i+totalLen, wordLen):
                    currWord = s[j:j+wordLen]
                    if currWord in wordBag:
                        seen[currWord] += 1
                        if seen[currWord] > wordBag[currWord]:
                            break
                    else:   # if not in wordBag
                        break    
                if seen == wordBag:
                    res.append(i)   # store result
            return res
    

  • 1
    F

    @WKVictor like your variable names.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.