My 24ms C++ solution (O(N) time, using an unordered_map)

  • 7

    We can leverage the fact that all the strings in words have the same length (wordL) to speed up our algorithm. Divide all the possible starting points (i.e. 0..sLen-1) into wordL sequences, where the i-th sequence is (i, i+wordL, i+2*wordL, ...) for i = 0..wordL-1, and scan one sequence at a time (the inner loop). First, build a dictionary from words; an unordered_map (key: string, value: count) keeps the look-ups simple.
    Then, for each sequence starting at i, check whether s[i..i+wordL-1] is a word in dict. If not (i.e. it is not a word in words), reset count (the number of words found in the current window starting at start), move start to i+wordL to skip the current word, and restore the counts in dict. If s[i..i+wordL-1] is in dict, check whether all its occurrences have already been used in the current window (i.e. --dict[s.substr(i, wordL)] < 0). If so, the current occurrence is redundant, so move start forward to drop one earlier occurrence of this word (i.e. to make dict[s.substr(i, wordL)] == 0) so that the current one can be included in the window, and update count accordingly. If we have found all the words in the current window (i.e. ++count == wSize), save start to res, then move start forward past one occurrence of the first word to continue the search.
    We repeat the above process for each word sequence (i.e. each i). Remember that after each i iteration, the counts in dict must be restored.

    class Solution {
    public:
        vector<int> findSubstring(string s, vector<string>& words) {
            unordered_map<string, int> map;
            int i, wSize = words.size(), wL = words[0].size(), start, cur, sL = s.size(), wCnt;
            vector<int> res;
            if(sL < wL*wSize) return res;
            //build the map
            for(i=0; i<wSize; ++i) ++map[words[i]];
            for(i=0; i<wL; ++i)
            {// go through each possible starting-point sequence
                start = cur = i; // start is the beginning of the current search window, cur is its end
                wCnt = wSize; // reset the number of words still to be found
                while(cur + wL <= sL)
                {
                    string word = s.substr(cur, wL);
                    if(map.count(word)==0){// the current word is not in the map: move the window past cur, reset wCnt, and recover the map counts
                        for(wCnt = wSize; start!=cur; start+=wL) ++map[s.substr(start,wL)];
                        start += wL; //skip the current invalid word
                    }
                    else if(map[word]==0){
                    // a valid word, but all its occurrences are already in the current search window: move start to skip the previously found one, updating wCnt and map counts accordingly
                        for(; s.substr(start,wL)!=word; start+=wL){ ++map[s.substr(start,wL)]; ++wCnt; }
                        start += wL; //skip the previously found one (its map count stays 0: cur's occurrence replaces it)
                    }
                    else{// a valid word not yet exhausted in the current search window
                        --map[word]; // then reduce its counter
                        if(--wCnt == 0) { // update wCnt; if we find all the words
                            res.push_back(start); // save start
                            ++map[s.substr(start,wL)]; //move start to skip the first word in the current search window
                            ++wCnt;
                            start += wL;
                        }
                    }
                    cur += wL; // update cur
                }
                for(; start<cur; start+=wL) ++map[s.substr(start,wL)];//reset the map counts for the next sequence
            }
            return res;
        }
    };

  • 0

    //recover dict for the next word sequence
    while(start<i) {++dict[s.substr(start, wordL)]; start+=wordL;}

    This sacrifices a little time to save space: the map is restored in place for the next word sequence instead of being copied for each one.

  • 0

    I think it would be more accurate to say that this solution has (length of string) * (length of each word in the dictionary) time complexity.

  • 0

    Time complexity = O(N * Len(word))
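    A quick way to count this (using sL for the string length and wL for the word length, as in the code above): there are wL offset sequences; within each, start and cur only move forward, so each sequence takes O(sL/wL) steps, and every step does a constant number of substr/hash operations costing O(wL) each:

```latex
T(sL, wL)
  = \underbrace{wL}_{\text{offsets}}
    \cdot \underbrace{O\!\left(\tfrac{sL}{wL}\right)}_{\text{steps per offset}}
    \cdot \underbrace{O(wL)}_{\text{cost per step}}
  = O(sL \cdot wL)
```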
