C++, 16ms, beating 99.86%, Simple idea, easy to understand


  • 4
    T

    Basic idea, in each position, check whether satisfies conditions.

    Problem, the check operation is tooooo slow. So, how to reduce the operation frequency? Using Hash.

    Hash 1, hash(str=c1c2..cn) = c1 + c2 + .. cn. Too simple? Yes, yet efficient. You could try!

    Hash 2, improved version, grouping by different position. For example, dict = ["abc", "def"], then our hash array is ['a' + 'd', 'b' + 'e', 'c' + 'f']. Given a str "123456", its hash is ['1' + '4', '2' + '5', '3' + '6']. Then compare this hash array with standard dict hash!

    In my code, all member variables have a suffix '_'; the hash array is called FingerPrint.

    // Author: Tian Xia
    class Solution {
     public:
      typedef vector<int> FingerPrint;
    
      vector<int> findSubstring(const string str, const vector<string>& dict) {
        vector<int> ret;
        word_len_ = SIZE(dict[0]);
        char_size_ = dict.size() * word_len_; 
    
        if (SIZE(str) < char_size_) {
          return ret;
        }
    
        dict_.clear();
        for (auto &w: dict) {
          dict_[w] += 1;
        }
    
        fp_ = gen_fingerprint();
        decltype(fp_) lfp(word_len_, 0);
        for (int p = 0; p < char_size_; ++p) {
          lfp[p % word_len_] += str[p];
        }
        if (equal(lfp, 0) && check(str, 0)) {
          ret.push_back(0);
        }
    
        for (int p = char_size_; p < SIZE(str); ++p) {
          lfp[(p) % word_len_] += -str[p - char_size_] + str[p];
          if (equal(lfp, p + 1) && check(str, p + 1 - char_size_)) {
            ret.push_back(p + 1 - char_size_);
          }
        }
    
        return ret;
      }
    
     protected:
      bool equal(const FingerPrint &fp, int start) {
        for (int p = 0; p < word_len_; ++p) {
          if (fp[(start + p) % word_len_] != fp_[p]) {
            return false;
          }
        }
        return true;
      }
    
      bool check(const string &str, int p) {
        map<string, int> freqs; 
        for (int d = p; d < p + char_size_; d += word_len_) {
          string substr = str.substr(d, word_len_);
          freqs[substr] += 1;
        }
    
        auto ite1 = freqs.begin(), ite2 = dict_.begin();
        while (ite1 != freqs.end() && ite2 != freqs.end()) {
          if (ite1->first == ite2->first && ite1->second == ite2->second) {
            ++ite1;
            ++ite2;
            continue;
          }
          return false;
        }
    
        return true;
      }
    
      FingerPrint gen_fingerprint() {
        FingerPrint ret(word_len_, 0);
        for (auto &pair: dict_) {
          for (int p = 0; p < word_len_; ++p) {
            ret[p] += pair.first[p] * pair.second;
          }
        }
    
        return ret;
      }
    
      map<string, int> dict_;
      int char_size_, word_len_;
      FingerPrint fp_;
    };

  • 0
    O

    In some cases, your answer maybe wrong. 1+2+3 = 2 + 2 + 2, 'a' + 'd' = 'b' + 'c' different str may have same hash.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.