# I found an interesting OJ bug causing huge performance drop (about hashmap)

• My idea is like this: https://leetcode.com/discuss/20151/an-o-n-solution-with-detailed-explanation

Here are two solutions with different usage of hashmap.

The two solutions have the same steps but with little difference about hashmap usage. However the first solution costs 51ms and the second one costs 1300ms on OJ.

Note: Both cost approximately same time on my VS2010.

The first solution has the following core steps:

1. Put every word in L into a hashmap called `hashmap`.

2. Traverse S with two pointers, i and j. Put substring starting from j with the same length of word into a hashmap called `temphash`, and move j forward. for every substring already in `temphash`, add the string's value with 1. If the `temphash` has a string with value more than `hashmap`, move pointer i forward.

The second solution is slightly different in step 2. Instead of inserting strings to `temphash`, second solution initialize `temphash` with `hashmap`, and for every substring, subtract the string's value with 1 in `temphash`. And when the substring's value in `temphash` is -1, move i forward.

See? For every substring already in temphash, the first solution will set value plus 1 and the second solution minus 1.

The consumed time? The first solution 51ms, and the second solution 1300ms.

C++ AC code like this:

First solution:

``````class Solution {
public:
vector<int> findSubstring(string S, vector<string> &L) {
vector<int> res;
unordered_map<string, int> hashmap, temphash;
int lenl = L.size();
if(lenl==0) return res;
int len_word = L[0].size();
for(int i=0; i<lenl; i++) {
hashmap[L[i]]++;
}

int lens = S.size();

for(int start=0; start<len_word; start++) {
int i=start, j=start;
temphash.clear();
while(j<=lens-len_word){
string newword = S.substr(j, len_word);
if(hashmap.count(newword) == 0) {
i = j + len_word;
j = i;
temphash.clear(); // re-init temphash to original
continue;
}

temphash[newword]++;
j += len_word;
// newword is in dict, but might not be enough
while(temphash[newword] > hashmap[newword]) {
temphash[S.substr(i, len_word)]--;
i += len_word;
}

if(j-i==lenl*len_word) { // found a substring
res.push_back(i);
temphash[S.substr(i, len_word)]--;
i += len_word;
}
}
}
return res;
}
};
``````

Second solution:

``````class Solution {
public:
vector<int> findSubstring(string S, vector<string> &L) {
vector<int> res;
unordered_map<string, int> hashmap, temphash;
int lenl = L.size();
if(lenl==0) return res;
int len_word = L[0].size();
for(int i=0; i<lenl; i++) {
hashmap[L[i]]++;
}
temphash = hashmap;
int lens = S.size();

for(int start=0; start<len_word; start++) {
int i=start, j=start;
temphash = hashmap;
while(j<=lens-len_word){
string newword = S.substr(j, len_word);
if(temphash.count(newword) == 0) {
i = j + len_word;
j = i;
temphash = hashmap; // re-init temphash to original
continue;
}

temphash[newword]--;
j += len_word;
// newword is in dict, but might not be enough
while(temphash[newword] == -1) {
temphash[S.substr(i, len_word)]++;
i += len_word;
}

if(j-i==lenl*len_word) { // found a substring
res.push_back(i);
temphash[S.substr(i, len_word)]++;
i += len_word;
}
}
}
return res;
}
};``````

• The reason that second solution is slow could be that too much time is spent on copying the hashmap to temphash.

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.