Approach #1 Brute Force [Accepted]
Algorithm
Start two pointers i
and j
from the beginning of a haystack and a needle respectively. Compare every element of the needle with every element of the haystack if there is a match keep moving pointers, otherwise, move the pointer of the needle to its head and eliminate all previous matches by moving the pointer of the haystack j
steps back and adding 1
to start new comparison. In other words move i
by ij+1
.
C++
int strStr(string haystack, string needle) {
if(needle.length() == 0) {
return 0;
}
if(haystack.length() == 0) {
return 1;
}
int i = 0, j = 0;
while(i<haystack.length()) {
if(haystack[i] == needle[j]) {
i++;
j++;
} else {
i = i  j + 1;
j = 0;
}
if(j == needle.length()) {
return ij;
}
}
return 1;
}
Complexity Analysis
 Time complexity : $$O(nm)$$.
Because in the worst case we compare every element of the needle with every element element of the haystack, the total time complexity is $$O(nm)$$, where n
and m
are lengths of the haystack and the needle respectively.
 Space complexity : $$O(1)$$.
Approach #2 Substring Match [Accepted]
Algorithm
Start the pointer i
from the head of a haystack. Compare every substring of the haystack with length equal to the length of the needle. If there is a match return the index of the pointer i
.
C++
int strStr(string haystack, string needle) {
if(needle.length() == 0) {
return 0;
}
if(haystack.length() == 0) {
return 1;
}
if(haystack.length() < needle.length()) {
return 1;
}
for(int i = 0; i<haystack.length()  needle.length() + 1; i++) {
if(haystack.substr(i, needle.length()) == needle) {
return i;
}
}
return 1;
}
Complexity Analysis
 Time complexity : $$O(nm)$$.
Iteration through the haystack takes n
time and substr
operation of STL
takes m
time, the total time complexity is $$O(nm)$$, where n
and m
are lengths of the haystack and the needle respectively.
 Space complexity : $$O(1)$$.
Approach #3 Knuth–Morris–Pratt (KMP) Algorithm [Accepted]
Algorithm
In contrast to naive algorithm KMP algorithm stores the history of previous matches of the haystack and the needle. For this reason lets run two pointers i
and j
starting from both heads of the haystack and the needle. If characters at these pointers match then move both pointers, otherwise, lets find a suffix
which is also a prefix
in the needle. This means that this suffix
is also a substring right before the pointer i
in the haystack. Consequently, if this suffix
is also a prefix
we can start our pointer j
right after that prefix
in the needle. In this approach we can avoiding moving the pointer i
backwards.
This means that we need to store the history of matches between a suffix
and a prefix
in the needle and what point to start the matching if there is a mismatch between the haystack and the needle.
Preprocessing Longest Suffix Prefix
In order to store information for the longest suffix prefix run two pointers from the first j
and the second i
characters of the needle respectively. If there is a match between characters store the length of the match, otherwise, leave it 0
.
Lets consider the following example
 haystack
aabdacabdab
 needle
abdab
C++
vector<int> longestSuffixPrefix(string needle) {
vector<int> lsp(needle.length(), 0);
int i = 1, j = 0;
while(i<needle.length()) {
if(needle[i] == needle[j]) {
j++;
lsp[i] = j;
i++;
} else {
if(j != 0) {
j = lsp[j1];
} else {
i++;
}
}
}
return lsp;
}
Running KMP algorithm on the haystack and the needle
C++
int kmp(string haystack, string needle) {
vector<int> lsp = longestSuffixPrefix(needle);
int i = 0, j = 0;
while(i<haystack.length()) {
if(haystack[i] == needle[j]) {
i++;
j++;
} else {
if(j != 0) {
j = lsp[j1];
} else {
i++;
}
}
if(j == needle.length()) {
return i  j;
}
}
return 1;
}
int strStr(string haystack, string needle) {
if(needle.length() == 0) {
return 0;
}
if(haystack.length() == 0) {
return 1;
}
return kmp(haystack, needle);
}
Complexity Analysis
 Time complexity : $$O(n+m)$$.
Preprocessing of the longest suffix prefix array takes m
time and the searching for the needle in the haystack takes n
respectively, the total time complexity is $$O(n+m)$$, where n
and m
are lengths of the haystack and the needle respectively.
 Space complexity : $$O(m)$$.
We need m
space to store information on the longest suffix prefix substring.
Approach #4 Z Algorithm [Accepted]
Algorithm
Z Algorithm
generates an array Z
on a string S
where every element of Z
array Z[i]
is the longest substring starting from a index i
which is also a prefix of the string S
.
The idea is to concatenate the needle with the haystack using #
character as a separation point. In this case the needle will be the prefix of a resulting string.
needle#haystack
 haystack
aabdacabdab
 needle
abdab
Running Z Algorithm on the haystack and the needle
Lets construct Z array
Lets maintain the interval [l, r]
where a substring based on this interval is a prefix of the given string. The index i
will point to the current position in the string.

if
i>r
then there is no substring which is also a prefix that starts beforei
and ends afteri
. Consequently, resetl
andr
and compute new interval[l, r]
and getz[i] = rl
. 
if
i≤r
then letk=il
.z[i]≥min(z[k], ri+1)
becauses[i..]
matchess[k..]
for at leastri+1
characters.
if
z[k]<ri+1
means that we are in the substring that matches with the prefix of the given string. Instead of recalculating values for the current interval lets use values that were already computed. 
if
z[k]≥ri+1
we are beyond the interval that was already processed. Consequently, computez[i]
by updating[l, r]
and match froms[r+1]

C++
vector<int> zArray(string s) {
int n = (int)s.length();
vector<int> z(n, 0);
int l = 0, r = 0, k;
for(int i = 1; i<s.length(); i++) {
if(i>r) {
l = r = i;
while(r<n && s[rl] == s[r]) {
r++;
}
z[i] = rl;
r;
} else {
k = il;
if(z[k] < ri+1) {
z[i] = z[k];
} else {
l = i;
while(r<n && s[rl] == s[r]) {
r++;
}
z[i] = rl;
r;
}
}
}
return z;
}
Search for the needle in the haystack
int strStr(string haystack, string needle) {
if(needle.length() == 0) {
return 0;
}
if(haystack.length() == 0) {
return 1;
}
vector<int> z = zArray(needle + "#" + haystack);
for(int i = 0; i<z.size(); i++) {
if(z[i] == needle.length()) {
return i  needle.length()  1;
}
}
return 1;
}
Complexity Analysis
 Time complexity : $$O(n+m)$$.
The time complexity is $$O(n+m)$$ because characters at positions less than r
are never compared and if a match is found we increase r
by one, the total time complexity is $$O(n+m)$$, where n
and m
are lengths of the haystack and the needle respectively.
 Space complexity : $$O(m+n)$$.
We need m+n
space to store information on the Z array.
Approach #5 RabinKarp Algorithm [Accepted]
Algorithm
The idea is based on the RabinKarp algorithm to hash the string. Calculate the hash value for the string needle and then calculate hash values for every prefix substring of the string haystack. Iterate through hashed values of the haystack and compare all substring of length m
the needle length in constant time.
Calculate hash using
haystackHash[i] = (haystack[i]  'a' + 1) * pows[i]
where pows
are degress of hash.
int rabinKarp(string haystack, string needle) {
const int p = 31;
vector<long long> pows(haystack.length());
pows[0] = 1;
for(int i=1; i<pows.size(); i++) {
pows[i] = pows[i1] * p;
}
vector<long long> haystackHash(haystack.length());
for (int i=0; i<haystack.length(); ++i)
{
haystackHash[i] = (haystack[i]  'a' + 1) * pows[i];
if (i>0) {
haystackHash[i] += haystackHash[i1];
}
}
long long needleHash = 0;
for (int i=0; i<needle.length(); i++) {
needleHash += (needle[i]  'a' + 1) * pows[i];
}
for (int i = 0; i + needle.length()  1 < haystack.length(); i++)
{
long long currentHash = haystackHash[i+needle.length()1];
if (i>0){
currentHash = haystackHash[i1];
}
if (currentHash == needleHash * pows[i])
return i;
}
return 1;
}
Search for the needle in the haystack using hashed values
int strStr(string haystack, string needle) {
if(needle.length() == 0) {
return 0;
}
if(haystack.length() == 0) {
return 1;
}
if(haystack.length() < needle.length()) {
return 1;
}
return rabinKarp(haystack, needle);
}
Complexity Analysis
 Time complexity : $$O(n+m)$$.
The average and the best time complexity of RabinKarp Algorithm is $$O(m+n)$$, but in the worst case it becomes $$O(mn)$$. The worst case occurs when all characters of the haystack are the same with all characters of the needle. But the problem asks for the first match so the time complexity will be $$O(m+n)$$, where n
and m
are lengths of the haystack and the needle respectively.
 Space complexity : $$O(n)$$.
We need n
space to store hashes of the haystack string.