Repeated String Match


Why not take the lengths and do two string finds? One in A * ceil(B_length/A_length) and one in A * (ceil(B_length/A_length)+1). If both fail, there is no other way B can be a substring of any string in the set [A_repeated]. O(2) => O(1) time and O(2) => O(1) strings of length max(A, B) space.
Subsequence is a different beast altogether.


@wqmbisheng
As the java code shows, thefor
segment use time ofO(N/M)
, sinceA
would be concatenated at mostN/M + 1
times.
The firstif
segment use anindexOf
method. I dont know how java'sindexOf
is implemented but i think it's just a bruteforce string match likestd::string.find
in cpp. So the firstif
use time ofO(N*S.length)
, whileM*(N/M) <= S.length <= M*(N/M+1)
. so the complexity of this segment is at leastO(N^2)
, at mostO(N*(M+N))
The secondif
segment looks like the same, but now theS
is anM
longer than before. so the inequality becomesM*(N/M+1) <= S.length <= M*(N/M+2)
, and the complexity isO(N*(M+N))
.
so the complexity isO(M*(M+N))
when adding them together.

For time complexity, the first "if" use time of O(N*(S.length  N)), because when the pointer points to S.lengthN + 1,there is no need to go further. The left length of S is certainly smaller than N, so there would be no substring equals to B.
In this way, the time complexity is O(N*M). I don't know whether this is correct.

Here is my take at supplementing the logic explained in Approach 1: If B has even a chance of being a substring of A, then the first character of B must exist somewhere in A. Actually, if you disregard the order, every character of B should exist somewhere in A, because A is the source of all the characters of S (because S is built out of copies of A). However, since the first character of B must exist in A, and B is a substring of S, then the index in S that marks the beginning of that substring must lie somewhere between 0 and one less than the length of A, or 0 <= i < len(A). If we want to know the maximum length that S needs to grow to before we can say that B is never going to be a substring of A, then we need to look at the case where the first character of B occurs at the greatest possible index of S, namely len(A)  1.
I realise this is not a complete explanation of Approach 1, but hopefully (if correct) it will at least provide another perspective that some may find useful. I struggled with understanding why we only needed to check up to q + 1 instances of A before we could determine that B was definitely not a substring of S. After reading Approach 1 several times and thinking about it for a bit (and writing out a few examples) it suddenly clicked, and I thought I should share my findings. In any case, just writing this out has helped me solidify my own understanding of the approach.

@ben1205 I believe MOD should be (1) a fairly large number so that hash values are spread out more leading to less collisions, and (2) a number that is coprime to 'p' so that it is possible to calculate the modular inverse.

Here is my JavaScript implementation of Approach 2 (RabinKarp Rolling Hash). It took me a while to decipher the given Java implementation, so hopefully this code will make it much clearer what the given implementation does.
function repeatedStringMatch(A, B) { if(!A  A.length === 0  !B  B.length === 0) { return 1; } var q = Math.ceil(B.length / A.length); var a = new CyclicRollingHash(A, B.length); var b = new CyclicRollingHash(B, B.length); var max1 = A.length * q; var max2 = A.length + max1; while(a.end < max2) { if(a.hash === b.hash && isSubstring(A, B, a.start)) { return a.end < max1 ? q : q + 1; } a.roll(); } return 1; } function isSubstring(A, B, offset) { for(var i = 0; i < B.length; ++i) { if(B[i] !== A[(offset + i) % A.length]) { return false; } } return true; } function CyclicRollingHash(text, width) { this.text = text; this.start = 0; this.end = width  1; this.hash = 0; this.p = 113; this.MOD = 1000003; //smaller than 1000000007 so it doesn't take forever to compute inverse this.pInverse = this.getModularInverse(this.p, this.MOD); this.power = 1; for(var i = this.start; i <= this.end; ++i) { this.hash = (this.hash + this.power * this.text.codePointAt(i % this.text.length)) % this.MOD; this.power = (this.power * this.p) % this.MOD; } this.power = (this.power * this.pInverse) % this.MOD; //reverts last power 'increment' at the end of the hash loop } CyclicRollingHash.prototype.roll = function() { this.hash = this.text.charCodeAt(this.start++ % this.text.length); this.hash *= this.pInverse; this.hash += this.power * this.text.charCodeAt(++this.end % this.text.length); this.hash %= this.MOD; } //https://rosettacode.org/wiki/Modular_inverse#C.2B.2B //no idea how it works CyclicRollingHash.prototype.getModularInverse =function(a, b) { var b0 = b, t, q; var x0 = 0, x1 = 1; if(b===1) return 1; while(a>1) { q = Math.floor(a/b); t = b, b = a % b, a = t; t = x0, x0 = x1  q * x0, x1 = t; } if(x1 < 0) x1 += b0; return x1; } //slower but easier to understand version of getModularInverse CyclicRollingHash.prototype.getModularInverseNaive = function(num, mod) { num %= mod; for(var numInverse = 1; numInverse < mod; ++numInverse) { result = (num * numInverse) % mod; if(result === 1) { return numInverse; } } return 1; } ////////////////////////////////// var t = console.assert; var f = repeatedStringMatch; function test() { t(f('abc','ca') === 2); t(f('abc','cabca') === 3); t(f('abc','d') === 1); t(f('abc','abc') === 1); t(f('','') === 1); t(f('a','aaaaa') === 5); t(f('a','') === 1); t(f('','a') === 1); t(f('abadc','bcaba') === 1); t(f('abc','abcabc') === 2); t(f('aabc','abca') === 2); t(f('aaabc','aabca') === 2); t(f('accdc','cabcd') === 1); t(f('axaxaya','axaya') === 1); } test();

Here is my Solution
class Solution {
public int repeatedStringMatch(String A, String B) {
StringBuilder mainString = new StringBuilder(A);
int i=1;
for(; mainString.length() < (B.length() + 2 * A.length()); i++ ){
if( mainString.toString().contains(B)){
return i;
}else{
mainString.append( A );
}
}return 1; }
}

Here's my java implementation.
class Solution { public int repeatedStringMatch(String A, String B) { StringBuilder mainString = new StringBuilder(A); int i=1; for(; mainString.length() < (B.length() + 2 * A.length()); i++ ){ if( mainString.toString().contains(B)){ return i; }else{ mainString.append( A ); } } return 1; } }

Without considering language specific functions, I think the time complexity in approach 1 should be O(M*N) because:
 The first brutal force check of B against S[0:] will take N characterbycharacter matching tries.
 A second try will need to compare B against S[1:] if the 1st try fails, again will take N character matching tries.
 The worst situation is that we have to keep going until comparing B against S[M1:], so the total iteration will be M, times the number of tries in each iteration N, and the final result is M*N characterbycharacter comparison.