# Repeated String Match

• Why not take the lengths and do two string finds? One in `A` repeated `ceil(len(B)/len(A))` times and one in `A` repeated `ceil(len(B)/len(A)) + 1` times. If both fail, there is no other way `B` can be a substring of any repetition of `A`. That is a constant number of string finds (O(1)) and a constant number of temporary strings of length O(max(len(A), len(B))) of space.
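Sketched in JavaScript, that is just two `indexOf` calls on two candidate strings (a sketch, assuming non-empty `A` and `B` as the problem guarantees; `repeatedStringMatch` is my naming here, not code from the article):

```javascript
// Build only the two candidate repetitions: q = ceil(|B| / |A|) copies
// and q + 1 copies. If B is in neither, it is in no repetition of A.
function repeatedStringMatch(A, B) {
    var q = Math.ceil(B.length / A.length);
    if (A.repeat(q).indexOf(B) !== -1) return q;
    if (A.repeat(q + 1).indexOf(B) !== -1) return q + 1;
    return -1;
}
```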

Subsequence is a different beast altogether.

• I think the time complexity for Approach 1 should be O(M*N).
Besides, how do you get space complexity O(M+N)?
Could you please explain that more clearly?

• I guess the space for approach 1 should be `O(N)`. The string `A` will be concatenated at most `N/M + 1` times, so `S.length` should be at most `O(M*(N/M)+1) = O(N)`.

• Em.. I just made a mistake before. The number of concatenations was right, but the total length of `S` should be `M * (N/M + 1)`, and in big-O notation that is `O(N + M)`.

• @g63 Yeah, that's right. What about the time complexity for approach 1?

• @wqmbisheng
As the Java code shows, the `for` segment uses `O(N/M)` time, since `A` is concatenated at most `N/M + 1` times.
The first `if` segment uses an `indexOf` method. I don't know how Java's `indexOf` is implemented, but I think it's just a brute-force string match like `std::string::find` in C++. So the first `if` uses `O(N * S.length)` time, where `M*(N/M) <= S.length <= M*(N/M+1)`, so the complexity of this segment is at least `O(N^2)` and at most `O(N*(M+N))`.
The second `if` segment looks the same, but now `S` is `M` longer than before, so the inequality becomes `M*(N/M+1) <= S.length <= M*(N/M+2)`, and the complexity is `O(N*(M+N))`.
So the complexity is `O(N*(M+N))` when adding them together.

• For the time complexity, the first `if` uses O(N*(S.length - N)) time, because once the pointer passes `S.length - N + 1` there is no need to go further: the remaining part of `S` is certainly shorter than `N`, so no substring there can equal `B`.
In this way, the time complexity is O(N*M). I don't know whether this is correct.

• It took me longer than I care to admit to figure out that this
`q = (len(B) - 1) // len(A) + 1`
is just a ceiling operation:
`q = ceil(len(B) / len(A))` :D
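In general, `(n - 1) // d + 1` equals `ceil(n / d)` for positive integers, which is handy when you want to avoid floating-point division. A quick sketch (with a hypothetical `ceilDiv` helper name):

```javascript
// Integer ceiling division: floor((n - 1) / d) + 1 === ceil(n / d) for n, d > 0.
function ceilDiv(n, d) {
    return Math.floor((n - 1) / d) + 1;
}
```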

• Why choose MOD = 1_000_000_007?

• Here is my take at supplementing the logic explained in Approach 1: if B has even a chance of being a sub-string of S, then the first character of B must exist somewhere in A. In fact, disregarding order, every character of B must exist somewhere in A, because A is the source of all the characters of S (S is built out of copies of A). Since the first character of B must exist in A, and B is a sub-string of S, the index in S that marks the beginning of that sub-string must lie somewhere within the first copy of A, i.e. 0 <= i < len(A). So if we want to know the maximum length that S needs to grow to before we can say that B is never going to be a sub-string of S, we need to look at the case where the first character of B occurs at the greatest possible such index, namely len(A) - 1.

I realise this is not a complete explanation of Approach 1, but hopefully (if correct) it will at least provide another perspective that some may find useful. I struggled with understanding why we only needed to check up to q + 1 instances of A before we could determine that B was definitely not a sub-string of S. After reading Approach 1 several times and thinking about it for a bit (and writing out a few examples) it suddenly clicked, and I thought I should share my findings. In any case, just writing this out has helped me solidify my own understanding of the approach.

• @ben1205 I believe MOD should be (1) a fairly large number, so that hash values are spread out more, leading to fewer collisions, and (2) a number that is co-prime to `p`, so that it is possible to calculate the modular inverse.
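Condition (2) is easy to sanity-check: the inverse of `p` modulo `MOD` exists exactly when `gcd(p, MOD) === 1`, which holds automatically whenever `MOD` is prime and `p < MOD`. A minimal sketch (the `gcd` helper is mine, not from the posts above):

```javascript
// Euclid's algorithm; p has an inverse mod MOD iff gcd(p, MOD) === 1.
function gcd(a, b) {
    while (b !== 0) {
        var t = b;
        b = a % b;
        a = t;
    }
    return a;
}
```

Both moduli used in this thread pass the check: `gcd(113, 1000003) === 1` and `gcd(113, 1000000007) === 1`.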

• Here is my JavaScript implementation of Approach 2 (Rabin-Karp Rolling Hash). It took me a while to decipher the given Java implementation, so hopefully this code will make it much clearer what the given implementation does.

``````
function repeatedStringMatch(A, B) {

    if (!A || A.length === 0 || !B || B.length === 0) {
        return -1;
    }

    var q = Math.ceil(B.length / A.length);

    var a = new CyclicRollingHash(A, B.length);
    var b = new CyclicRollingHash(B, B.length);

    var max1 = A.length * q;
    var max2 = A.length + max1;

    while (a.end < max2) {

        if (a.hash === b.hash && isSubstring(A, B, a.start)) {
            return a.end < max1 ? q : q + 1;
        }

        a.roll();
    }

    return -1;
}

function isSubstring(A, B, offset) {
    for (var i = 0; i < B.length; ++i) {
        if (B[i] !== A[(offset + i) % A.length]) {
            return false;
        }
    }
    return true;
}

function CyclicRollingHash(text, width) {
    this.text = text;
    this.start = 0;
    this.end = width - 1;
    this.hash = 0;

    this.p = 113;
    this.MOD = 1000003; //smaller than 1000000007 so the naive inverse doesn't take forever to compute
    this.pInverse = this.getModularInverse(this.p, this.MOD);
    this.power = 1;

    for (var i = this.start; i <= this.end; ++i) {
        this.hash = (this.hash + this.power * this.text.charCodeAt(i % this.text.length)) % this.MOD;
        this.power = (this.power * this.p) % this.MOD;
    }

    this.power = (this.power * this.pInverse) % this.MOD; //reverts the last power 'increment' at the end of the hash loop
}

CyclicRollingHash.prototype.roll = function() {
    this.hash -= this.text.charCodeAt(this.start++ % this.text.length);
    this.hash *= this.pInverse;
    this.hash += this.power * this.text.charCodeAt(++this.end % this.text.length);
    this.hash %= this.MOD;
    if (this.hash < 0) this.hash += this.MOD; //JS % keeps the dividend's sign; normalize so equal windows hash equally
}

//extended Euclidean algorithm, adapted from
//https://rosettacode.org/wiki/Modular_inverse#C.2B.2B
CyclicRollingHash.prototype.getModularInverse = function(a, b) {
    var b0 = b, t, q;
    var x0 = 0, x1 = 1;
    if (b === 1) return 1;
    while (a > 1) {
        q = Math.floor(a / b);
        t = b, b = a % b, a = t;
        t = x0, x0 = x1 - q * x0, x1 = t;
    }
    if (x1 < 0) x1 += b0;
    return x1;
}

//slower but easier to understand version of getModularInverse
CyclicRollingHash.prototype.getModularInverseNaive = function(num, mod) {
    num %= mod;

    for (var numInverse = 1; numInverse < mod; ++numInverse) {
        var result = (num * numInverse) % mod;
        if (result === 1) {
            return numInverse;
        }
    }

    return -1;
}

//////////////////////////////////

var t = console.assert;
var f = repeatedStringMatch;

function test() {
    t(f('abc', 'ca') === 2);
    t(f('abc', 'cabca') === 3);
    t(f('abc', 'd') === -1);
    t(f('abc', 'abc') === 1);
    t(f('', '') === -1);
    t(f('a', 'aaaaa') === 5);
    t(f('a', '') === -1);
    t(f('', 'a') === -1);
    t(f('abc', 'abcabc') === 2);
    t(f('aabc', 'abca') === 2);
    t(f('aaabc', 'aabca') === 2);
    t(f('accdc', 'cabcd') === -1);
    t(f('axaxaya', 'axaya') === 1);
}

test();
``````

• Here's my Java implementation.

``````
class Solution {
    public int repeatedStringMatch(String A, String B) {
        StringBuilder mainString = new StringBuilder(A);
        int i = 1;
        for (; mainString.length() < (B.length() + 2 * A.length()); i++) {
            if (mainString.toString().contains(B)) {
                return i;
            } else {
                mainString.append(A);
            }
        }
        return -1;
    }
}
``````

• Without considering language-specific functions, I think the time complexity in approach 1 should be O(M*N), because:

1. The first brute-force check of B against S[0:] takes up to N character-by-character comparisons.
2. If the first try fails, a second try compares B against S[1:], again up to N character comparisons.
3. In the worst case we have to keep going until comparing B against S[M-1:], so there are M iterations, each with up to N comparisons, for M*N character-by-character comparisons in total.
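That counting argument can be sketched directly: only the M start offsets inside the first copy of A need testing, and each test compares at most N characters (a sketch with a hypothetical `matchOffset` helper, not the article's code):

```javascript
// Try each of the M offsets inside the first copy of A; each try
// compares at most N = B.length characters against the repetition of A.
function matchOffset(A, B) {
    for (var offset = 0; offset < A.length; ++offset) { // M offsets
        var i = 0;
        while (i < B.length && B[i] === A[(offset + i) % A.length]) ++i; // up to N comparisons
        if (i === B.length) return offset; // B starts at this offset in repeated A
    }
    return -1; // B occurs in no repetition of A
}
```

From a successful `offset`, the number of copies needed would be `Math.ceil((offset + B.length) / A.length)`.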
