# Java solution with O(s1.length^2 * s2.length)

• To determine if a string `str2` is contained in another string `str1`, there is a naive `O(n)` solution with `n = str1.length`: from start to end, simply match each character in `str2` greedily with those in `str1`. If there is a such match for all characters in `str2`, we conclude `str2` is contained in `str1`, otherwise it is not.

To find the number of `str2` contained in `str1`, we can proceed similarly except now we need to keep track of the number of matches found for `str2`. Still this can be done in `O(n)` time.

So for our problem, "to find the maximum number of `S2=[s2,n2]` contained in `S1=[s1,n1]`", we have the following naive solution:

``````public int getMaxRepetitions(String s1, int n1, String s2, int n2) {
int M = 0, p = 0, q = 0;

for (int i = 0; i < n1; i++) {
for (int j = 0; j < s1.length(); j++) {
if (s1.charAt(j) == s2.charAt(q)) {
q++;

if (q == s2.length()) {
p++;
q = 0;
}

if (p == n2) {
M++;
p = 0;
}
}
}
}

return M;
}
``````

And as you would expect, this failed the extreme case when `s1.length = 100` and `n1 = 10^6` (i.e., `S1.length = 10^8`). So let's see how we can do better.

The key idea here is to take advantage of the fact that `S1` consists of repeating strings (i.e. `s1`). The naive solution ignored this information and did a blind scan for the whole string, which led to the poor time performance.

First we will assume `s1` contains at least once of each distinct character in `s2`, otherwise no `S2` can be contained in `S1`. Second, instead of finding the number of `S2` contained in `S1`, we will find the number of `s2` contained in `S1`. The former can be obtained by dividing the latter by `n2`.

To implement the above idea, here are some observations:

1. Each character with index `i` (`0 <= i < N1`) in string `S1` will correspond to some index `j` in string `s1`, with `j = i % s1.length` and `N1 = S1.length`.

2. Each match of the whole string `s2` in `S1` will end at some index in `S1`. Starting from the first match, let's label the end index for each match in `S1` as `i1, i2, i3 ...`, and the corresponding indices in `s1` as `j1, j2, j3 ...`.

3. Let `ip` and `iq` be the end indices in `S1` of the first two matches whose corresponding indices in `s1`, `jp` and `jq`, are equal. Let `S1(ip, iq]` denote the substring between them, `l = iq - ip` as its length and `m` as the total number of `s2` contained in it. Then the substring `S1(ip, N1)` can be constructed by concatenating `t` times the substring `S1(ip, iq]` plus possibly some residual part `S1(r, N1)`, where `t = (N1 - ip - 1) / l` (integer part) and `r = ip + t * l + 1`. And the total number of `s2` contained in the substring `S1(ip, r)` will simply be the product `m * l`. Also the total length of the residual substring is less than `l`.
(Note: for the substring notations, `(` or `)` means exclusive, `[` or `]` inclusive)

4. It can be shown that if the two indices `ip` and `iq` in part 3 exist, they can be found in `O(s1.length^2 * s2.length)` time. The reasoning is as follows: first since `s1` contains at least once of each distinct character in `s2`, starting from any index in `S1`, it will take at most `s1.length * (s2.length + 1)` characters to find a match of `s2` in `S1` (consider the worst case scenario in which it takes a whole `s1` string to match each single character in `s2`). Second there are at most `s1.length` matches of `s2` in `S1` whose corresponding indices in `s1` are all different, as each match must take one index in `s1` and we have only up to `s1.length` different positions available. Therefore the index in `s1` of the `(s1.length + 1)`-th match must coincide with some previous match. In total `ip` and `iq` will be found within `s1.length^2 * (s2.length + 1)` characters. This also sets the upper limit on `l` where `l = iq - ip`, which in turn bounds the total length of the residual substring in part 3.

5. The total number of `s2` contained in `S1` can be obtained from three parts: the number contained in substring `S1[0, ip]`, the number contained in substring `S1(ip, r)` and lastly the number contained in the residual substring `S1[r, N1)`. The number in the second part can be obtained in constant time (simple mathematical computation) while those in the first and third parts can both be obtained in `O(s1.length^2 * s2.length)` time. Therefore our total time complexity will be at most `O(s1.length^2 * s2.length)`.

Here is the java program for the improved solution with brief explanation as follows:

1. First make sure `s1` contains at least once of each distinct character in `s2`.

2. For each match of `s2` in `S1`, we will record two piece of information: its end index in `S1` and the total number of matches up to this end index. Since we will have at most `s1.length` "distinct" matches, we will use an array of the same length as `s1` to hold matches found so far. The element in the array will be another array of length 2, whose first element denotes the end index and second element the total matches.

3. We will proceed in the way described in the naive solution to identify possible matches. If we do not make it all the way to the end of `S1`, then a repeating match is found. We can obtain `l`, `m` and `t` to compute the number of `s2` contained in the second part as well as scanning the residual substring to get that in the third part. The final answer will be the total number in all three parts divided by `n2`.

``````public int getMaxRepetitions(String s1, int n1, String s2, int n2) {
Set<Character> set = new HashSet<>(s1.length());

for (char ch : s1.toCharArray()) set.add(ch);

for (char ch : s2.toCharArray()) {
if (!set.contains(ch)) return 0;
}

int[][] arr = new int[s1.length()][2];

for (int i = 0; i < s1.length(); i++) {
arr[i][0] = -1;
}

int i = 0, j = 0, k = 0, c = 0, N1 = n1 * s1.length();

for (; i < N1; i++, j = i % s1.length(), k %= s2.length()) {
if (s1.charAt(j) == s2.charAt(k) && ++k == s2.length()) {
c++;
if (arr[j][0] >= 0) break;
arr[j][0] = i;
arr[j][1] = c;
}
}

if (i < N1) {
int l = i - arr[j][0];
int m = c - arr[j][1];
int t = (N1 - arr[j][0] - 1) / l;
c = arr[j][1] + m * t;
i = arr[j][0] + l * t + 1;

for (j = i % s1.length(), k = 0; i < N1; i++, j = i % s1.length(), k %= s2.length()) {
if (s1.charAt(j) == s2.charAt(k) && ++k == s2.length()) {
c++;
}
}
}

return c / n2;
}
``````

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.