C++ 0ms O(str1.length*str2.length)


  • 17
    7

    IDEA:

    Given a str2, for each str, we can give a value v to this str such that, after greedily looking through str, our imaginary next step is to find str2[v].
    In our problem, str is always (str1,n), with a given str1, so, we can take one more step and say that for each n, there is a unique v associated to n(i.e t0 (str,n)).

    define a division and a modulo between two strings as follow:

    str/str2=argmax{i, (str2,i) can be obtained by str}
    str%str2=the v mentioned above associated with str.

    All possible values of v is less than str2.size(),
    so (str1,n)%str2 will begin to repeat a pattern after a certain n less than str2.size().
    (the pattern is the same because in the cases with the same v, our situations are exactly the same),
    so is (str1,n)/str2-(str1,n+1)/str2 for the same reason.
    We can therefore precompute a table for all these values with O(str1.length*str2.length).

    (str1,n) can be divided in three parts:

    sth before pattern(A) + pattern parts(B) + sth after pattern(C)

    The pattern does not necessarily begin in the first str1, we shall see if n is great enough so that there can be a pattern.

    The last pattern(C) is not necessarily complete, we need to calculate it separately.

    We can finish in just looking to the precomputed table and doing some simple maths.

    class Solution {
    public:
        int getMaxRepetitions(string s1, int n1, string s2, int n2) {
            vector<int> rapport(102,-1);
            vector<int> rest(102,-1);
            int b=-1;int posRest=0;int rap=0;
            int last=-1;
            rapport[0]=rest[0]=0;//case when n=0
            for(int i=1;i<=s2.size()+1;i++){
                int j;
                for(j=0;j<s1.size();j++){
                    if(s2[posRest]==s1[j]){
                        posRest++;
                        if(posRest==s2.size()){
                            rap++;
                            posRest=0;
                        }
                    }
                }
                for(int k=0;k<i;k++){
                    if(posRest==rest[k]){b=k;last=i;break;}
                }
                rapport[i]=rap;rest[i]=posRest;
                if(b>=0)break;
            }
            int interval=last-b;
            if(b>=n1)return rapport[n1]/n2;
            return ((n1-b)/interval*(rapport[last]-rapport[b])+rapport[(n1-b)%interval+b])/n2;
    //corrected thanks to @zhiqing_xiao and @iaming 
        }
    };
    

  • 0

    Seems to be an interesting solution, but I do not understand.

    Q1:
    "Given a str2, for each str, we can give a value v to this str such that, after greedily looking through str, our imaginary next step is to find str2[v]."
    So v is the index of str2?
    What is the "imaginary next step"?
    So this sentence is trying to update an old value of v to a new value of v?

    Q2:
    "In our problem, str is always (str1,n), with a given str1, so, we can take one more step and say that for each n, there is a unique v associated to n(i.e t0 (str,n))."
    What is the meaning of "associated to"?

    In addition, could you please help explain what is the meaning of "rap", "rapport", and "rest" in the code?

    // after thinking for a while, maybe rap is the repeat of str2, posRest is the index in str2 to match for the next character.

    Thanks!


  • 0
    7

    @zhiqing_xiao
    Sorry for my broken English.
    Q1: Yes, v is the next index of str2 we will look at in greedy algorithm. Like: ("dcab",4) ("abcd",1). For "dcab", v=2, for "dcabdcab", v= 3, for "dcabdcabdcab", v=2
    By this sentence what I wanna say is that, there exists a v for every string.

    Q2: "Associated to" I'm sorry but I'm really in French mode. I suppose that it's "linked to" ?. e.g:
    f(a)=b, f is a function in which we associate b to a. b is therefore the value associated to a.

    Addition:
    Again, I mixed French and English. I can't find the word now.
    a/b=c: c is the "rapport"(result of a division) between a and b;
    By "rest", I mean the remainder after the division.


  • 1

    @70664914

    Thanks for your explanation!

    After your kind explanation, now I understand that, in the outer for loop, "i" means that the string s1 is scanning for the i-th time. The logic of i-th scanning of s1 is:

    for(int j=0;j<s1.size();j++){ // loop over all characters in s1. In theory, we only need to do that at most s2.size()+1 times
        if(s2[posRest]==s1[j]){
            posRest++;
            if(posRest==s2.size()){
                rap++;
                posRest=0;
            }
        }
    }
    rapport[i]=rap;    // we have covered s2 for rap times after we scan s1 for i times
    rest[i]=posRest;   // we reach the posRest-th character in s2 after we scan s1 for i times
    

    Then we try to find whether there are repeating patterns:

    for(int k=0;k<i;k++){ // search in previous results
       if(posRest==rest[k]) // find a previous result that also ends up with the posResult-th character of s2
                            // and the previous result is obtained when we had scanned s1 for k times.
                            // Herefore, we found a repeat pattern of length s2.size()
                            // That is: cover s1 (i-k) times can also cover (s2[posRest .... end], s2[0 ... posRest-1]) for integer times
       {b=k;last=i;break;}
    }
    

    After the repeat patten is found, b is not longer -1, but becomes a >=0 number.

     pattern part (A) [the beginning patten]   <---->   rapport[b]
     pattern part (B) [repeating middle pattern]     <---->    ((n1-b)/interval*(rapport[last]-rapport[b])
     pattern part (C) [remaining pattern]   <---->     rapport[(n1-b)%interval])/n2
    

    Oh, I know why the pattern B always exist if we s1 can repeat for infinite number of times.
    First, the value of v is the index of the vector rest. As you said, "All possible values of v is less than str2.size()," due to the PigeonHole Principle, after generating s2.size()+1 of such values of v, there must exist two v's, say v1 and v2, such that v1==v2. So the repeat patten always exists.

                if (first >= n1) { // This part means that we only have patten A, but no patten B and C
    		return quotients[n1];
    	}

  • 8

    Uh... I think I found an error in your logic.

    Let (rap, posRest) mean that we cover a string (s1 or s2) for rap times, and reach posRest-th character now.

    • pattern A:
      s1: (0, 0) -> (b, s1.size())
      s2: (0, 0) -> (rapport[b], posRest)

    • pattern B:
      s1: (b+1, 0) -> (last, s1.size()) for a unit, b~b1-(n1-b)%interval for total
      s2: (rapport[b], posRest) -> (rapport[b]+(n1-b)/interval*(rapport[last]-rapport[b]), posRest) for total

    • pattern C:
      s1: (b1-(n1-b)%interval+1, 0) -> (n1, s1.size())
      s2: (rapport[b]+(n1-b)/interval*(rapport[last]-rapport[b]), remainders[last]) -> (RESULT, s2.size())

    When calculating the pattern C, s2 actually start from posRest, rather than 0. So directly use the value rapport[(n1-b)%interval] maybe incorrect.

    A linked list that may have a loop:
    (There are at most s2.size() nodes in the shape, since every idx2 should be different. If idx2 becomes the same, we found a loop.

        (pass2=0, idx2=0) ----->  (pass2, idx2) -----> ... -----> (prevPass2, idx2) -----> ... ----\
                                                                        /|\                         |
                                                                         |                          .
                                                                   (pass2, idx2)                    .
                                                                        /|\                         .
                                                                         |                          |
                                                                          \---------- ... ----------/
    

    Code: C++

    class Solution  {
    public:
    	int getMaxRepetitions(string s1, int n1, string s2, int n2)  {
    
    		// Let (pass2, idx2) mean that we cover a string s2 for pass2 times,
    		// and reach idx2-th character.
    		// (pass2s, idx2s) stores the that, for pass1-th pass of s1,
    		// we cover a string s2 for pass2s[pass1] times,
    		// and reach idx2s[pass1]-th character.
    
    		vector<int> pass2s(s2.size() + 1u, -1);
    		vector<int> idx2s(s2.size() + 1u, -1);
    		pass2s[0] = 0;
    		idx2s[0] = 0; // at the beginning, we are at (0, 0)
    		// we will let all elements in idx2s be different
    		// according to pigeonhole principle,
    		// we only need s2.size()+1 to find two elements 
    		// that are identical to each other.
    
    		int pass2 = 0;
    		int idx2 = 0;
    
    		for (int pass1 = 1; pass1 <= n1; ++pass1) {
    			// Due to pigeonhole principle
    			// we are sure to break within O(s2.size()) iterations
    
    			for (int idx1 = 0; idx1 < s1.size(); ++idx1) { // scan s1
    				if (s2[idx2] == s1[idx1]) {
    					++idx2;
    					if (idx2 == s2.size()) {
    						idx2 = 0;
    						++pass2;
    					}
    				}
    			}
    			pass2s[pass1] = pass2;
    			idx2s[pass1] = idx2;
    
    			// try to find the repetitive part
    			for (int prevPass1 = 0; prevPass1 < pass1; ++prevPass1) {
    				if (idx2s[prevPass1] == idx2) {
    					// prevRepeat1 and pass1 share the same idx2,
    					// repetitive part is found
    
    					int repeatCount = (n1 - prevPass1) / (pass1 - prevPass1);
    					int remainPass1count = (n1 - prevPass1) % (pass1 - prevPass1);
    
    					int prefixPass2Num = pass2s[prevPass1]; // prefix part
    					int repetitivePass2Num = repeatCount * (pass2s[pass1] - pass2s[prevPass1]); // repetitive part
    					int suffixPass2Num = pass2s[prevPass1 + remainPass1count] - pass2s[prevPass1];
    					
    					int overallPass2Num = prefixPass2Num + repetitivePass2Num + suffixPass2Num;
    					return overallPass2Num / n2;
    				}
    			}
    		}
    
    		// no repeative part found
    		return pass2s[n1] / n2;
    	}
    };

  • 0
    7

    @zhiqing_xiao
    Thanks for your correction!
    That's so kind of you, and your explanation is clear.

    As you point out, my code should be false. As a counter example,
    s1="ecbafedcba";n1=4;s2="abcdef";n2=1;
    This really takes me some time to find.

    By the way, I once doubted the necessity to separate the case prePass>=n1, so, there i find a counter example:
    s1="ecbafedcba";n1=3;s2="abcdef";n2=1;

    Thanks to your help!


  • 0
    I

    Shouldn't this be rapport[n1]/n2?

    if(b>=n1)return rapport[n1];


  • 0
    7

    @iaming Yes, you're right. Thanks. (Seriously, how can I AC that? haha)


  • 2
    I

    Java version.

    int l1 = s1.length(), l2 = s2.length();
    int[] offsets = new int[l2 + 1], reps = new int[l2 + 1];
    int lo = -1, hi = 0, cnt = 0;
    for (int i = 1, offset = 0; i <= l2 && i <= n1; ++i) {
        for (int j = 0; j < l1; ++j) {
            if (s1.charAt(j) != s2.charAt(offset)) continue;
            offset++;
            if (offset == l2) {
                offset = 0;
                cnt++;
            }
        }
        for (int j = 0; j < i; ++j) {
            if (offset == offsets[j]) {
                lo = j; // cycle found [lo, hi)
                hi = i;
                break;
            }
        }
        if (lo >= 0) break;
        offsets[i] = offset;
        reps[i] = cnt;
    }
    if (lo < 0) return cnt / n2;
    return ((n1 - lo) / (hi - lo) * (cnt - reps[lo]) + reps[lo + (n1 - lo) % (hi - lo)]) / n2;
    

  • 0
    C

    @70664914 said in C++ 0ms O(str1.length*str2.length):

    ((n1-b)/interval*(rapport[last]-rapport[b])+rapport[(n1-b)%interval+b])/n2;

    What is the logic here for rapport[(n1-b)%interval+b]?
    I think rapport[b] should be good enough for this problem.


  • 0
    7

    @cz You may see it in two parts:
    rapport[b] and rapport[(n1-b)%interval+b]-rapport[b]
    The first part is the part before the beginning of the repeating pattern, namely A as I called it;
    the second part is C, which means the part left after we counted as many as possible complete patterns. (Correct English I hope)

    Only rapport[b] is gonna be false as I have given a counter-example:
    s1="ecbafedcba";n1=4;s2="abcdef";n2=1;

    Hope this ma help you


  • 0
    This post is deleted!

  • 0
    M

    The same idea, but the code is more concise.

    class Solution {
    public:
        int getMaxRepetitions(string s1, int n1, string s2, int n2)
    	{
    		if((s1.size()*n1)<(s2.size()*n2))
    			return 0;
    			
    		vector<int> contain, remain;
    		int count=0, j=0;
    		
    		while(remain.size()<=1 || remain.front()!=remain.back())
    		{
    			for(int i=0; i<s1.size(); i++)
    			{
    				if(s1[i]==s2[j])
    				{
    					j++;
    					if(j==s2.size())
    					{
    						count++;
    						j=0;
    					}
    				}
    			}
    			contain.push_back(count);
    			remain.push_back(j);
    		}
    		
    		int M=contain.size()-1, N=contain.back()-contain.front();		
    		int result=((n1-1)/M*N+contain[(n1-1)%M])/n2;
    
    		return result;
    	}
    					
    };
    

  • 0
    B

    Brilliant idea for finding repeated pattern! Here is my Java version w/ a few comments :)

    public class Solution {
        public int getMaxRepetitions(String s1, int n1, String s2, int n2) {
            char[] str1 = s1.toCharArray();
            char[] str2 = s2.toCharArray();
            int l1 = str1.length, l2 = str2.length;
            int[] repCnt = new int[l2 + 1];
            int[] nextIdx = new int[l2];
            Arrays.fill(nextIdx, -1);
            int cnt = 0, cycleStart = -1, cycleEnd = -1;
            for (int i = 0, s2Idx = 0; i <= l2 && i < n1; i++) { // currently processing ith occurrence of str1
                for (int j = 0; j < l1; j++) { //  find the first char in str1 to match str2[s2Idx]
                    if (str1[j] == str2[s2Idx]) {
                        if (++s2Idx == l2) { // reach the end of str2, increment cnt
                            s2Idx = 0;
                            cnt++;
                        }
                    }
                }
                for (int k = 0; k < i; k++) {
                    if (nextIdx[k] == s2Idx) { // find the cycle from (str1, k) -> (str1, i)
                        cycleStart = k;
                        cycleEnd = i;
                        break;
                    }
                }
                if (cycleStart >= 0) {break;}
                repCnt[i] = cnt;
                nextIdx[i] = s2Idx;
                //System.out.println("i = " + i + ", repCnt = " + cnt + ", nextIdx = " + s2Idx);
            }
            /*
            0 ... 1 ... 2 ... ... ... cycleStart ... cycleStart+1 ... ... ... cycleEnd ............ n1 - 1
                                                          |                        |
                                                          | .......................|
            |<-         part1                   ->|<-         part2                 ->|<-   part3      ->|     
            */
            // if n1 is large enough, it can be divided into three parts:
            // Part 1: before cycle i = [0, cycleStart)
            // Part 2: in the cycle [cycleStart, cycleEnd) * ?
            // Part 3: remaining..., n1 - 1]
            
            // if n1 is too small
            if (cycleStart < 0) {
                return repCnt[n1] / n2;
            }
            int part1 = repCnt[cycleStart];
            int part2 = (n1 - cycleStart - 1) / (cycleEnd - cycleStart) * (cnt - repCnt[cycleStart]);
            int part3 = repCnt[(n1 - cycleStart - 1) % (cycleEnd - cycleStart) + cycleStart] - repCnt[cycleStart];
            return (part1 + part2 + part3) / n2;
        }
    }
    

  • 0
    B

    @zhiqing_xiao linearly check repeated idx2 would have a worst case of O(n1^2) check

    so instead of having idx2s[] to store pass1 vs idx2, you can use idx2 vs pass1, that is use idx2 as index. If you find a idx2 already have a pass1, you know you got a repeated pattern.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.