My interpretation/proof of the Cyclic Replacements Method in Editorial Solution



    I find the Cyclic Replacements method in the editorial solution quite clever and spent some time learning it. However, I am not quite satisfied with the explanation and proof provided there, so I am elaborating here on how I make sense of this algorithm.

    First, I am putting the code directly copied from the editorial solution here for reference:

    public class Solution {
        public void rotate(int[] nums, int k) {
            k = k % nums.length;
            int count = 0;
            for (int start = 0; count < nums.length; start++) {
                int current = start;
                int prev = nums[start];
                do {
                    int next = (current + k) % nums.length;
                    int temp = nums[next];
                    nums[next] = prev;
                    prev = temp;
                    current = next;
                    count++;
                } while (start != current);
            }
        }
    }
    

    The main idea is to move each entry (denoted by its index) to its correct position after k shifts.
    First, consider how you would do this naively (assuming k = k % n has already been applied).

    for (int i = 0; i < nums.length; i++) {
        put nums[i] at position (i + k) % n  // cyclic shift (pseudocode)
    }
    

    But this naive way has a problem: by the time you reach index i + k, that entry is already gone, overwritten by an earlier iteration.
    To avoid this problem, we could buffer nums[i + k] before we overwrite it, but such a modification, albeit feasible, would fail the O(1) space requirement: you need O(n - k) space to buffer the entries that have not yet been placed.
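    To make the contrast concrete, here is a sketch of that straightforward extra-space variant (the class and method names are mine, not from the editorial): it sidesteps the overwrite problem entirely by writing into a fresh array, at the cost of O(n) space.

```java
import java.util.Arrays;

public class RotateWithBuffer {
    // O(n) extra space: write each element directly to its shifted position
    // in a separate array, so nothing is overwritten before it is read.
    public static void rotate(int[] nums, int k) {
        int n = nums.length;
        k = k % n;
        int[] shifted = new int[n];
        for (int i = 0; i < n; i++) {
            shifted[(i + k) % n] = nums[i];
        }
        System.arraycopy(shifted, 0, nums, 0, n);
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 3, 4, 5, 6, 7};
        rotate(nums, 2);
        System.out.println(Arrays.toString(nums)); // [6, 7, 1, 2, 3, 4, 5]
    }
}
```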

    The Cyclic Replacements solution, on the other hand, overcomes this problem: only one buffer variable, prev, is shared across the shifts of all n elements. (Although the code declares new current, prev, etc. in each iteration, JVM optimization can handle that; you could also simply hoist these variables out of the loop.)

    Lemma: This algorithm visits each entry/index of nums exactly once. During the visit, the algorithm shifts the entry to the correct position.
    Proof:

    Case 1: gcd(n, k) == 1

    (A note first: the right quantity to split the cases on is gcd(n, k) rather than n % k. For example, n = 6 and k = 4 give n % k != 0, yet the walk splits into two cycles: 0 -> 4 -> 2 -> 0 and 1 -> 5 -> 3 -> 1. The single-cycle case is exactly gcd(n, k) == 1.)

    If gcd(n, k) == 1, the outer loop executes only one iteration (start == 0) before the algorithm finishes, and the inner do-while loop exits only once count == n. Consider what this inner loop does: for example, with [1,2,3,4,5,6,7] and k = 2, it essentially performs 1 -> 3, 3 -> 5, 5 -> 7, 7 -> 2, 2 -> 4, 4 -> 6, 6 -> 1, and then start == current and count == n == 7. The outer loop then tries a second iteration, which fails because count == n. Thus the outer loop executes only one iteration.
    Now let's abstract: when gcd(n, k) == 1, we start from index 0 and, in this single iteration, shift all n elements (each to its respective correct position). This is because every value of current in the do-while loop is a distinct index (other than the last one, which returns to start and ends the loop). This is proven by contradiction: suppose there are two iterations (of the inner do-while) with the same current. Then

    current = (current + k) % n    // applied t times, for some 0 < t < n
    (current + t * k) % n = current
    

    Since current < n, this reduces to

    (t * k) % n = 0
    

    That is, n divides t * k. Since gcd(n, k) == 1, n must then divide t itself; but 0 < t < n, so no such t exists.

    This is a contradiction with our supposition.

    Thus the conclusion: during the n iterations of the inner do-while loop, current never repeats itself.
    From this conclusion we know that, when the inner and outer loops exit, we must have iterated over all n indices of nums: there are only n distinct ones, and our iteration never repeats any.

    Case 2: g = gcd(n, k) > 1

    (The editorial presents this case as n % k == 0, in which g == k; the same argument works for any g > 1, which also covers cases like n = 6, k = 4.)

    For each start value of the outer loop, the inner loop does n / g iterations, visiting one distinct index in each iteration. (Why n / g? The loop stops at the smallest t > 0 with (t * k) % n == 0; such t * k is a common multiple of n and k, the smallest being lcm(n, k) = n * k / g, i.e. t = n / g.)
    That is, each start visits n / g distinct indices.
    Then we increment start. How many values of start can we have (i.e., how many iterations does the outer loop execute)? It has to be g, because each outer iteration increments count by exactly n / g (even if there were duplicate indices across different start values, which we shall rule out below). So the domain of start is 0..g-1.
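    If you want to convince yourself of the cycle structure empirically, this little sketch (names are mine) walks current = (current + k) % n from each unvisited start and counts the cycles; the count always comes out to gcd(n, k):

```java
public class CycleCount {
    // Counts the distinct cycles of the map i -> (i + k) % n.
    static int countCycles(int n, int k) {
        boolean[] visited = new boolean[n];
        int cycles = 0;
        for (int start = 0; start < n; start++) {
            if (visited[start]) continue;
            cycles++;
            int current = start;
            do {
                visited[current] = true;
                current = (current + k) % n;
            } while (current != start);
        }
        return cycles;
    }

    public static void main(String[] args) {
        System.out.println(countCycles(7, 2));  // 1: gcd(7, 2) == 1, a single cycle
        System.out.println(countCycles(6, 2));  // 2: gcd(6, 2) == 2, two cycles of length 3
        System.out.println(countCycles(6, 4));  // 2: n % k != 0 here, yet still two cycles
    }
}
```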

    We are now sure that the entire algorithm visits (n / g) * g = n indices, but are they guaranteed to be distinct?
    Yes, and this is proven by contradiction. Suppose two visits land on the same index, idx1 == idx2, reached from start == s1 and start == s2 in the outer loop. Then we have:

    idx1 = idx2
    (s1 + a * k) % n = (s2 + b * k) % n
    

    for some a and b. We also know that s1 != s2: within a single outer iteration, a repeated current would require (a - b) * k % n == 0 with 0 < a - b < n / g, and we saw above that the smallest positive solution is a - b = n / g.
    We can deduce that

    (s1 - s2) % n = ((b - a) * k) % n
    s1 - s2 = (b - a) * k + t * n       // for some integer t
    

    Since g divides both k and n, g divides s1 - s2 as well; and s1 != s2, so

    |s1 - s2| >= g
    

    This is a contradiction with the previous conclusion that start's domain is 0..g-1, where any two values differ by at most g - 1.

    Thus the n indices visited in the algorithm will also be distinct.
    End of Proof
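    If the proof still feels abstract, the lemma can also be checked by brute force: compare the editorial's rotate (reproduced so this snippet is self-contained; rotateSlow and the class name are mine) against a one-step-at-a-time rotation for every n and k in a small range.

```java
import java.util.Arrays;

public class RotateCheck {
    // The editorial's cyclic-replacements rotation, copied verbatim.
    static void rotate(int[] nums, int k) {
        k = k % nums.length;
        int count = 0;
        for (int start = 0; count < nums.length; start++) {
            int current = start;
            int prev = nums[start];
            do {
                int next = (current + k) % nums.length;
                int temp = nums[next];
                nums[next] = prev;
                prev = temp;
                current = next;
                count++;
            } while (start != current);
        }
    }

    // Brute-force reference: rotate right by one position, k times.
    static void rotateSlow(int[] nums, int k) {
        for (int step = 0; step < k % nums.length; step++) {
            int last = nums[nums.length - 1];
            for (int i = nums.length - 1; i > 0; i--) nums[i] = nums[i - 1];
            nums[0] = last;
        }
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 30; n++) {
            for (int k = 0; k < 2 * n; k++) {
                int[] a = new int[n], b = new int[n];
                for (int i = 0; i < n; i++) { a[i] = i; b[i] = i; }
                rotate(a, k);
                rotateSlow(b, k);
                if (!Arrays.equals(a, b)) throw new AssertionError("mismatch at n=" + n + ", k=" + k);
            }
        }
        System.out.println("rotate matches the brute force for all tested n, k");
    }
}
```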

    I think this lemma is enough for understanding the correctness of this algorithm. There is also a nice solution here using a similar idea. I would learn that one as well if I were you.

    Here is a trace to help your understanding:
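    One convenient way to produce such a trace is to instrument the algorithm itself; this sketch (class and method names are mine) runs it on [1,2,3,4,5,6,7] with k = 2 and prints the array after every placement made by the inner do-while loop.

```java
import java.util.Arrays;

public class RotateTrace {
    // Runs the editorial algorithm, printing the array after every placement.
    static int[] trace(int[] nums, int k) {
        int n = nums.length, count = 0;
        k = k % n;
        for (int start = 0; count < n; start++) {
            int current = start;
            int prev = nums[start];
            do {
                int next = (current + k) % n;
                int temp = nums[next];
                nums[next] = prev;
                prev = temp;
                current = next;
                count++;
                System.out.println("placed " + nums[current] + " at index " + current
                        + ": " + Arrays.toString(nums));
            } while (start != current);
        }
        return nums;
    }

    public static void main(String[] args) {
        trace(new int[]{1, 2, 3, 4, 5, 6, 7}, 2); // ends at [6, 7, 1, 2, 3, 4, 5]
    }
}
```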



    This explanation deserves more upvotes :)

