Generate uniform random integer


  • 0
    H

    Problem: given function of rand3() which return uniformly random int number of [1,2,3], write a random function rand4(), which return uniformly random integer of [1,2,3,4]

    How to test it?
    As follow up, I was asked about how to test rand4() function, to verify it's truly random.
    My thought is to run rand4() for 1K times, and collect the frequency of [1,2,3,4], and then run rand4() 10K times and collect frequency, then run rand4() 100K time ... to see if the frequency of each number between [1,2,3,4] converge to 25%.

    There is a scenario like 1,2,3,4,1,2,3,4,1,2,3,4 ... ... like round robin generation. it would pass the convergence test I mentioned above, but it's not uniformly random. So does any pattern that shows a deterministic pattern, like 1,2,3,4,4,3,2,1 ...

    Any idea about how to test rand4() is truely uniformly random?


  • 1
    T

    Convert 1/4, 2/4, 3/4, and 1 to 3 based number:

    0.25[10] = 0.(02)[3]

    0.50[10] = 0.(11)[3]

    0.75[10] = 0.(20)[3]

    1.00[10] = 1.00[3]

    Then construct a FSM, to simulate the generating of the 3 based number, and map it to 1, 2, 3, 4.

    Codes:

    function rand3() {
      return ((Math.random() * 3) ^ 0) + 1;
    }
    
    var stateList = [
      /*  0 */ [0,  5,  6,  7],
      /*  1 */ [],
      /*  2 */ [],
      /*  3 */ [],
      /*  4 */ [],
      /*  5 */ [0,  1,  1,  8],
      /*  6 */ [0,  2,  9,  3],
      /*  7 */ [0, 10,  4,  4],
      /*  8 */ [0,  5,  2,  2],
      /*  9 */ [0,  2,  6,  3],
      /* 10 */ [0,  3,  3,  7],
    ];
    
    function rand4() {
      var current = 0
        while (true) {
          current = stateList[current][rand3()];
          if (current <= 4) return current;
        }
    }
    
    function counter(f, range, count) {
      var nums = []; while (range--) nums.push(0);
      while (count--) nums[f()]++;
      return nums;
    }
    
    print(counter(rand4, 5, 10000))
    

  • 6

    This problem could be solved using Rejection Sampling. Please refer to my article for more information: http://articles.leetcode.com/2010/11/rejection-sampling.html


  • 0

    This is based on ts's answer. Like ts, I consider the interval [0,1] and use 0.25, 0.5 and 0.75 as borders. In base 3 they're 0.020202..., 0.111111... and 0.202020.... Since rand3 gives us 1,2,3 instead of 0,1,2, the repeating parts of the three borders become 131313..., 222222... and 313131.... They all start differently, so use one rand3 to tell us which border to watch, and then keep calling rand3 as long as we stay on that border. Staying on the first border for example means alternatingly getting 1 and 3. If we ever clearly deviate from the border, return where we went.

    def rand4():
        r = b = rand3()
        while True:
            b = 4 - b
            d = rand3()
            if d != b:
                return r + (d > b)
    

    My b tells the border digit. I can get the next one by using b = 4 - b because in all three borders, the sum of a digit and the following digit is always 4.


  • 0

    Very nice idea, although it took me a bit to understand how it works. I think listing 1.00 is misleading, as you're actually only using 0.25, 0.5 and 0.75 (as borders). Your states 6 and 9 are btw equivalent, so you could get rid of one of them.

    And once I had understood it, I turned it into a pretty short version (see my answer).


  • 0

    Nice article. I had thought of that method before, but never bothered to analyze it. Here's a short version for the requested rand3->rand4 case:

    def rand4():
        return (rand3() * 3 + rand3() - 3) % 5 or rand4()
    

    I'm surprised by the efficiency. I analyzed both this and the solution in my answer, and this wins with 2.25 calls of rand3 on average, compared to 2.5 calls for the other method. Since that other method doesn't throw anything away, I thought it would be better...


  • 0
    H

    Thank you, That's a very good article!


  • 1

    Any idea about how to test rand4() is truely uniformly random?

    There are probably better ways, but... you could run rand4 a million times, store the results in a 1MB file, and let a good compression program compress it. It should result in about 250KB. If there are easy patterns like your examples or if the distribution is significantly non-uniform, it will be significantly smaller.


  • 0
    N

    rand4() above has a bug an can return between 0-4 instead of 1-4.
    eg: first rand3() call returns 2
    second rand3() call returns 2 this results in:
    (2*3 + 2 - 3)%5= (6-1)%5= 0


  • 0

    @najikhalaf No, it's correct. It can't return 0. If the left side of the or is 0, then the right side will be evaluated and returned.


  • 0
    N

    Ah my bad, thank you for clarifying. That's what happens when one tries to read code at midnight :p


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.