public class Solution {
int[] nums;
Random rnd;
public Solution(int[] nums) {
this.nums = nums;
this.rnd = new Random();
}
public int pick(int target) {
int result = 1;
int count = 0;
for (int i = 0; i < nums.length; i++) {
if (nums[i] != target)
continue;
if (rnd.nextInt(++count) == 0)
result = i;
}
return result;
}
}```
Simple Reservoir Sampling solution


@yidongwang Actually it does.
public int pick(int target) {
int result = 1;
int count = 0; // to record how many targets in the array
for (int i = 0; i < nums.length; i++) {
if (nums[i] != target)
continue;
/*
For the nth target, ++count is n. Then the probability that rnd.nextInt(++count)==0 is 1/n. Thus, the probability that return nth target is 1/n.
For (n1)th target, the probability of returning it is (n1)/n * 1/(n1)= 1/n.
*/
if (rnd.nextInt(++count) == 0)
result = i;
}
return result;
}

@Chuqi
thank you very much,I have read the proof of "Reservoir Sampling".
yes, It is equal probability of returning, I'm wrong.
thanks


@iambright Take a look at this discussion https://discuss.leetcode.com/topic/58322/whatonearthismeantbytoomuchmemory
It seems like some O(n) memory solutions are accepted and some are not.
So I think originally problem was expected to be solved with O(1) memory solution.

public class Solution { // sort + binary search? You can't really sort it, as it is int[] nums; public Solution(int[] nums) { this.nums = nums; } public int pick(int target) { int result = 0; int cnt = 0; Random rand = new Random(); for (int i = 0; i < nums.length; ++i) { if (nums[i] == target) { cnt ++; if (rand.nextLong() % cnt == 0) { result = i; } } } return result; } }

@dettier said in Simple Reservoir Sampling solution:
Your Java code doesn't get accepted in C++:
class Solution { vector<int> v; vector<int>& vect = v; int n; public: Solution(vector<int> nums){ vect = nums; n = nums.size(); } int pick(int t) { int count = 0, res = 1; for(int i = 0; i < n; i++) { if (vect[i] == t) { count += 1; if (not (rand() % count)) res = i; } } return res; } };

@liu971 Hi, this is because we need to save current element with probability 1 / (total # of encountered candidates). So it's 100% for first target element, 50% for the second one, and so on...
Check out wikipedia article, it explains it pretty well (sample size 1 example).
https://en.wikipedia.org/wiki/Reservoir_sampling

To those who don't understand why it works. Consider the example in the OJ
{1,2,3,3,3} with target 3, you want to select 2,3,4 with a probability of 1/3 each.2 : It's probability of selection is 1 * (1/2) * (2/3) = 1/3
3 : It's probability of selection is (1/2) * (2/3) = 1/3
4 : It's probability of selection is just 1/3So they are each randomly selected.

@xinzh1010163.com I literally just showed why it works above. You can't test randomness.

@lekzeey Thank you. I know how to prove it. Just want to know how the design the test case for randomness. Some one asked me about it in an interview.

@xinzh1010163.com I think you can test edges cases and reasonable values in randomness like Leetcode does. E.g. if it's just one index then only that index would be returned or it should be in a certain range 2,3,4 etc. But testing randomness in general is pretty hard, even if you take numerous samples because they wouldn't exactly fit the pmf of the random variable.

hello!
I have the same question about the line : if (rnd.nextInt(++count) == 0),
I know it is the right answer, but I still don't understand why it is not count++?
This is what I think:
The first time we execute this if statement will be : (rnd.nextInt(1) == 0), am I right?
So we will just have 1/2 probability to make result = i,, not 1.
Could anyone tell me why am wrong?
Thanks a lot.

