# Find Median From Data Stream

• Just shortening the last solution a bit further...

``````class MedianFinder {
multiset<int> data;
multiset<int>::iterator mid;

public:
data.insert(num);
mid = data.size() == 1 ? data.begin()
: next(mid, data.size() % 2 - (num < *mid));
}

double findMedian() {
return (*mid + *next(mid, 1 - data.size() % 2)) * 0.5;
}
};
``````

• @StefanPochmann Thanks for this! Implicit type conversion FTW.
I am not going to change the solution in the article though. It is better the reader discovers it in the comments.

• @babhishek21 Thanks for the great write-up. Minor nit: I didn't quite follow your section on buckets. My understanding is that you were advocating a model where you maintain a certain number of buckets that could be determined by knowing the statistical distribution of the input (say, values between 1-30). You could create 3 buckets (1-10, 11-20, 21-30) and send the stream values in said buckets. Finally when you're asked to find the median, knowing the size of each bucket will allow you surgically sort 1 specific bucket and leave the rest alone.

If this understanding is correct, then this has nothing to do with reservoir sampling (which is a nifty way to keep k out of N elements handy in a statistically unbiased way).

• @schumpeter You are right. It can be misleading. My bad. But did you read the StackOverflow answer? I think i'll clarify a bit.

EDIT: Fixed.

My understanding is that you were advocating a model where you maintain a certain number of buckets ... will allow you surgically sort 1 specific bucket and leave the rest alone.

Yep. Pretty much. But what if you want to maintain just one magical bucket that solves all your problems?

• @babhishek21 Thanks for your edit. I did read the SO link. While it is implicit, I think in an interview it would be work calling the need for precision explicitly. Ultimately, reservoir sampling is a quick approximation to finding things like median, quantiles etc. but it doesn't guarantee complete accuracy (and often times, that's OK).

• Amazing algorithm! I have never think about use two pointer to solve problem like this.

• @StefanPochmann by not keeping track of 2 iterators findMedian becomes O(logn) right not O(1) since next on a bst is logn if i'm not mistaken. I wanted to go with the 1 pointer approach until i thought about this :(

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.