Am I the only person who don't know why median could give shortest distance?


  • 47
    T

    When I first saw this question, intuitively I know shortest meeting point should be found in two separate dimension, however, even if on 1-D, how could I find the shortest meeting point? Then I clicked discuss and found out everybody's solution was using median to get shortest meeting point? WHY?

    Actually, there is a famous conclusion in statistics that the median minimizes the sum of absolute deviations.


  • 1
    F

    No, you are not, I'm puzzled as well till see this! Very good research, thanks!!


  • 1

  • 0
    T

    Good one, thx!


  • 10
    F

    I didn't read the article in your link, but I came up with the idea myself(Trying some 1-D special cases and then a proof by contradiction). Then, I wrote the code based on it, only to find it not working. So, I re-examined the idea, and finally figured out why it was wrong in this problem.

    The reason is: In this problem, the absolute deviations at a certain position is weighted by the number of points that are actually at this position. If we have a bunch of coordinates, and at each coordinate, there is exactly one point, then the optimal solution is definitely the middle point. But now, in a single row or column, there may exist multiple points. So, the optimal solution should be biased towards the region with denser points.

    As a simple example, suppose the coordinates are 0, 50 and 100. If there are one point at each position respectively, then the meeting point would be 50, definitely. But suppose there are 1000 points at 0, and other two points each at 50 and 100, respectively. Of course, you'll have to choose 0 as the meeting point. Otherwise, the 1000 points at coordinate 0 will have to move to at least 50.


  • 0
    S

    The article you point to is nice, but that is minimizing |s-x|, whereas in the question we need to minimize |s-x|*s, which doesn't necessarily have to be the median.


  • 3
    R

    Let's think about simple cases in one dimension first:

    • house location: [1,6] -> best point can be anywhere between 1~6
    • house location: [1,2,6] -> best point is 2, because 1 and 6 don't care where it is as long as the point is between them
    • house location: [1,2,3,6] -> best point is 2.5(actually 2 or 3)
      For a sequence [a1, a2 ... an], dist(a1,bestPoint)+dist(an,bestPoint) is constant and equal to dist(a1,an)

  • 0
    L

    You can prove it by induction.. Very standard high school stuff.
    First we argue that we can look at 1-d. Because the two dimension can actually be computed seperately.

    n =1, it's the median point. We know this is the best point.
    n=2, there are two middle point x0,x1, (x0<=x1) pick any one. We also know this is the best point. Actually the min distance is |x0-x1|. You can pick any point between x0 to x1. pick another point x<x0 or x>x1, you will pay extra.
    n=1,2 are our base cases, we prove the median point is the best point.
    For n>=3. We pick the leftmost point x0 and rightmost point xn. so we reduce to n-2, in the remaining n-2 points we already prove the median point of the n-2 points is the best.. And this median point is also the median point of the n points..

    I guess you got my point.


  • 0

    Definately not. It took me ~20mins to realize that, though it was so straight forward and intuitive when looked back.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.