20-line C++ O(n^2) Hashing Solution


  • 31
    L

    The idea is straight forward. Calculate each slope between two points and handle two special cases: 1. vertical, 2. duplicate.

    class Solution {
    public:
        int maxPoints(vector<Point>& points) 
        {
            if(points.size()<=2) return points.size();
            int res=0;
            for(int i=0;i<points.size()-1;i++) {
                int numVertical=1,local=1,duplicate=0;
                unordered_map<double,int> map;
                for(int j=i+1;j<points.size();j++) 
                    if(points[i].x==points[j].x) // special cases
                        if(points[i].y==points[j].y) // duplicate 
                            duplicate++;
                        else // vertical
                            numVertical++;
                    else {
                        double slope=(points[i].y-points[j].y)*1.0/(points[i].x-points[j].x);
                        map[slope]==0?map[slope]=2:map[slope]++;
                        local=max(local,map[slope]);
                    }
                local=max(local+duplicate,numVertical+duplicate);
                res=max(res,local);
            }
            return res;
        }
    };

  • 0
    W
    This post is deleted!

  • 0
    M

    same slop doesn't mean they are in the same line, there is a constaint need be the same.


  • -4
    G

    how this code working??? Same slope doesn't guarantees co-linearity?? May be parallel too..


  • 1
    P

    this code is right! it first calculates all lines passing through point a, and select the max num. then it iterates to point b and calculates its passing line(but not a line through point a again), gather all the max through each point, and you got the final max!


  • 2
    A

    This code is correct.
    The idea is to set a point, and then find all slopes between this point and other points. As a result, for a certain point, we say A, if slope of A-to-B, and the slope of A-to-C is the same, B and C must be on the same line. After considering point A, we would never need A anymore. Because all points on lines that contains A had been calculated in the step we calculating Point A.


  • 0
    N

    Notice the position of this statement.

    unordered_map<double,int> map;

    In every iterator, map is initialized. Then you can understand why this code can work.


  • 0
    F

    excellent idea and expression~


  • 0
    I

    Really great idea. Appreciate your solution.


  • 23
    F

    I can't trust the way using double as key, unless somebody could prove that,

    • for any Simplified Rational Expressions p/q, where p and q are uint32, p * 1.0 / q is always distinct

    Note that, if p and q are uint64, it would be buggy,

    long long p = (1LL << 53);
    long long q = 1LL;
    double rate1 = (double)p * 1.0 / q;
    double rate2 = (double)(p + 1) * 1.0 / q;
    if (rate1 == rate2) cout << "catch!\n";
    

    UPDATE, I found some examples for int32 cases,

    in python3, try following commands,

    94911149/94911150 == 94911150/94911151  # false
    94911150/94911151 == 94911151/94911152  # true
    94911151/94911152 == 94911152/94911153  # false
    

    Now we could construct a test case,

    [[0,0],[94911151,94911150],[94911152,94911151]]
    # this code return 3, while Expected Answer is 2
    

    In fact, it's very easy to find billions of examples,

    for (int m = 0x7FFFFFFF; m > 0; m--) {
        double rate1 = (double)(m - 1) * 1.0 / m;
        double rate2 = (double)(m - 2) * 1.0 / (m - 1);
        if (rate1 == rate2) cout << "catch " << m << endl;;
    }
    

    Be careful of double. Don't use it like integer.

    ----====----

    btw, in this case, for a simplified rational fraction p/q, long long could be used as an key, but there are a few corner cases need to be handled,

    long long key = ((long long)q << 32) | ((long long)p & 0x00000000FFFFFFFF);

  • 0

    @forsaken This is actually also my concern without testing it (thanks for the detailed tests!).
    For losing of accuracy, I guess using doubleas key for hashmap is not reliable after all.
    In pure match, it makes perfect sense to denote just a real number x, but in computer memory everything is reduced to some rational number up to a certain degree of accuracy. I guess this is why so many (interview) coding problems are given in the form of integer containers just to simplify the numerical detail without losing the actual purpose of algorithm challenge. However, for this problem particularly, lines connecting pair<int,int>type of points could very well result in doubletype of slopes.


  • 0
    F

    what ??????! double deviation!!!!!!!!!!!!


  • 1
    L

    Thanks. A cleaner version:

    class Solution {
    public:
        int maxPoints(vector<Point>& points) {
            int ans = 0;
            for (int i = 0; i < points.size(); i++) {
                unordered_map<double, int> count;
                int dup = 0;
                for (int j = i; j < points.size(); j++) {
                    int deltaX = points[i].x - points[j].x;
                    int deltaY = points[i].y - points[j].y;
                    if (deltaX == 0 && deltaY == 0) {
                        dup++;
                    } else if (deltaX == 0) {
                        count[INT_MAX]++;
                    } else {
                        count[(double)deltaY / deltaX]++;
                    }
                }
                ans = max(ans, dup);
                for (auto p: count) {
                    ans = max(ans, p.second + dup);
                }
            }
            return ans;
        }
    };
    

  • 1
    S

    @mifowa Not really. The lines have a common point, namely points[i]. Therefore, if the slopes are the same, they are in the same lines.


  • 6

    @forsaken Excellent. I thought 64-bit doubles might suffice for 32-bit ints, but hadn't thought it through, and now I see why it fails.

    You're comparing x/(x+1) with (x-1)/x. The difference is:

    x/(x+1) - (x-1)/x
    = (x2 - (x-1)(x+1)) / ((x+1)x)
    ≈ 1 / x2.

    And both values are close to 1. Since the fraction part of doubles has 52 bits and the above difference involves the square of x, we should expect a collision when x has more than 26 bits or so. And in fact that's what happens. The smallest positive integer x where it fails is 67114657, which is a pretty small 27-bit number (it's 100000000000001011010100001 in binary).

    Maybe long double is safe :-)


  • 1
    J

    I use double as the hash key at first, but it's wrong in some cases. I guess they add some new test cases so that double won't work anymore.
    To replace, I write a slope struct to represent the slope of the line through two points, make a hasher slope_hashfor the slope struct, then replace unordered_map<double, int> with unordered_map<slope, int, slope_hash>, it works well.


  • 0

    @Jacobean said in 20-line C++ O(n^2) Hashing Solution:

    I use double as the hash key at first, but it's wrong in some cases. I guess they add some new test cases so that double won't work anymore.
    To replace, I write a slope struct to represent the slope of the line through two points, make a hasher slope_hashfor the slope struct, then replace unordered_map<double, int> with unordered_map<slope, int, slope_hash>, it works well.

    Yes, that's basically the idea. Using double as key is not stable due to accuracy (which has been discussed before). The tricky part of this problem is how to come up a good hasher to distinguish different slopes.

    I have a solution below to define a customized slope order in std::map without extra struct if you are interested:
    6-line C++ concise solution, NO double hash key or GCD for slope.


  • 3
    R

    Your solution fails with: [[0,0],[94911151,94911150],[94911152,94911151]]


  • 0
    L

    fail with test case

    Input:
    [[0,0],[94911151,94911150],[94911152,94911151]]
    Output:
    3
    Expected:
    2


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.