C++ using sorted vector and std::lower_bound


  • 0
    M

    As an alternative to std::map, I often find it's nicer to use a sorted vector and run a binary search (aka 'lower_bound').

    I'm starting to have doubts about the timing accuracy of leetcode across languages. This runs in ~116 ms and 'beats 86.9% ...'

            vector<int> findRightInterval(vector<Interval>& intervals)
            {
                auto N = static_cast<int>(intervals.size());
    
                struct Item {
                    int value;
                    int index;
                };
    
                vector<Item> starts(N + 1);
                for (int i = 0; i < N; ++i)
                {
                    starts[i] = { intervals[i].start, i };
                }
    
                // 'not found' value
                starts.back().index = -1;
    
                auto cmpval = [](const Item& lhs, const Item& rhs)
                { return lhs.value < rhs.value; };
    
                sort(begin(starts), end(starts) - 1, cmpval);
    
                vector<int> res(N);
                for (int i = 0; i < N; ++i)
                {
                    Item item = { intervals[i].end, i };
    
                    auto iter = lower_bound(
                        begin(starts), end(starts) - 1, 
                        item, cmpval);
    
                    res[i] = iter->index;
                }
    
                return res;
            }
    

  • 0

    So this is similar to use map::lower_bound if you define a map: start -> index, right? Did you test if there is any performance differences if using built-in map::lower_bound, self-made vector<struct> and vector<pair<int,int>>? I am just curious.


  • 1
    M

    @zzg_zzm
    Yes, this can done with a map. I've looked at performance in other contexts, but not for this problem. I've found that there is often -- but not always -- little appreciable performance difference between a map and doing the sort+search manually on a vector. However, occasionally there is a noticeable difference in favor of a vector and it can be large, like x2-x10, especially when doing multiple searches. Presumably this is because of inherent cache friendliness of the vector.

    I've followed some of the recent talks at things like cppcon and there has been a lot of map bashing the last couple years. I think this is more relevant for performance-critical code, whereas much everyday code is fine using a map (or an unordered_map).

    I've personally found vectors easier to test and debug with, at least for smallish collections (e.g. < 1024 items).


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.