Two fast ways to do this in Python, but one is noticeably faster than the other


  • 1
    M

    I created two solutions to this, first the dictionary form (add the numbers as keys to a dictionary, checking first to see if the number is already present. If it is, bail out and return accordingly), and then the second where I compare len(set(list())) vs len(list()).

    I then looked in the discussions here and, seeing lots of variations on both themes I wondered which would be faster. I created a test (below) and ran it a few times (the numbers were pretty consistent). The result is that, for a large array of unique numbers, the set method is > 2x faster than the dict method.

    However, this advantage is greatly diminished in cases where the list has a large fraction of randomly scattered duplicates (or strategically located ones, as in the tests below). Then, the dict() method can surpass the set() method by virtue of the fact that the dict() method can short-circuit (while the set() must be filled before any conclusion can be drawn).

    Note: timings are virtually the same between Python 2.7 and 3.5

    class Solution(object):
        def containsDuplicate1(self, nums):
            """
            :type nums: List[int]
            :rtype: bool
            """
            d = {}
            for i in nums:
                if i in d:
                    return True
                else:
                    d[i] = 1
            return False
    
        def containsDuplicate2(self, nums):
            """
            :type nums: List[int]
            :rtype: bool
            """
            s = set(nums)
            if len(s) < len(nums):
                return True
            else:
                return False
            
    
    def testit(list_in):
        s = Solution()
    
        import time
    
        t1 = time.time()
        soln1 = s.containsDuplicate1(list_in)
        t1delta = time.time()-t1
    
        t2 = time.time()
        soln2 = s.containsDuplicate2(list_in)
        t2delta = time.time()-t2
    
        print("#1 dict()", soln1, t1delta)
        print("#2 set() ", soln2, t2delta)
    
    if __name__ == '__main__':
        a = list(range(16*1024*1024))
        testit(a)
        a.append(1)
        testit(a)
        a[16*1024*1024//2]=1
        testit(a)
        a[0]=1
        testit(a)
    
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.