Frugal Python code


  • 17
    import random
    
    class RandomizedCollection(object):
    
        def __init__(self):
            self.vals, self.idxs = [], collections.defaultdict(set)
            
    
        def insert(self, val):
            self.vals.append(val)
            self.idxs[val].add(len(self.vals) - 1)
            return len(self.idxs[val]) == 1
            
    
        def remove(self, val):
            if self.idxs[val]:
                out, ins = self.idxs[val].pop(), self.vals[-1]
                self.vals[out] = ins
                if self.idxs[ins]:
                    self.idxs[ins].add(out)
                    self.idxs[ins].discard(len(self.vals) - 1)
                self.vals.pop()
                return True
            return False 
    
        def getRandom(self):
            return random.choice(self.vals)

  • 0

    It's not O(1)...

    It is now, it wasn't because there was a max(self.idxs[val]).


  • 0

    @StefanPochmann did you read the latest one or the one with max()?


  • 2

    @agave "Both", as the max one was the latest one when I looked :-). Time looks good now, I already changed my comment.


  • 1

    Ok, took me a bit, but now I was able to make it crash:

    rc = RandomizedCollection()
    rc.insert(1)
    rc.insert(0)
    rc.insert(1)
    rc.insert(1)
    rc.insert(1)
    rc.insert(1)
    rc.insert(1)
    rc.insert(1)
    rc.insert(0)
    rc.remove(0)
    rc.remove(0)
    

    For this, I get IndexError: list assignment index out of range for the self.vals[out] = ins.


  • 1

    @StefanPochmann yeah I just saw the problem myself, I am working to fix it... Thanks anyway for pointing out.

    EDIT: found it, I need to first add() to the set before discard() in remove(). Tried with your input and it works.

                if self.idxs[ins]:
                    self.idxs[ins].add(out)
                    self.idxs[ins].discard(len(self.vals) - 1)
    

  • 1
    K

    Great code! A little change:

    # I use arr as your vals, d as your idxs
    def remove(self, val):
            if self.d[val]:
                p = self.d[val].pop()
                last = self.arr[-1]
                self.arr[p] = last
                self.d[last].add(p)
                self.d[last].remove(len(self.arr)-1)
                self.arr.pop()
                return True
            return False
    

  • 0
    L

    Can you explain the return argument of this solution's insert() method?
    If my newly-inserted val is already existed in vals, then as the duplicates are allowed this time, the insert() operation will still succeed, but if you use
    '''
    return len(self.d[val])==1
    '''
    to check it will return False.

    Thanks


  • 0
    R

    clean and clear!


  • 3
    M

    said in Frugal Python code:

    if self.idxs[ins]:

    This check seems unnecessary. The test case will pass without it.

    Simple and concise code, good job @agave!


  • 0

    I very much benefited from reading your code !


  • 1
    P

    Amazing idea!
    I was initially just keeping a dictionary and a list and using list.remove() method.

    Just wanted to clarify the logic for other readers:

    1. The main idea of an efficient speedup is when removing an element. Here you find the index of the element to be deleted as the last item in its dictionary entry (which is a list).
    2. You then swap it with the actual last element in the numbers list you are maintaining. (Some corner cases need to be handled here).

    e.g. 1,2,3,1,5
    If you want to delete 1, swap it with 5 at the end of the list and shrink the list!


  • 0
    L

    @leiyama001 check the example:

    // Inserts 1 to the collection. Returns true as the collection did not contain 1.
    collection.insert(1);
    
    // Inserts another 1 to the collection. Returns false as the collection contained 1. Collection now contains [1,1].
    collection.insert(1);
    

  • 0

    @StefanPochmann @agave Is pop() on set O(1)?


  • 2

    @jedihy I'm not sure. The TimeComplexity - Python Wiki doesn't talk about it. The second result of googling python complexity is Complexity of Python Operations which says it's O(1), but then again it also gives s.pop(i) as the example and that's invalid. Not sure much that page can be trusted.

    A while ago I tested something related (also my other post there), here's a version with just set.pop:

    from timeit import timeit
    print timeit('s.pop()', number=1,
                 setup='s = set(range(10))')
    print timeit('s.pop()', number=1,
                 setup='s = set(range(2*10**7))')
    print timeit('s.pop()', number=1,
                 setup='''
    s = set(range(2*10**7))
    for i in range(2*10**7 - 10):
        s.remove(i)''')
    

    The output shows that it's about equally fast to pop from the small set and from the large set but very slow from the set that started large but ended up small:

    3.30516332615e-06
    3.00469393288e-06
    0.0127308881935
    

    So looks like it doesn't take constant time independent of what the set is. And if I remove the two 2* from the test code, then it takes about half the time. So looks like it's somewhat linear there.

    Anyway, this problem is about average O(1), and set.pop and this solution might still achieve that.


  • 0

    Thanks! That's really helpful. Do you have a solution other than this one? I mean without using pop() on dict.


  • 0

    @StefanPochmann I wrote a more O(1)-like solution without using pop() on dict.

    import collections
    class RandomizedCollection(object):
        def __init__(self):
            """
            Initialize your data structure here.
            """
            self.dd = collections.defaultdict(dict)
            self.d = collections.defaultdict(list)
            self.a = []
    
        def insert(self, val):
            """
            Inserts a value to the collection. Returns true if the collection did not already contain the specified element.
            :type val: int
            :rtype: bool
            """
            self.d[val].append(len(self.a))
            self.dd[val][len(self.a)] = len(self.d[val]) - 1
            self.a.append(val)
            return len(self.d[val]) == 1
    
        def remove(self, val):
            """
            Removes a value from the collection. Returns true if the collection contained the specified element.
            :type val: int
            :rtype: bool
            """
            dd = self.dd
            a = self.a
            d = self.d
            if not d[val]:
                return False
            idx = d[val][-1]
            a[idx] = a[-1]
            idxInDForLast = dd[a[-1]][len(a) - 1]
            d[a[-1]][idxInDForLast] = idx
            dd[a[-1]][idx] = idxInDForLast
    
            del dd[a[-1]][len(a) - 1]
            d[val].pop()
            a.pop()
            return True
    
        def getRandom(self):
            """
            Get a random element from the collection.
            :rtype: int
            """
            return self.a[random.randrange(0, len(self.a))]
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.