Top K Frequent Words


  • 0

    Click here to see the full article post


  • 0
    T

    Add N words to the heap should take O(log N) time for each. So the complexity of approach 2 should also be O(log N)


  • 0

    @tuintuin The maximum size of the heap is k, so each push to the heap has complexity O(log k). Hence O(N log k). In Python, we can heapify in linear time as discussed here.


  • 0
    H

    The 2nd Python solution looks like O(Nlog(N)). You should use similar techniques such as heapreplace.


  • 0
    R

    Why isn't the sorting O(k log k)? There are only k entries in the HashMap, so I think the sorting solution has time complexity of O(N + k log k), which is not necessarily slower than the heap solution of O(N log k).


  • 0

    @ryanleitaiwan Each push/pop has complexity O(log k) because there at most k items. However, N pushes are made.


  • 0
    R

    @awice I understand the Heap Solution (N pushes, k pops), but I'm wondering about the Sorting Solution.
    Actually, I misunderstood the notations. 1 ≤ k ≤ number of unique elements (call this U) ≤ N, then precisely speaking, the Sorting Solution has a time complexity of O(N + U log U). However, in average scenarios, we can expect 1 < k << U < N, so Heap should be faster.


  • 0
    W

    I don't see how this ensures that words with same frequency are ranked in alphabetical order


  • 0

    @WhiterMeerkat Assuming you are talking about the heap:
    In Java, the comparator is
    (w1, w2) -> count.get(w1) != count.get(w2) ?
    count.get(w1) - count.get(w2) :w2.compareTo(w1)
    In Python, it is sorting by
    (-freq,word)

    The bolded areas indicate when it is sorting by word in the case of ties.


  • 0
    W

    @awice for the Python solution, isn't this
    heap = [(-freq, word) for word, freq in count.items()]
    adding word with priority -freq into the heap?
    I'm not sure how this is different from
    h = [] for (freq, word) in count.items(): heappush(h, (-freq, word))
    which does not work.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.