Top K Frequent Words


@ryanleitaiwan Each push/pop has complexity O(log k) because there at most k items. However, N pushes are made.

@awice I understand the Heap Solution (N pushes, k pops), but I'm wondering about the Sorting Solution.
Actually, I misunderstood the notations. 1 ≤ k ≤ number of unique elements (call this U) ≤ N, then precisely speaking, the Sorting Solution has a time complexity of O(N + U log U). However, in average scenarios, we can expect 1 < k << U < N, so Heap should be faster.

@WhiterMeerkat Assuming you are talking about the heap:
In Java, the comparator is
(w1, w2) > count.get(w1) != count.get(w2) ?
count.get(w1)  count.get(w2) :
w2.compareTo(w1)
In Python, it is sorting by
(freq,
word
)
The bolded areas indicate when it is sorting by word in the case of ties.

@awice for the Python solution, isn't this
heap = [(freq, word) for word, freq in count.items()]
addingword
with priorityfreq
into the heap?
I'm not sure how this is different from
h = [] for (freq, word) in count.items(): heappush(h, (freq, word))
which does not work.