anybody came up with the O(n) time and O(k) space solution?


  • 1
    J

    I spend several hours on this follow up , I trie to apply heap, bucket sort, and even similar partition method like what we did on problem "top k elements", but all failed because I can not think about how we can do that with O(k) space if a hashmap used. And as long as we use something like heap the time cost will be O(nlogk).
    Then I saw "Trie" in the tag, and looked for some materials, finally I found one solution online that solve this problem with trie:
    http://www.geeksforgeeks.org/find-the-k-most-frequent-words-from-a-file/
    This is a heap+trie solution. But I think it's still not good enough for O(n) time O(k) space.
    In this solution, the heap's size is restricted to O(k) but the trie could go very large if we have long strings. Also, though we solve it in one pass but we still need O(logk) time on each turn.
    So any genius have great solutions?


  • 2
    D

    I'm thinking about the same thing but I think there are few ideas that I have might help:

    1. If an element appears more than n/k times, then it must belong to the result.
    2. If we take one word at a time, each time we probably need O(1) operation to update things since if you are able to move it from one place to the next place in O(1) time.

    Other than these I'm also stuck. If it is possible, there must be some other structure to the problem we have not discovered (Or there is no such possibility).


  • 2

    It's impossible. If k = n then the problem is equal to sorting which is omega(n log n) time complexity.

    So let's relax the problem: say we don't care about the order of the answer, and there is a unique answer. That means there is some smallest count x for which we want exactly all words that occur x times or more.

    But this is still not possible. Suppose k = 1. If space is O(k), there exists some constant M for which we never use more than M space. That means after looking at the first M + 1 elements, we must keep knowledge of at most M of them. Now consider the M+1 arrays [x_0, x_1, ...., x_M, x_i] over i = 0...M. We can't distinguish the correct answer.


  • 0
    J

    @awice Thank you, nice proof by contradiction.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.