Uber find the largest kth element in a data stream


  • 0

    Suppose you have a system, the input is an dynamic integer data stream. Suppose the data stream contains duplicate value. Find the kth largest elements in the data stream and return the index. If the duplicate values exist in multiple index, return them all.
    Eg: The data stream is 4,2,2,3,1, k is 4, the answer is 2. The index is 1,2.
    The data steam is 1,2,3,4,5, k is 3, the answer is 3 and the index is 2.


  • 0
    This post is deleted!

  • 0

    @hilda8519 Could you please add at least one example to your problem description?

    For example, if the data stream is 4,2,2,3,1 and k=4, is the answer 2 or 1?


  • 0

    In example 4,2,2,3,1, k = 4 and the fourth largest number is 2 , we have 4 ,3,2,2 and we must return 1,2 because the index of two 2 is 1 and 2 in the stream
    I suggest to use max heap for first largest k elements and to keep indexes of the element on the top of the heap in order to return the result


  • 0
    Z

    I think hashTable with quick select algorithm can solve the problem. key is the number, value is a List<Integer> which contains the indexes of the corresponding number. Then we can solve it in linear time(best case)


  • 1
    G

    I think we have a very similar one in Leetcode question list.
    Still remember some genius' code was to store pair<int value, int idx> in a heap, then keep drawing them out.


  • 6
    J

    @hilda8519 We can use MinHeap + hashatable, minheap to maintain k largest element, use hashtable to store the count. when element is removed from heap, it's also removed from the hashtable. Space is O(K).


  • 0
    G

    @hilda8519 says it's a "dynamic integer data stream", does it mean that we can get only one number each time, not the whole list all together?


  • 1

    @GoGoDong Yes. Usually it means that the data stream is very large and as a result you may not store all data in memory.


  • 0
    D

    I think the min heap + hash table approach should be fine. One catch: We should remove the hashtable key as soon as the element is removed from heap, because that element can never come back into the heap.

    space complexity: O(K)


  • 0
    J

    Using Min heap with each node record it position in stream . Don't understand why Use Hash with Heap . Use Heap with record arrival index into heap as information should be sufficient


  • 0

    @ZhackerZ No, that won't work with stream.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.