Interview Question


  • 0
    M

    I encountered an interview question today but couldn't figure out the correct solution, here's the question:

    Given an integer W that represents number of words,
    Write functions insert(word) and getMostPopularWord() such that getMostPopularWord() will always return the most popular word in the last W number of words.

    The goal is to make insert(word) as efficient as possible.

    Appreciate the help!


  • 0
    L

    Can explain more about insert(word)? Is by inserting the same word over and over increase the popularity of the word? Is the word inserted with a popularity number? What happens if you insert the same word twice?


  • 0
    M

    insert(word) is supposed to insert a word into some array of length W so that we can track the most popular word in the last W words inserted. While insert(word) is called, at any time getMostPopularWord() should return the most frequent word in the last W words. Here's an example:

    let W = 3
    insert("A")
    insert("A")
    getMostPopularWord() => "A"
    insert("B")
    getMostPopularWord() => "A"
    insert("B")
    getMostPopularWord() => "B" // since the first inserted "A" is out of the scope of the last 3 words


  • 0
    L

    @mutaphore
    Say W = 2, is this correct?
    insert("A")
    insert("A")
    getMostPopularWord() => "A"
    insert("B")
    insert("B")
    getMostPopularWord() => "B"
    Now this is where this gets confusing. If I insert a new word C, since W=2, A and B are twice as popular as C and it will never save C???
    insert("C") -> C will never be saved into the array??
    insert("C")
    getMostPopularWord() => "B"????

    Isn't the question suppose to be get the last frequently used word? The word 'popular' can be misunderstood for something else. In-order to do this, you must have a queue or heap somewhere. Are you allowed to use it, it seems like the array W is unnecessary. What does the array do in relationship to the question? In terms of the user, the user doesn't care about this array and this W. The user just wants to call those two functions.

    EDIT: Nevermind, I think I understand. W represent 'memory' of the last inserted words.


  • 0
    L

    @mutaphore
    N is number of words
    getMostPopularWord will run at O(1)
    insert(word) will run at O(Nlog(N))
    space complexity will be ~N

    from collections import deque
    
    class solution(object):
        def __init__(self, n_words_to_hold):
            self.n_words_to_hold = n_words_to_hold
            self.word_queue = deque()
            self.word_to_sorted_hash = dict()
            self.sorted_popularity = list()
        
        def insert(self, word):
            if len(self.word_queue) == self.n_words_to_hold:
                oldest_word = self.word_queue.pop()
                self.decrement_word(oldest_word)
            self.word_queue.appendleft(word)
            self.increment_word(word)
            self.sorted_popularity.sort(custom_compare)
            print 'sorted {}: {}'.format(word, self.sorted_popularity)
            print self.word_queue
            
        def getMostPopularWord(self):
            if len(self.sorted_popularity) != 0:
                return self.sorted_popularity[0][1]
            return ''
        
        def increment_word(self, new_word):
            popularity = 1
            if new_word in self.word_to_sorted_hash:
                node = self.word_to_sorted_hash[new_word]
                popularity += node[0]
                self.sorted_popularity.remove(node)
            new_node = [popularity, new_word]
            self.word_to_sorted_hash[new_word] = new_node
            # Trick to insert at beginning so that the sort will place priority for elements in the beginning
            # There might be a better way than this
            self.sorted_popularity.insert(0,new_node)
            
        def decrement_word(self, old_word):
            old_node = self.word_to_sorted_hash[old_word]
            old_node[0] -= 1
        
    def custom_compare(a, b):
        if b[0] >= a[0]:
            return 1
        return -1
    
    myObj = solution(2)
    myObj.insert('A')
    myObj.insert('B')
    myObj.insert('A')
    print myObj.getMostPopularWord()
    

    I think the sort can be improved on. We can possibility get better than N(log(N)) for insert(word). I was thinking bucket sort, but other solutions will lose the popularity priority after sorting. The thing is, we don't need to sort everything, we just want the group of words that has the most, then apply the most frequent rule on them. I don't believe this is the most optimized solution yet. I believe you can get insert(word) to run at O(n).

    I'll come back here when I think of something. Sleepy time...


  • 0
    L

    Couldn't sleep ;-P

    Thought about it some more and here is what I got.
    Insert(word) is now O(1).
    GetMostPopularWord() is now O(n).
    Space complexity is the same with O(n)

    GetMostPopularWord will be doing one pass to find the most popular number. Remember we can have more than one popular word. Then the second pass is to find the words that match that popular number and compare them to the counter, and select the more recently added word out of that group. Effectively O(n) two pass approach.

    Still using a queue to mainly keep track of the last word I should decrement with.
    I'm using a hash to keep track of my word to popularity count.
    I've added a new hash table to keep track of when each word was added. I was considering to just search thru the queue until I hit the most frequent used word out of my most popular list of words but if the word list is large, and your most popular words are at the end of queue, it may not be as good compared to my approach.

    EDIT: Updated getMostPopularWord to do O(n) with just one pass. Cheers!

    from collections import deque
    import sys
    
    class popular_words(object):
        def __init__(self, n_words_to_hold):
            self.n_words_to_hold = n_words_to_hold
            self.word_queue = deque()
            self.word_counter = dict()
            self.word_to_popularity = dict()
            self.counter = -sys.maxint - 1
            
        def insert(self, new_word):
            if self.n_words_to_hold == 0:
                return
            if new_word in self.word_to_popularity:
                self.word_to_popularity[new_word] += 1
            else:
                self.word_to_popularity[new_word] = 1
            if len(self.word_queue) == self.n_words_to_hold:
                oldest_word = self.word_queue.pop()
                self.word_to_popularity[oldest_word] -= 1
            self.word_queue.appendleft(new_word)
            self.counter += 1
            self.word_counter[new_word] = self.counter
            
        def getMostPopularWord(self):
            most_popular_num = 0
            most_popular_word = ''
            lastest_counter = -sys.maxint - 1
            for word, popularity in self.word_to_popularity.items():
                if popularity > most_popular_num:
                    most_popular_num = popularity
                    lastest_counter = self.word_counter[word]
                    most_popular_word = word
                elif popularity == most_popular_num:
                    if self.word_counter[word] >= lastest_counter:
                        lastest_counter = self.word_counter[word]
                        most_popular_word = word
            return most_popular_word
        
    test = popular_words(4)
    test.insert('A')
    test.insert('B')
    print test.getMostPopularWord()
    test.insert('A')
    print test.getMostPopularWord()
    test.insert('C')
    print test.getMostPopularWord()
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.