Find the top 100 hot queries/searches on Amazon?


  • 1

    "People are interested to see what are the top searches in Amazon.com, how do you design a program to return the top 100 frequent queries/searches on Amazon.com?"


  • 0
    K

    Use std::map<std::string, int>;
    e.g.
    std::map<std::string, int> search_history;
    search_history["Men's shoes"]++; // value of search_history["Men's shoes"] is the number of occurrences.

    Loop all the elements in the map and get the 100 most higher number of occurrences (sorting from descending order).


  • 0

    @kuangy

    What's your time-complexity then?
    What if the hits for each query are constantly changing?
    What's your memory-cost?


  • 0
    S

    Using a priority queue of size 100 with priority as the hits of the page we can get the top 100 hot queries .


  • 0
    S
    This post is deleted!

  • 0

    @saneGuy Excellent! But you also want to be more specific: it's a min-heap or max-heap?


  • 0
    S

    it should be a min heap


  • 4
    S

    @fishercoder just like using a min heap in the kth largest number in an array. We will create a min heap with 100 capacity. Scan through the first 100 queries and add them to the min heap. After that from 101th query add check if the priority of the element at the top is less then the priority of the 101th element if it is remove the top and add the 101 element to the heap. Keep on doing that till the end of the queries list. return top of the heap.


  • 0

    @saneGuy You nailed it! Thumbs-up!


  • 0
    S

    @fishercoder cool ..happy coding


  • 0
    J

    @saneGuy Why should be min heap? it sounds like a max heap. You sort based on the hit number. And keep the one with most hits as root.


  • 0
    S

    @judezhu with min heap as i said we will remove the top element which is a minimum so at the end the heap will have the required output. If max heap is used we have to remove the max of the heap which doesn't work. Does that make sense?


  • 1

    Below is the implementation using HashMap and PriorityQueue

    package Amazon;
    
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.PriorityQueue;
    
    public class TopHotQueries {
    	class Query {
    		String query;
    		int count;
    
    		public Query(String query, int count) {
    			this.query = query;
    			this.count = count;
    		}
    	}
    
    	private Map<String, Query> map;
    	private PriorityQueue<Query> pq;
    	private static final int LIMIT = 3;
    
    	public TopHotQueries() {
    		map = new HashMap<>();
    		pq = new PriorityQueue<>(LIMIT, new Comparator<Query>() {
    			@Override
    			public int compare(Query q1, Query q2) {
    				return q2.count - q1.count;
    			}
    		});
    	}
    
    	private Query insertToMao(String queryStr) {
    		Query query;
    
    		if (map.containsKey(queryStr)) {
    			query = map.get(queryStr);
    		} else {
    			query = new Query(queryStr, 0);
    		}
    
    		query.count = ++query.count;
    		map.put(queryStr, query);
    
    		return query;
    	}
    
    	private List<Query> getTopK() {
    		List<Query> queryList = new ArrayList<>();
    		PriorityQueue<Query> hotPQ = pq;
    		
    		int counter = 0;
    		while(!hotPQ.isEmpty()) {
    			queryList.add(hotPQ.poll());
    			
    			++counter;
    			
    			if(counter == LIMIT) {
    				break;
    			}
    		}
    		
    		return queryList;
    	}
    
    	private void insertToPQ(Query prevQuery, Query curQuery) {
    		if (null != prevQuery) {
    			pq.remove(prevQuery);
    		}
    
    		if (pq.size() == LIMIT) {
    			if (pq.peek().count < curQuery.count) {
    				pq.poll();
    			}
    		}
    
    		pq.add(curQuery);
    	}
    
    	public void insert(String queryStr) {
    		Query prevQuery = map.get(queryStr);
    
    		Query curQuery = insertToMao(queryStr);
    		insertToPQ(prevQuery, curQuery);
    	}
    
    	public static void main(String[] args) {
    		// TODO Auto-generated method stub
    		TopHotQueries thq = new TopHotQueries();
    		thq.insert("tablet");
    		thq.insert("kindle");
    		thq.insert("mac");
    		thq.insert("earpod");
    		thq.insert("books");
    		thq.insert("tablet");
    		thq.insert("earpod");
    		thq.insert("kindle2");
    		thq.insert("tablet");
    		thq.insert("tablet1");
    		thq.insert("tablet2");
    		thq.insert("tablet");
    		thq.insert("mac");
    		thq.insert("earpod");
    		thq.insert("mac");
    
    		for(Query top : thq.getTopK()) {
    			System.out.println(top.query + " : " + top.count);
    		}
    	}
    
    }
    
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.