Concise O(1) Java solution based on HashMap


  • 7

    The basic idea is to use a prefix tree (trie) to implement this class, storing the children of each node in a HashMap.

    Time complexity (insert, search) = O(keyLength)

    For each char in a word, we only need O(1) time to look it up in the children map, so the total time = O(1 * keyLength) = O(keyLength).

    Extra Space = O( numOflevel[0] * numOflevel[1] .. numOflevel[longestWord] )

    numOflevel[i] is the number of chars on level i of the trie built from the words in the dictionary.

    If we use an array to store children, it will cost O( 26 ^ (len(longestWord) - 1) ) extra space:

    For each char in a word, we have to store 26 children, so the total extra space should be O( 26 ^ (length of the longest word in the dictionary - 1) ).

    To sum up, a HashMap only allocates the extra space that is actually needed, so it costs far less space than an array, and it can also support any ASCII char, e.g. 'A' - 'Z'.

    import java.util.HashMap;
    import java.util.Map;

    public class Trie {
        private TrieNode root;

        public Trie() {
            root = new TrieNode();
        }

        // Walk down the trie one char at a time, creating child nodes lazily.
        public void insert(String word) {
            TrieNode node = root;
            for (int i = 0; i < word.length(); i++) {
                char ch = word.charAt(i);
                if (node.children == null)
                    node.children = new HashMap<Character, TrieNode>();
                if (!node.children.containsKey(ch))
                    node.children.put(ch, new TrieNode(ch));
                node = node.children.get(ch);
            }
            node.isLeaf = true;    // mark the end of a complete word
        }

        // Return true only if the exact word was previously inserted.
        public boolean search(String word) {
            TrieNode node = root;
            for (int i = 0; i < word.length(); i++) {
                char ch = word.charAt(i);
                if (node.children == null || !node.children.containsKey(ch))
                    return false;
                node = node.children.get(ch);
            }
            return node.isLeaf;
        }

        // Return true if some previously inserted word starts with the given prefix.
        public boolean startsWith(String prefix) {
            TrieNode node = root;
            for (int i = 0; i < prefix.length(); i++) {
                char ch = prefix.charAt(i);
                if (node.children == null || !node.children.containsKey(ch))
                    return false;
                node = node.children.get(ch);
            }
            return true;
        }
    }

    class TrieNode {
        public Map<Character, TrieNode> children = null;  // allocated lazily on first insert
        public boolean isLeaf = false;                     // true if a word ends at this node
        public char val;

        public TrieNode() {}

        public TrieNode(char val) {
            this.val = val;
        }
    }
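
    A quick usage sketch (my own illustration; the words are arbitrary examples, not from the problem statement):

    public class TrieDemo {
        public static void main(String[] args) {
            Trie trie = new Trie();
            trie.insert("apple");

            System.out.println(trie.search("apple"));    // true  - exact word was inserted
            System.out.println(trie.search("app"));      // false - "app" is only a prefix so far
            System.out.println(trie.startsWith("app"));  // true  - "apple" starts with "app"

            trie.insert("app");
            System.out.println(trie.search("app"));      // true  - "app" is now a full word
        }
    }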

  • 0

    Can you explain more about the space usage? Thanks.
    Extra Space = O( numOflevel[0] * numOflevel[1] .. numOflevel[longestWord] )

    On that page,
    http://www.geeksforgeeks.org/trie-insert-and-search/
    insert and search cost O(key_length), but the memory requirement of a trie is O(ALPHABET_SIZE * key_length * N), where N is the number of keys in the trie.

    That space bound is different from yours.


  • 0

    I think your space analysis is wrong as well. Consider the following argument:

    We can think of N insert operations to obtain the worst-case scenario. Consider that we add the set of strings S = {s1, s2, ..., sN}. For every s_i, you go through every character and follow a "branch" of the tree. If at some point a particular branch doesn't exist, you create a new node in the corresponding children structure. As a result, it is easy to see that you will create at most H = |s1| + |s2| + ... + |sN| nodes, where |s| is the number of characters in string s.

    Now let's look at the two implementations. It comes down to how much space a node occupies. Note that the implementation using an array of 26 entries to represent a node's children uses approximately O(26) space (I'm not going to go into the exact byte-usage analysis). With the map implementation for the children entries, you'll probably occupy less space on average; however, the upper bound is still O(26) (all children exist). The total space usage is thus O(26 * H) in both cases (it's an upper bound for both). H can be estimated by averageWordLength * N.

    In conclusion, the overall complexity is O(SIGMA * MAXKEYLENGTH * N). In practice, I guess O(SIGMA * AVERAGEKEYLENGTH * N) is a pretty good approximation as well. In any case, the important thing to see is that the space usage is polynomial, not exponential.
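
    As a concrete (hypothetical) illustration of this bound: inserting N = 1,000 words of average length 8 creates at most H ≈ 8,000 nodes, so the space is bounded by roughly 26 * 8,000 = 208,000 child slots, which grows linearly with the total input size rather than exponentially with the length of the longest word.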


  • 0

    Note that since you create the HashMap with the default constructor, it allocates a table with a default capacity of 16 once you put in the first value, and then doubles it to 32 if the number of entries exceeds the capacity times the default load factor of 0.75, that is, 12.
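
    If that per-node footprint matters, one possible tweak (my own suggestion, not part of the original post) is to pass a smaller initial capacity so the backing table starts below the default 16; HashMap rounds it up to a power of two and still resizes as entries are added, so behavior is unchanged:

    // Hypothetical variant of the lazy allocation inside insert():
    if (node.children == null)
        node.children = new HashMap<Character, TrieNode>(4);  // small initial table instead of 16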


  • 0

    stachenov, thanks! It's really helpful!


  • 0

    Thanks for the comment, cosmin79! O( 26 ^ (len(longestWord) - 1) ) is the exact maximum space we need as N approaches infinity.


  • 0

    HashMap is a very good way to handle general cases, that is, when the words are not limited to a small fixed alphabet. However, for this problem, I don't think HashMap outperforms an array. First, as commented by @stachenov, a HashMap has a capacity of 16 by default, which is of the same order as 26. Second, HashMap slows things down considerably (judging by OJ accepted submissions, it runs at around 1/3 of the speed of the array version on average). Third, since it is highlighted (by the interviewer) that all words consist of lowercase letters, using a HashMap kind of ignores this highlighted detail.
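
    For comparison, a minimal sketch of the array-based node this comment refers to (my own illustration, assuming only lowercase 'a'-'z'):

    class ArrayTrieNode {
        // One slot per lowercase letter; index a child by (ch - 'a').
        public ArrayTrieNode[] children = new ArrayTrieNode[26];
        public boolean isLeaf = false;
    }

    // Inside insert(), the map lookup becomes plain array indexing:
    //     int idx = ch - 'a';
    //     if (node.children[idx] == null)
    //         node.children[idx] = new ArrayTrieNode();
    //     node = node.children[idx];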

