Java O(nlogk) using TreeMap to keep last occurrence Interview "follow-up" question!


  • 23

    Solving the problem with O(n) time is not enough, some interviewer may require this solution as a followup. Instead of recording each char's count, we keep track of char's last occurrence. If you consider k as constant, it is also a O(n) algorithm.

    inWindow keeps track of each char in window and its last occurrence position

    lastOccurrence is used to find the char in window with left most last occurrence. A better idea is to use a PriorityQueue, as it takes O(1) to getMin, However Java's PQ does not support O(logn) update a internal node, it takes O(n). TreeMap takes O(logn) to do both getMin and update.
    Every time when the window is full of k distinct chars, we lookup TreeMap to find the one with leftmost last occurrence and set left bound j to be 1 + first to exclude the char to allow new char coming into window.

       public class Solution {
            public int lengthOfLongestSubstringKDistinct(String str, int k) {
                if (str == null || str.isEmpty() || k == 0) {
                    return 0;
                }
                TreeMap<Integer, Character> lastOccurrence = new TreeMap<>();
                Map<Character, Integer> inWindow = new HashMap<>();
                int j = 0;
                int max = 1;
                for (int i = 0; i < str.length(); i++) {
                    char in = str.charAt(i);
                    while (inWindow.size() == k && !inWindow.containsKey(in)) {
                        int first = lastOccurrence.firstKey();
                        char out = lastOccurrence.get(first);
                        inWindow.remove(out);
                        lastOccurrence.remove(first);
                        j = first + 1;
                    }
                    //update or add in's position in both maps
                    if (inWindow.containsKey(in)) {
                        lastOccurrence.remove(inWindow.get(in));
                    }
                    inWindow.put(in, i);
                    lastOccurrence.put(i, in);
                    max = Math.max(max, i - j + 1);
                }
                return max;
            }
        }

  • 18
    Y

    To clarify the possible follow up. The interviewer may say that the string is given as a steam. In this situation, we can't maintain a "left pointer" as the classical O(n) hashmap solution.


  • 1

    beautiful! I just see someone met this followup question during onsite interview.


  • 2
    Z

    I don't understand why we need the while, we can use if instead, and it passes the test cases.


  • 1
    Z

    and isn't the lastOccurrence maintain the right most occurrence?


  • 1
    1

    @ZheYu The lastOccurrence maintains the rightmost index of that element and lastOccurrence + 1 is the leftmost of the slide window.


  • 0
    H

    Thanks for you nice solution!
    Some minor changes of the code structure to make it easier to understand, passed all the test cases:

    public int lengthOfLongestSubstringKDistinct(String str, int k) {
            if (str == null || str.isEmpty() || k == 0) return 0;
            TreeMap<Integer, Character> lastOccurrence = new TreeMap<>();
            Map<Character, Integer> inWindow = new HashMap<>();
            int j = 0, max = 1;
            for (int i = 0; i < str.length(); i++) {
                char in = str.charAt(i);
                //update or add in's position in both maps
                if (inWindow.containsKey(in)) {
                    lastOccurrence.remove(inWindow.get(in));
                }
                inWindow.put(in, i);
                lastOccurrence.put(i, in);
                // make sure the size satisfies the requirement
                if (inWindow.size() > k) { 
                    int first = lastOccurrence.firstKey();
                    char out = lastOccurrence.get(first);
                    inWindow.remove(out);
                    lastOccurrence.remove(first);
                    j = first + 1;
                }
                max = Math.max(max, i - j + 1);
            }
            return max;
        }
    

  • 0
    R
    This post is deleted!

  • 0
    X

    @yus067 Why we cannot keep a left pointer if the input is a stream instead of a string? Thanks


  • 0
    R

    Thanks for your post.
    I think LRU cache can do this in O(n) time. Below is my implementation.

    public class Solution {
    Map<Character, Node> map;
    Node head; Node tail;
    public int lengthOfLongestSubstringKDistinct(String s, int k) {
        map = new HashMap<>();
        int longest = 0;
        int start = 0;
        for (int i = 0; i < s.length(); i++) {
            char curr = s.charAt(i);
            if (map.containsKey(curr)) { // update node and its pos
                Node node = map.get(curr);
                node.pos = i;
                remove(node);
                append(node);
            } else {
                Node node = new Node(curr, i);
                append(node);
            }
            if (map.size() > k) {
                start = head.pos + 1;
                remove(head);
            }
            longest = Math.max(i - start + 1, longest);
        }
        return longest;
    }
    private Node remove(Node node) {
        map.remove(node.c);
        if (node.next != null) {
            node.next.prev = node.prev;
        }
        if (node.prev != null) {
            node.prev.next = node.next;
        }
        if (node == head) {
            head = node.next;
        }
        if (node == tail) {
            tail = node.prev;
        }
        node.prev = null; node.next = null;
        return node;
    }
    private Node append(Node node) {
        map.put(node.c, node);
        if (tail == null) {
            head = tail = node;
        } else {
            tail.next = node;
            node.prev = tail;
            tail = node;
        }
        return node;
    }
    }
    class Node {
    Node next;
    Node prev;
    int pos;
    char c;
    Node(char c, int pos) {
        this.c = c;
        this.pos = pos;
    }
    

    }


  • 0
    G

    @yus067 So you mean this solution can save space, compared to storing the char sequence in a deque?


  • 2
    M

    This is a nice solution and probably deserves more up votes.

    I would say using a linkedHashMap data structure makes the performance O(n).

    public int lengthOfLongestSubstringKDistinct(String s, int k) {
            int lo = 0, hi = 0;
            int n = s.length();
            int max = 0;
            if (k == 0) return 0;
            LinkedHashMap<Character, Integer> map = new LinkedHashMap<>();
            
            while (hi < n) {
                char ch = s.charAt(hi);
                if (map.containsKey(ch) || map.size() < k) {
                    map.remove(ch);
                    map.put(ch, hi);
                    max = Math.max(max, hi++ - lo + 1);
                }
                else {
                    Character key = map.keySet().iterator().next();
                    lo = map.get(key);
                    map.remove(key);
                    lo++;
                }
            }
            return max;
        }
    
    

  • 0
    L

    Which one is better? LinkedHashMap vs. TreeMap+HashMap


  • 0
    M

    @lionellijian
    I would say it is O(n) performance for linkedHashMap


  • 0

    Why the complexity is O(n)? I think it;s O(nlgn)..


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.