Given a string, find the longest substring T that contains m distinct characters. (need O(n) solution)


  • 1
    F

    String findLongestSubString(String s, int m)
    // abbccdd 3 -> bbccdd
    // abababccccd 3 -> abababcccc


  • 5

    This is a classic two-pointers (slow & fast) plus hash map problem, we can let the fast pointer j go through the string and count the characters as long as the size of the map <= m, otherwise, let the slow pointer i catch up with j till the size of the map < m.

    Following is Java solution:

    public int longestSubstring(String s, int M) {
      int i = 0, j = 0, max = 0;
      
      Map<Character, Integer> map = new HashMap<>();
      
      while (j < s.length()) {
        char cj = s.charAt(j++);
        
        if (map.containsKey(cj)) {
          // current character is in the map already
          // update its count
          map.put(cj, map.get(cj) + 1);
        } else {
          // otherwise, use the slow pointer i to
          // decrease the count of old characters
          // till we can put the new character in the map
          while (map.size() >= M) {
            char ci = s.charAt(i++);
            
            map.put(ci, map.get(ci) - 1);
            if (map.get(ci) == 0) {
              map.remove(ci);
            }
          }
          
          map.put(cj, 1);
        }
        
        // update max length every time we move j
        max = Math.max(max, j - i);
      }
      
      return max;
    }
    

    Time complexity is O(n), but since we move i one by one to decrease the count and once the count is 0 we will remove that character from the map, it can be slow if we have a super long string. To make it even faster, we can borrow LRU data structure to help us quickly know the character we need to remove from the map in O(1) time, the implementation is a bit complex.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.