C++ two concise solutions: hash + two pointers for short string, and hash for super long string


  • 4

    (1) For a string of regular length, it could be fully loaded in memory, so we could access any char in string anytime, and we want to take advantage of that. Here's the O(n) solution:

    int lengthOfLongestSubstringKDistinct(string s, int k) {        
            int l = 0, r = 0;
            unordered_map<char, int> freqs;
            
            for (int n = 0; r < s.length(); r++) {
                if (freqs[s[r]]++ == 0) { n++; }
                if (n > k && --freqs[s[l++]] == 0) { n--; }
            }
            
            return r - l;
    }
    

    Another version of the same idea:

    int lengthOfLongestSubstringKDistinct(string s, int k) {
            int ans = 0, l = 0, r = 0;
            unordered_map<char, int> freqs;
            
            for (; r < s.length(); r++) {
                freqs[s[r]]++;
                if (freqs.size() > k) {
                    ans = max(ans, r - l);              // update ans
                    
                    while (freqs.size() > k) {          // move l until freqs only contains k distinct chars
                        if (--freqs[s[l++]] == 0) { freqs.erase(s[l - 1]); }
                    }
                }
            }
            
            return max(ans, r - l);                     // don't miss the last one
    }
    

    (1) In case of a super long string, memory may not be large enough to load the entire of it, so we'll treat it as a string stream. The downside of it is that we won't be able to access any char any time, and as a result the above solution won't work anymore. One workaround is to store each char's rightmost position, and everytime we want to drop one char from current substring, we can simply drop the char with smallest(leftmost) rightmost position. Also because we have a const=256 different chars, runtime is still O(n). If we want to go even further to deal with unlimited number of different chars, we can use a heap instead, and the runtime will become O(n * logk):

    int lengthOfLongestSubstringKDistinct(string s, int k) {
            int ans = 0, l = 0, r = 0;
            unordered_map<char, int> pos;           // record the rightmost pos of each char
            
            istringstream iss(s);                   // treat string as a string stream
            for (char c; iss >> c; r++) {
                pos[c] = r;
                if (pos.size() > k) {
                    ans = max(ans, r - l);          // update ans
                    
                    auto minIt = pos.begin();
                    for (auto it = pos.begin(); it != pos.end(); it++) {
                        if (it->second < minIt->second) { minIt = it; }     // find the leftmost char
                    }
                    
                    l = minIt->second + 1;          // move l to the new start position
                    pos.erase(minIt);               // get rid of the leftmost char so that pos only contains k distinct chars
                }
            }
            
            return max(ans, r - l);                 // don't miss the last one
    }
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.