C++ O(n) simple recursive solution


  • 1

    Exactly, it is O(26 * n) = O(n), as we only have 26 unique characters.

    Precomputed the appear times and last appear position of each character, so we only need O(1) time to get the appear times and separate position for specific substring.

    Each char in the string would be at most used once when separating, so the getLongestSubstr() would be called at most O(4*n) = O(n) times. As a result, it's strictly O(n).

    Thanks to @StefanPochmann . The idea of his code is very neat. I just try to optimize the time complexity by precomputing.

    class Solution {
    private:
    	int getLongestSubstr(int l, int r, int k, vector<vector<int>> &count, vector<vector<int>> &preLoc) {
    		if(l > r) return 0;
    		for(int i = 0; i < 26; i++) {
    			int tot = count[i][r] - count[i][l - 1];
    			if(tot > 0 && tot < k) {
    				return max(getLongestSubstr(l, preLoc[i][r] - 1, k, count, preLoc)
    					,getLongestSubstr(preLoc[i][r] + 1, r, k, count, preLoc));
    			}
    		}
    		return r - l + 1;
    	}
    public:
        int longestSubstring(string s, int k) {
            vector<vector<int>> count(26,vector<int>(s.size() + 1, 0));
            vector<vector<int>> preLoc(26,vector<int>(s.size() + 1, -1));
            for(int i = 0; i < 26; i++) {
            	char c = 'a' + i;
            	count[i][0] = 0;
            	for(int j = 0; j < s.size(); j++) {
            		count[i][j + 1] = count[i][j] + (s[j] == c);
            		preLoc[i][j + 1] = s[j] == c ? j + 1 : preLoc[i][j];
            	}
            }
            return getLongestSubstr(1, s.size(), k, count, preLoc);
        }
    };
    

  • 0
    S

    I don't think this is true: "Each char in the string would be at most used once when separating".
    It is true about the right side (getLongestSubstr(preLoc[i][r] + 1, r, k, count, preLoc)), but not necessarily about the left side (getLongestSubstr(l, preLoc[i][r] - 1, k, count, preLoc)) since charachter i can be the reason for separation in the left side again.

    So I don't think this algorithm is O(n).


  • 0

    @shahabt
    Yes, it is O(26n) because each character can be the reason for separation in the left side for any times.

    Here, for "Each char" I mean each byte in the input string.

    Specific, in each call, preLoc[i][r] is used to separate the string [l..r] to string [l...preLoc[i][r] - 1] and string [preLoc[i][r] + 1, r]. Then, the location preLoc[i][r] will never be used again.

    By this way,I build a binary tree with at most 4n nodes, if regard each call of getLongestSubstr(l,r) as a node. And in worst case, in each getLongestSubstr(), it may search from 0 to 25, so it's O(26). But O(26) = O(1). That's why the algorithm is O(n).

    For example, if a string is length 5, then the partition may be:
    getLongestSubstr(1,5) (mid = 5)
    getLongestSubstr(1,4) (mid = 3) getLongestSubstr(6,5)
    getLongestSubstr(1,2) getLongestSubstr(4,4)
    getLongestSubstr(1,1)

    getLongestSubstr(l,r) was called 6 times < 4*n = 20 times. In each getLongestSubstr, seach from 0 to 25 cost O(26) = O(1).


  • 0
    S

    Thanks @haiwei624

    Yes, I think you're right. I was missing the fact that processing each node is done in O(1).


  • 0
    Z

    @haiwei624 Yes it is O(n) in time and it uses O(n) extra memory to store the preLoc as well.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.