share my C++ solution : 0~3ms, DFS with preprocessing

  • 1

    I am not sure if anyone came up with the same thing before, but it is original.
    The idea is to analyze the input string first and divide it into 5 parts:

    1. prefix
    2. shortest substring where right parentheses are more than left parentheses with maximum difference.
    3. substring in the middle where parentheses are exactly paired.
    4. shortest substring where left parentheses are more than right parentheses with maximum difference.
    5. suffix

    Prefix part is the substring from the beginning until the first left parentheses with right parentheses are pruned. For example:

    )))aaaa)))(...... can be partitioned into aaaa | (,,,,,,, and the prefix is 'aaaa'

    Because the the right parentheses cannot be a match in any case, we can remove them and generate the prefix string.

    So the same is the suffix is the suffix:

    ....)(a(b(c ---> the suffix is 'abc'

    Now the problem is narrowed down to a string starting with '(' and ending with ')'; otherwise a empty string, where the string 'prefix + suffix' is the only element we need to return.

    Next, we find the maximum difference on '(' and ')' count in string to form the part 2, and part4. the remaining substring is part 3. We cannot remove any character in part 3 because it is already matched; removing any pair in part 3 would violate the problem definition: minimum number of parentheses to remove.

    Note that the part 2 and 4 requires to be "shortest" with "maximum" difference on count. For example:

    ())()((()    --> ())  | () |  ((()
                     2      3      4

    the maximum difference is 1 in part 2, i.e. ')' is more than '(' by 1. Another possible substring is ())(), which has difference 1, too, but shortest string make sure the self paired is ranged in part 3 instead of part 2(or part 4).

    After the preprocessing, all we have to do is prune excessive number of right parentheses in part 2., and excessive number of left parentheses in part 4.

    It indeed saves tons of time when the problem is large. Basically the preprocessing scans the input string only once, and knows how many the parentheses to remove and where they are ranged.

    The following code realized the idea. Unfortunately we have lots of stuff to do, so the code is not so neat or clean. Focus the computation of the following variable if you are interested in how to figure out part 2~4:

     int lCnt, rCnt, maxDiff;
            lCnt = rCnt = maxDiff = 0;
            int lBound, rBound; // lBound & rBound is going to position the self-complete string in the middle

    The code:

    class Solution {
        const static char LP = '(';
        const static char RP = ')';
        vector<string> removeInvalidParentheses(string s) {
            vector<string> ret;
            int i,j,k, prefIdx, suffIdx;
            for(i = prefIdx = 0; i < s.size() && s[i]!=LP;++i) if(s[i] != RP) s[prefIdx++] = s[i];
            j = suffIdx = s.size();
            while(--j >i && s[j] != RP) if(s[j] != LP) s[--suffIdx] = s[j];
            int lCnt, rCnt, maxDiff;
            lCnt = rCnt = maxDiff = 0;
            int lBound, rBound; // lBound & rBound is going to position the self-complete string in the middle
            for(k = lBound = rBound = i;k<=j;++k) {
                if(s[k] == LP) ++lCnt;
                else if(s[k] == RP) {
                    if(rCnt - lCnt > maxDiff) maxDiff = rCnt - lCnt, lBound = rBound = k+1;
                    else if(rCnt -lCnt == maxDiff) rBound = k+1;
                else if(rCnt - lCnt == maxDiff) rBound = k+1;
            string body(s.begin()+lBound, s.begin()+rBound);
            vector<string> lhs, rhs;
            char* buf = (char*)malloc(s.size());
            buf[lBound-i-maxDiff] = '\0';
            collect<LP,RP>(s.begin()+i, s.begin()+lBound, maxDiff,0,buf, buf,lhs);
            buf[j-rBound+1-maxDiff + rCnt - lCnt] = '\0';
            collect<RP,LP>(s.rend()-j-1, s.rend()-rBound, maxDiff - (rCnt-lCnt),0,buf, buf,rhs);
            for(const auto& v1 : lhs) for(const auto& v2 : rhs) {
                auto& out = ret.back();
                out += v1, out += body, out.insert(out.end(),v2.rbegin(),v2.rend());
            return ret;
        template<char LEAD, char FOLLOW, class Iter>
        bool collect(Iter b, Iter e, int toSkip, int diff, char* outCp, const char* buffer, vector<string>& result) {
                if(toSkip) return false;
                return true;
            if(*b ==FOLLOW)  {
                auto it = b+1;
                while(it!=e && *it == FOLLOW) ++it;
                int cnt = it-b;
                int hi = min(diff, cnt);
                int lo  =max(0,cnt-toSkip);
                for(int i=0;i<hi;++i) *outCp++ = FOLLOW;
                while(hi>=lo) {
                return hi < min(diff,cnt);
            do {
                if(*b == LEAD) ++diff;
                *outCp++ = *b++;
            }while(b!=e && *b != FOLLOW);
            return collect<LEAD,FOLLOW>(b,e,toSkip,diff,outCp,buffer,result);

  • 0

    @leo.lai.944 It seems we have the same idea. Good solution!

Log in to reply

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.