3ms solution in C++ with detailed explanation.


  • 0

    Here's my idea inspired by longest increasing subsequence problem:

    There're 3 circumstances as we read a new character, ch, in S:

    • ch cannot expand any existing subsequence, nor can it begin a new one;
    • ch can begin a new subsequence;
    • ch can expand an existing subsequence;

    So, here is the question: as we read a char from S, how can we tell if it is part of a subsequence?

    Here's my solution:

    • If ch does not match any character in T, we just ignore it.
    • If ch can begin a new subsequence, aka it matches T.begin(), we create a new subsequence and put it into the 'pool'.
    • If ch can expand a new subsequence, which means it can match T[i] where i is the size of a current existing subsequence, we COPY the subsequence, append ch to one's back, and put both back into the pool.

    I'll try to explain the process. The most tricky part of this method is the copy process. Why should we copy the subsequence instead of directly appending the character?

    I'll explain this with the example in the question, where S = "rabbbit" and T = "rabbit". Let's jump to when the 2nd 'b' is met, when there's three subsequence in the pool, "rab", "ra", and "r".

    Now we have a new character, ch = 'b'. When we try to expand all the existing subsequences, we will find that the new 'b' fits 2 subsequences: "ra" and "rab". So we copy the 2 subsequences and append 'b' to both, then put them back. Now the pool has 5 subsequences: "rabb", "rab", "rab", "ra", and "r". Notice that "rab" has appeared twice. If we tag each b with different tag, we will find the pool looks like this: "rab1b2", "rab1", "rab2", "ra", and "r".

    Now the third 'b' comes in. this new 'b' will be able to expand 3 existing subsequences, and we copy-and-extend them all. Now the pool looks like: "rab1b2", "rab1b3", "rab2b3", "rab3", "rab1", "rab2", "ra", and "r". The later characters in S will smoothly append to the first 3 subsequences with no question, so the answer is 3.

    See? Each time we met a new character, it makes our pool larger without ignoring the possibility that a later one will occur. At last we can simply count the subsequences in our pool when we reached the end of S.

    Now we can optimize the solution. Notice that all subsequences in the pool is a subarray of T starting at the beginning, we can note only the tail of each subsequence. And as there's a lot of repeating subsequence, we can record each tail's count only. As we copy a subsequence, we just copy its count and add to next one. At last we just read the count of subarray equals to T, the problem will be solved.

    Noticing the count is always 0 before subsequence reaches, I jumped over the 0 ones as there's nothing to copy.

    So here's my solution.

    class Solution {
    public:
        int numDistinct(string s, string t) {
            if (s.size() < t.size())
                return 0;
            if (t.size() == 0)
                return 0;
            // Store subsequences counts.
            vector<int> counter(t.size(), 0);
            int backMostIndex = 0;
            for (auto ch: s) {
                // Search begins from index backwards.
                for (int matching = backMostIndex; matching >= 0; --matching) {
                    if (t[matching] == ch) {
                        if (backMostIndex == matching)
                            backMostIndex += 1;
                        if (matching)
                            counter[matching] += counter[matching - 1];
                        else
                            counter[matching] += 1;
                    }
                }
            }
            return counter.back();
        }
    };
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.