For this solution, I will use an example to explain.

Some observations can be found as follows:

- For s="baabacba" & t="aba", in order to have a solution for the first 'a' in
*t*, there must exists a 'a'.

As for the second 'a' in t, there must exists a 'a' prior to the second 'a' and a 'b' after the first 'a'. - The first 'b' in s is redundant event though t contains 'b'.
- Suppose that s="aab" & t="ab", it is obvious that there are only 2 solutions, and that solution is based on the solutions of s="aa" & t="a".

If s="aabb" & t="ab", the result is twice the result of s="aa" & t="a".

Therefore, the result of a matching character is the current result adds up with the result of the element before.

Combining the observations above, here is the solution.

```
class Solution(object):
def numDistinct(self, s, t):
if not t:
return 1
nLen = len(t)
table = [0 for i in range(nLen + 1)]
idxes = dict()
table[-1] = 1
cur = 0
for i in range(len(s)):
if cur < nLen:
if t[cur] == s[i]:
if t[cur] not in idxes:
idxes[t[cur]] = [cur]
else:
idxes[t[cur]].append(cur)
cur += 1
if s[i] in idxes:
for j in idxes[s[i]][::-1]:
table[j] += table[j-1]
return table[nLen-1]
```

*table* is the DP which stores the result of each iteration.

The last element is set to 1 for dealing with *cur*==0 condition.

Each element stores the result of corresponding element in t.

*cur* denotes the index of the largest possible matching element (+1).

*idxes*, a dict(), stores all indices (<= i) in *s* where char *s[i]* locates.

Here is a step by step result of how this solution goes.

```
s="baabacba", t="aba"
0123
initial, table =[0001]
i=0, [0001] (cur==1)
i=1, [1001] (cur==1) (add table[-1] to table[0])
i=2, [2001] (cur==1) (add table[-1] to table[0])
i=3, [2201] (cur==2) (add table[0] to table[1])
i=4, [3221] (cur==3) (add table[-1] to table[0], add table[1] to table[2])
i=5, [3221] (cur==3)
i=6, [3521] (cur==3) (add table[0] to table[1])
i=7, [4571] (cur==3) (add table[-1] to table[0], add table[1] to table[2]
return table[2] which is 7
```

Like other O(n) space & O(mn) time solutions, I use *table* for DP.

It stores the results as we gradually check each element in s.

However, other solutions loop through the whole *table*, even when they are checking the first few elements of *s*. On the other hand, in this solution the *cur* and the *idxes* only check necessary elements.

In other words, *cur* is used for the current matching index, and we don't have to check elements later then *cur*.

Besides of that, a dictionary is used for recording the indices of the elements in *t* that we checked before, and it can further speed up the linear traversal and only check the target indices.

Please feel free to give me suggestions if there is something can be fixed.