I wrote the slower table version first, then I realized if I had used a set, I would be able to avoid checking through the whole preceding s1 and s2. The set version did turn out to be significantly faster.

I believe both versions could be further tuned to run faster (for example, you do not really have to store tuples in the set, just an int should be enough, and this may actually eliminate the need for a set. I've not got time to try this though).

The faster set version:

```
def isInterleave(self, s1, s2, s3):
if len(s1)+len(s2)!=len(s3):
return False
match=set([(0,0)]) # here 0 means including no element from s1 or s2. So (0,0) just means s1[:0] and s2[:0] form s3[:0]
# So (k1, k2) will mean s[:k1] and s[:k2] form s3[:k3]
for k3 in xrange(len(s3)):
newMatch=set()
for m in match:
if len(s1)>m[0] and s1[m[0]]==s3[k3]:
newMatch.add((m[0]+1, m[1]))
if len(s2)>m[1] and s2[m[1]]==s3[k3]:
newMatch.add((m[0], m[1]+1))
match=newMatch
return (len(s1), len(s2)) in match
```

The slow table version:

```
def isInterleaveSlow(self, s1, s2, s3):
if len(s1)+len(s2)!=len(s3):
return False
tmpBuf=[[0]*(len(s1)+1) for _ in xrange(len(s2)+1)]
tmpBuf[0][0]=1
for k3 in xrange(len(s3)):
newBuf=[[0]*(len(s1)+1) for _ in xrange(len(s2)+1)]
for k1 in xrange(max(0, k3-len(s2)), min(k3+1, len(s1))):
if s1[k1]==s3[k3]:
newBuf[k3-k1][k1+1]=1 if tmpBuf[k3-k1][k1] else 0
for k2 in xrange(max(0, k3-len(s1)), min(k3+1, len(s2))):
if s2[k2]==s3[k3]:
newBuf[k2+1][k3-k2]=1 if (tmpBuf[k2][k3-k2] or newBuf[k2+1][k3-k2]) else 0
tmpBuf=newBuf
return bool(tmpBuf[-1][-1])
```

Edited:

I found a lot of table solutions posted by other people already...much better than my clumsy table, I shouldn't have to go through layers of k3.

one example is here:

https://leetcode.com/discuss/79642/python-solution-60ms