Regular Expression Matching


@payamrastogi *a is not a valid regex pattern. If anything the solution is simply not validating that the pattern is a correct regex pattern well enough.

@chenlinfeng Cuz ".*" matches "abb", but "d" matches nothing in the text, thus false.

@BeeFarmer, why text = "aab" pattern = "cab" is true? "c" matches nothing in the text.


@Arkkun The time/space complexity discussed is based on the solution presented. Recursions more similar to Approach #2 can of course save space by manipulating indices instead.

When string = "ab" and pattern = ".*c", the testcase failed in recursive version.
It may cause out of bound exception while trying to substring an empty string.
We may need to add empty check in the code.
Here's my modified version.class Solution { public boolean isMatch(String s, String p) { if (p.isEmpty()) return s.isEmpty(); boolean firstMatch = (!s.isEmpty() && (p.charAt(0) == s.charAt(0))  p.charAt(0) == '.'); if (p.length() >= 2 && p.charAt(1) == '*') { return (isMatch(s, p.substring(2))  firstMatch && !s.isEmpty() && isMatch(s.substring(1), p)); } else { return firstMatch && !s.isEmpty() && isMatch(s.substring(1), p.substring(1)); } } }


@charles.cp.tsai The test case "ab", ".*c" passes successfully when I ran it. firstMatch will check that
text
is nonempty, sotext.substring(1)
is defined. The check ofpattern.length() >= 2
will ensurepattern.substring(2)
is defined.

@xiaoyanghaitao @awice I found the problem in my code...
the firstMatch should be
boolean firstMatch = (!s.isEmpty() && (p.charAt(0) == s.charAt(0)  p.charAt(0) == '.'));
not
boolean firstMatch = (!s.isEmpty() && (p.charAt(0) == s.charAt(0))  p.charAt(0) == '.');
Thanks guys' clarification :)

Approach #1 fails in Python3 for "aaaaaaaaaaaaab" "aaaaaaaaaac" because of Time Limit Exceeded
I have made an optiomisation by simplifying the pattern (just remove all the * (a*.* = .) close to the . and then remove duplicated * (aa* = a*))
def simplifyRegexp(self, p): pointer = 0 while (pointer < len(p)1): if (p[pointer] == "." and p[pointer+1] == "*"): while ((pointer+3) < len(p) and p[pointer+3] == "*"): p = p[:pointer+2] + p[pointer+4:] begin = pointer while ((begin) > 0 and p[begin1] == "*"): begin = 2 p = p[:begin] + p[pointer:] pointer += 1 pointer = 0 while (pointer < len(p)1): if (p[pointer+1] == "*"): while ((pointer+3) < len(p) and p[pointer+3] == "*" and p[pointer+2] == p[pointer]): p = p[:pointer+2] + p[pointer+4:] pointer += 1 return p
This is not fuly optimise but at least it passes.

DP solution, I believe to be O(n^3)
class Solution {
public:
bool isMatch(string s, string p) {
vector<string> chunk;
chunk.clear();
string tmp="";
for (int i=0;i<p.size();)
if (i+1<p.size()&&p[i+1]==''){
if (tmp!="") chunk.push_back(tmp);
tmp="";
tmp+=p[i++];
tmp+=p[i++];
chunk.push_back(tmp);
tmp="";
}
else tmp+=p[i++];
if (tmp!="") chunk.push_back(tmp);
/
将匹配串p按切开
abaca . cab c* bb.b a* b* c* a.bac
.可以匹配空串到任意长度任意串
?（?≠.）可以匹配空串到任意长度同样字符
切开后为chunk[0],chunk[1],...,chunk[m1]
令dp[i][j]代表chunk[0..i1]和s[0..j1]的匹配性，结果输出为dp[m][n]dp[i][j]=若chunk[i1]为"?*" 向前搜寻0~尽可能多个(k个)?，若dp[i1][jk]出现true则true，若都没有true则false 若chunk[i1]为".*" 向前搜寻0~任意长k串，若dp[i1][jk]出现true则true，若都没有true则false 将s[jchunk[i1].size()..j1]和chunk[i1]匹配。成功则dp[i][j]=dp[i1][jchunk[i1].size()] 边界条件为 dp[0][0]=true 若前数个chunk[0],..,chunk[i1]都是".*"/"?*" 则dp[1][0],..,dp[i][0]也是true 其余边界均为false */ int m=chunk.size()+1,n=s.size()+1; bool **dp=new bool *[m]; for (int i=0;i<m;++i) dp[i]=new bool[n]; dp[0][0]=true; for (int j=1;j<n;++j) dp[0][j]=false; for (int i=1;i<m;++i) if (dp[i1][0]) dp[i][0]=chunk[i1].size()==2&&chunk[i1][1]=='*'; else dp[i][0]=false; for (int i=1;i<m;++i) for (int j=1;j<n;++j) if (chunk[i1]==".*"){ dp[i][j]=false; for (int k=0;k<=j;++k) if (dp[i1][jk]){ dp[i][j]=true; break; } } else if (chunk[i1].size()==2&&chunk[i1][1]=='*'){ dp[i][j]=dp[i1][j]; for (int k=1;!dp[i][j]&&k<=j&&s[jk]==chunk[i1][0];++k) if (dp[i1][jk]){ dp[i][j]=true; break; } } else{ if (chunk[i1].size()>j){ dp[i][j]=false; continue; } dp[i][j]=dp[i1][jchunk[i1].size()]; if (dp[i][j]==false) continue; int posx=chunk[i1].size()1,posy=j1; while (posx>=0&&posy>=0&&(chunk[i1][posx]==s[posy]chunk[i1][posx]=='.')) posx,posy; if (posx>=0) dp[i][j]=false; } return dp[m1][n1]; }
};