Clean Python NFA solution


  • 0
    L

    Very straightforward solution that follows the theory of regular expressions and finite automata.

    First, it constructs the NFA that recognizes the given pattern.

    Then, it finds the epsilon-closure for each node (set of nodes we can reach automatically, without having to consume any characters at all).

    Finally, and this is the mindbendy part, instead of a DFA where we are in just one state, in an NFA we are in many states at once. So we consume the input string char-by-char and make transitions, propagating every one of the many state that we are in.

    class Solution(object):
        def isMatch(self, s, p):
            """
            :type s: str
            :type p: str
            :rtype: bool
            """
            from collections import defaultdict
            
            # Construct the NFA
            out_edges = defaultdict(list)
            cur_node = 0
            prev_char = None
            for cur_char,prev_char in zip(p[1:] + " ", p):
                if prev_char == '*': continue
                if cur_char == '*':
                    next_node = cur_node + 1
                    out_edges[cur_node].append((-1, next_node))
                    out_edges[next_node].append((prev_char, next_node))
                    cur_node = next_node
                else:
                    next_node = cur_node+1
                    out_edges[cur_node].append((prev_char, next_node))
                    cur_node = next_node
            accept_node = cur_node
            
            # Find epsilon closure: set of all nodes accessible through epsilon transitions.
            # Use DFS on each node to find its epsilon closure.
            epsilon_closure = defaultdict(set)
            epsilon_closure[accept_node] = [accept_node]
            def follow(node, parent):
                if node in epsilon_closure[parent]:
                    return
                epsilon_closure[parent].add(node)
                for transition, next_node in out_edges[node]:
                    if transition == -1:
                        follow(next_node, parent)
            for node in out_edges.keys():
                follow(node, node)
            
            # Scan the input string to be matched char-by-char, 
            # maintaining a set of all nodes we are currently in,
            # making transitions in each of them.
            curs = epsilon_closure[0]
            for c in s:
                nexts = []
                for cur in curs:
                    for tomatch, next_node in out_edges[cur]:
                        if tomatch in [c, '.']:
                            # Instead of adding just the target
                            # of this edge in the NFA, we are going
                            # to add the entire set of nodes in its
                            # epsilon closure to the set of states
                            # that we will be in next.
                            nexts += epsilon_closure[next_node]
                curs = set(nexts)
                
            return accept_node in curs
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.