4mS C solution with explanation


  • 1
    M

    Approach illustrated here is a typical DFS with memoization. The main challenge to this problem might be implementation of the pure logic to parse those wildcards. Evaluation of '.' seems pretty straightforward, but as mentioned below, '*' has to account for the preceding character.

    '.' Matches any single character.
    '*' Matches zero or more of the preceding element.

    This contextual aspect of ‘*’ would mean that it has to be parsed in light of that preceding element. This means we need to evaluate them together and we can see how this can typically happen in two ways.

    The problem states that it’s valid to have zero or more instances of the character associated with ‘*’. So one DFS path would account for the character while the other simply will not. Note that the path which acknowledges the character would recursively seek for more than one instances also. So this way we would have covered all the possibilities. Such a DFS on the string “ab” with regular expression “. * ..” is illustrated below.

    0_1467814474464_dfs.jpg

    The code for the main recursive parser along with memoization array is given below. Since the parsing is relative to both the string and regular expression offsets, the memoization array is also indexed via a combination of the corresponding offset pairs of string/expression. For example, location [0, 0] would have the final result. While [1, 2] will store the result of evaluating string starting at the offset 2 with the expression starting at the offset 1.

    And here is the Bit Bucket link for the full source

    /***********************************************************************/
    /* is_match: Recursively search for a match!                           */
    /*                                                                     */
    /***********************************************************************/
    int is_match(char *str, int soff, char *exp, int eoff, int slen,
                 int elen, unsigned char *m_arr)
    {
        char s = str[soff], e = exp[eoff];
        int aoff = (eoff * slen) + soff, el = elen - 1, sl = slen - 1;
    
        /* We are done processing */
        if ((soff == sl) && (eoff == el))
            return SUCCESS;
    
        /* If we have the result cached, then return the same */
        if (m_arr[aoff] != 0)
            return m_arr[aoff];
    
        /* Simple case where there is no wildcard '*' after this character */
        if (((e == '.') && (soff != sl) && (exp[eoff + 1] != '*')) ||
            ((e == s) && (exp[eoff + 1] != '*')))
            m_arr[aoff] = is_match(str, soff + 1, exp, eoff + 1, slen, elen,
                                   m_arr);
    
        /* When there is a '*' after a character, then we can parse by either
           ignoring the character or by accounting for it */
        else if ((eoff != el) && (exp[eoff + 1] == '*'))
        {
            /* If the characters match, then we can automatically attempt to
               parse it both ways.
               1. First by ignoring the character preceding '*'.
               2. And then by accounting for that character.
    
               Note that the case where there is a wild-card '.' preceding a
               '*', we can search by accounting for the string character only
               if its valid.
            */
            if (((e == '.') && (soff != sl)) || (e == s))
            {
                /* If both the paths failed, then set status to the same */
                if ((is_match(str, soff, exp, eoff + 2, slen, elen, m_arr) == 2) &&
                    (is_match(str, soff + 1, exp, eoff, slen, elen, m_arr) == 2))
                    m_arr[aoff] = FAILED;
                /* Else we have a winner */
                else
                    m_arr[aoff] = SUCCESS;
            }
    
    
            /* If the characters do not match, then we can anyway ignore
               it based on wild-card '*' */
            else if (e != s)
                m_arr[aoff] = is_match(str, soff, exp, eoff + 2, slen, elen,
                                       m_arr);
        }
    
        /* If the result is still 0, then set to failure */
        if (m_arr[aoff] == 0)
            m_arr[aoff] = FAILED;
    
        /* Return status */
        return m_arr[aoff];
    }
    
    int isMatch(char *str, char *exp)
    {
        int slen, elen, ret;
        unsigned char *m_arr;
    
        /* Maintain Sanity */
        if (!str || !exp)
            return 0;
    
        /* Calculate the length, account for empty strings */
        slen = strlen(str) + 1;
        elen = strlen(exp) + 1 ;
    
        /* Allocate the memoization array, initialized with zeros */
        m_arr = calloc(slen * elen, sizeof(unsigned char));
        if (!m_arr)
            return 0;
    
        /* Evaluate the expression */
        ret = is_match(str, 0, exp, 0, slen, elen, m_arr);
    
        /* Map actual failure code 2 to zero */
        if (ret == FAILED)
            ret = FALSE;
    
        /* Free the array and return */
        free(m_arr);
        return ret;
    }
    
    

    You solution is quite professional but hard to imagine some one can easily get this done in a short interview. Anyway, voting you up for your professional spirits!


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.