Binary search solution for follow-up with detailed comments


  • 44

    Re: Java binary search using TreeSet got TLE

    I think the Map and TreeSet could be simplified by Array and binarySearch. Since we scan T from beginning to the end (index itself is in increasing order), List will be sufficient. Then we can use binarySearch to replace with TreeSet ability which is a little overkill for this problem. Here is my solution.

        // Follow-up: O(N) time for pre-processing, O(Mlog?) for each S.
        // Eg-1. s="abc", t="bahbgdca"
        // idx=[a={1,7}, b={0,3}, c={6}]
        //  i=0 ('a'): prev=1
        //  i=1 ('b'): prev=3
        //  i=2 ('c'): prev=6 (return true)
        // Eg-2. s="abc", t="bahgdcb"
        // idx=[a={1}, b={0,6}, c={5}]
        //  i=0 ('a'): prev=1
        //  i=1 ('b'): prev=6
        //  i=2 ('c'): prev=? (return false)
        public boolean isSubsequence(String s, String t) {
            List<Integer>[] idx = new List[256]; // Just for clarity
            for (int i = 0; i < t.length(); i++) {
                if (idx[t.charAt(i)] == null)
                    idx[t.charAt(i)] = new ArrayList<>();
                idx[t.charAt(i)].add(i);
            }
            
            int prev = 0;
            for (int i = 0; i < s.length(); i++) {
                if (idx[s.charAt(i)] == null) return false; // Note: char of S does NOT exist in T causing NPE
                int j = Collections.binarySearch(idx[s.charAt(i)], prev);
                if (j < 0) j = -j - 1;
                if (j == idx[s.charAt(i)].size()) return false;
                prev = idx[s.charAt(i)].get(j) + 1;
            }
            return true;
        }
    

  • 0

    You do think this will work for the follow-up?


  • 9
    W

    @cdai Here is my idea for the follow up, based upon your solution : )

    Binary search:

    • record the indexes for each character in t, if s[i] matches t[j], then s[i+1] should match a character in t with index bigger than j. This can be reduced to find the first element larger than a value in an sorted array (find upper bound), which can be achieved using binary search.

    Trie:

    • For example, if s1 has been matched, s1[last char] matches t[j]. Now, s2 comes, if s1 is a prefix of s2, i.e., s1 == s2.substr[0, i-1], we can start match s2 from s2[i], right?
    • So, the idea is to create a Trie for all string that have been matched so far. At a node, we record the position in t which matches this char represented by the node. Now, for an incoming string s, we first search the longest prefix in the Trie, find the matching position of the last prefix-char in t, say j. Then, we can start matching the first non-prefix-char of s from j+1.
    • Now, if we have done the preprocessing as stated in the binary search approach, we can be even faster.

  • 0
    R

    Hello,
    I didn't quite understand the logic behind the prev variable, could you please explain how exactly it works?
    Thanks!


  • 6

    @root_010

    The prev variable is an index where previous character was picked from the sequence. And for the next character to be picked, you have to select it only after this index is the string 'T'.

    For instance, if S = "abcd" and T = "abdced".
    The index list mapping looks like,

    a -> 0
    b -> 1
    c -> 3
    d -> 2,5
    e -> 4
    

    After you pick a, and b, c will be picked, and index is 3. Now if you have to pick d, you can't pick index 2 because c was picked at 3, so you have to binary search for index which comes after 3. So it returns 5.


  • 2

    @jedihy I think this solution works for follow-up, each time it iterates only the s string, which is relatively small in length, for each s, it runs O(len(s) * log(len(t)))


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.