Simple Solution : one-pass, using only array (C++ 92ms, Java 16ms)


  • 26

    After reading a few posts on this topic, I guess most of people over-think about this problem.

    IMHO, We don't need to implement any graph theory expressively here; rather, it is sufficient to just check if every two adjacent elements also appears adjacently in the sub-sequences. (and of course, some basic boundary checking is also necessary)

    in C++:

        bool sequenceReconstruction(vector<int>& org, vector<vector<int>>& seqs) {
            if(seqs.empty()) return false;
            vector<int> pos(org.size()+1);
            for(int i=0;i<org.size();++i) pos[org[i]] = i;
            
            vector<char> flags(org.size()+1,0);
            int toMatch = org.size()-1;
            for(const auto& v : seqs) {
                for(int i=0;i<v.size();++i) {
                    if(v[i] <=0 || v[i] >org.size())return false;
                    if(i==0)continue;
                    int x = v[i-1], y = v[i];
                    if(pos[x] >= pos[y]) return false;
                    if(flags[x] == 0 && pos[x]+1 == pos[y]) flags[x] = 1, --toMatch;
                }
            }
            return toMatch == 0;
        }
    

    in Java:

        public boolean sequenceReconstruction(int[] org, int[][] seqs) {
            if(seqs.length == 0) return false; 
            int[] pos = new int[org.length+1];
            for(int i=0;i<org.length;++i) pos[org[i]] = i;
            boolean[] flags = new boolean[org.length+1];
            int toMatch = org.length-1;
            for(int[] v : seqs) {
                for(int i=0;i<v.length;++i) {
                    if(v[i]<=0 || v[i] > org.length)return false;
                    if(i==0)continue;
                    int x = v[i-1], y = v[i];
                    if(pos[x] >= pos[y])return false;
                    if(flags[x] == false && pos[x]+1 == pos[y]) {
                        flags[x] = true;
                        --toMatch;
                    }
                }
            }
            return toMatch == 0;
        }
    

  • 0

    Thanks for the solution as I was thinking about graph which got TLE. I think the optimized solution can achieve only linear time is to take advantage of that fact that the original sequence orgis already given as a parameter, so we will have a reference to reconstruct the sequence. The more interesting question to ask is what if orgis not given and how (if we could) to construct an original sequence with minimum length possible without duplicates? I guess this will be costing more time and probably need to involve graph.


  • 0

    said in Simple Solution : one-pass, using only array (C++ 92ms, Java 16ms):

    vector<char> flags

    For vector<char> flags, it seems that you want to save memory usage by declar char type instead of int. But can you just use vector<bool>? Both bool and charare 1 byte. Or if you really want to save space, maybe bitmap is better by using int[] with size O(N/32).


  • 1

    @zzg_zzm thanks for your reply.
    Why char not int?yes, memory. Why not bool instead? Performance impact. Bool vector has specialized implementation, as you know, which uses more complicated instructions to access elements. That's why I choose char over bool in C++ but use Boolean directly in Java. You can try it out so you can get some empirical evidence, or just compile bool and char vector into assembly to see what happens. :)


  • 0

    @zzg_zzm per your problem extension, I think leetcode OJ has a similar (or virtually the identical) problem called "Alien Dictionary". Oh, it's still doable in linear time , BTW.


  • 0

    @leo.lai.944 Thanks for the reply. Yes, I heard some tricky things about vector<bool> while reading about STL containers (I need to check it out again!).


  • 0
    J

    How can we prove the correctness of the algorithm? Thanks


  • 0
    This post is deleted!

  • 0
    C

    Do you think naming flag as visited would make it easier to understand your thoughts? Or do you mind explaining it more elaborately? Thanks


  • 0

    @jeffery Sorry for late reply and thanks for your questions.

    I am not really confident to give a purely formal proof because Im not really a good speaker, but I will try to show some examples and let you know how I think.

    There are 3 things needed for full proof:

    1. All characters in the super-sequence form the same set of all characters in all subsequences.
    2. the order relation, i.e. "Which characters a character should be placed after," is a transitive relation. (see https://en.wikipedia.org/wiki/Transitive_relation) You can think this property of the fact " a dependency graph is a directed acylic graph.". This property also guarantees that evey given sub-sequence is really a legal one of the given super-sequence.
    3. the super-sequence is unique "if and only if" every two adjacent elements in the super-sequence also appears adjacently in a certain sub-sequence.

    Because I'm not a native English writer, I am gonna skip 1. and 2. and leave them to better resources:
    Leetcode OJ problem algorithm 207: https://leetcode.com/problems/course-schedule/
    Leetcode OJ problem algorithm 210: https://leetcode.com/problems/course-schedule-ii/

    So, let's assume property 1. and 2. stand, and let's start with an sub-sequence example:

    ab, ac, bc,
    And we can establish a dependency graph with 3 vertices a,b,c and three edges b->a, c->b, c->a.

    Note that I put the element with higher order near the graph source and lower order near the graph sink.

    When we do the post-order traversal, the sequence can determine a legal order relation.(please check the OJ problem listed above for correctness of this statement.)

    Now we can talk about "IF" part, with a longer example abcde....z:
    Because all adjacent pairs in the super-sequence also appears in a certain sub-sequence, you will get something like this:
    a<-b<-c<-d<-e......z.
    Oh, yes: there might be some other edges, like a<-e, b<-z....etc. BUT, no matter how many other edges are there, a is guaranteed the lowest(i.e. the first) element in post-order traversal. Now record a, remove a and all its in-edges, and we reduce it to a sub-problem like this
    b<-c<-d<-e......z.
    So now you still have the one and only one lowest element; by proceeding the post-order traversal, you have an unique result: abcde.......z.

    "Only if" part:
    Suppose the dependency graph looks like this:
    a b<-c<-d<-e......z, and abcde.....z is still a legal relation. However you can always find a way to traverse b first before a and produce another legal sequence, making the super-sequence loses uniqueness. What if the broken link appears in the middle instead of tail? Proceed the post-order traversal described in "IF" part and you will pull the broken link to tail eventually.

    Hope this can help you think.


  • 0

    @coder2 Thanks for your reply.

    It is actually not 'visited', more like 'verified' or "matched", though it is still not really precise either.

    Before I gives a explanation, define the predecessor of x for an integer x is the integer right "left" to x int he problem input org.
    (If you know Java, it means org.indexOf(x) == org.indexOf(predecessor of x) + 1).

    (Despite of the data type I uses, I stored only Boolean values in there.)
    An Boolean value flags[x] is true if and only if the integer x and its predecessor are also in appearance in a certain given sub-sequence adjacently with the same relative position (i.e. the order is also predeccessor then x).

    The naming also bothers me; I cannot figure out a precise name for such a complicated meanings so I decides to leave an extremely meaningless name for it so that everyone knows I don't know how to name it LOL.

    Do you have any suggestion? Thanks.


  • 0

    @leo.lai.944 Brilliant! Here's my 10 lines Python Solution based on your idea. It costs 222ms and beats 86%, using multiple sets instead of single array.

    class Solution(object):
        def sequenceReconstruction(self, org, seqs):
            # check each pair of adjacent elements in org also appear in seqs or not.
            opairs = {(org[i], org[i+1]) for i in range(len(org)-1)} if len(org) != 1 else {org[0]}
            odrs = {org[i]: i for i in range(len(org))}
            spairs = {(s[i], s[i+1]) for s in seqs for i in range(len(s)-1)}
            sones = {s[0] for s in seqs if len(s) == 1}
            
            if not opairs.issubset(spairs) and not opairs.issubset(sones): return False
            
            for k in spairs.difference(opairs):
                if (k[0] not in odrs or k[1] not in odrs or odrs[k[0]] > odrs[k[1]]): return False
            for n in sones:
                if n not in odrs: return False
    
            return True
    

  • 0
    D
    This post is deleted!

  • 0
    D

    Very clear way to solve it. Thanks!!!


  • 1
    Y

    https://en.wikipedia.org/wiki/Topological_sorting#Uniqueness

    This proves the logic of this solution.

    Also, can we build a map of (number -> set of following numbers) while scanning the sequences, so that we have O(n) runtime to validate the original order?


  • 2
    M

    @leo.lai.944

    Cannot pass test case
    [1]
    [[],[]]

    just need to add a tiny flag:

    bool sequenceReconstruction(vector<int>& org, vector<vector<int>>& seqs) {
            if(seqs.empty()) return false;
            vector<int> pos(org.size()+1);
            for(int i=0;i<org.size();++i) pos[org[i]] = i;
            
            vector<char> flags(org.size()+1,0);
            int toMatch = org.size()-1;
            bool flag = false;
            for(const auto& v : seqs) {
                for(int i=0;i<v.size();++i) {
                    flag = true;
                    if(v[i] <=0 || v[i] >org.size())return false;
                    if(i==0)continue;
                    int x = v[i-1], y = v[i];
                    if(pos[x] >= pos[y]) return false;
                    if(flags[x] == 0 && pos[x]+1 == pos[y]) flags[x] = 1, --toMatch;
                }
            }
            return toMatch == 0 && flag;
        }
    

  • 0
    V

    @leo-lai-944 I have a question about

    int[] pos = new int[org.length+1];
            for(int i=0;i<org.length;++i) pos[org[i]] = i;
    

    In this piece of your code, aren't you making an assumption that 0 <= org[i] < org.length ?
    How are you making this assumption?
    I can have org = [ 23, 19, 4] and your code wouldn't work because pos array's size is 4 so when you execute pos[23] = 0, it'll throw an IndexOutOfBounds error


  • 0

    @VRAMJI

    In the problem, "org sequence is a permutation of the integers from 1 to n". n is the length of array.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.