Transform everything into Integer (Java Solution)


  • 0
    S
    1. Use an Integer to represent the number of appearance in 10-unit sequence. For example.

    A->4, C->4, G->1, T->1, Then corresponding integer is

    0x04040101

    1. Use an Integer to represent the 10-unit seq. For example.

    map: A->00, C->01,G->10, T->11, then AAAAACCCCC would be

    (only last 20 bits are used): 0x00000155

    1. Use a Map{Integer, Set{Integer}} to connect everything together.

    First find the number of appearance for a seq, Second find its corresponding set. Third find out whether itself integer representation is inside the set.

    !One good thing is you don't need to convert integer back to string!!

    public class Solution {
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> list = new ArrayList<String>();
            Map<Integer, Set{Integer}> map = new HashMap<Integer, Set<Integer>>();
            Set<String> visited = new HashSet<String>();
            int k = 10;
            byte[] appear = new byte[4];
            Arrays.fill(appear, (byte)0);
            for(int i = 0;i < s.length();i++){
                appear[dna2int(s.charAt(i))]++;
                if(i >= k-1){
                    String sample = s.substring(i-k+1,i+1);
                    int key = byte2int(appear);
                    int intSample = seq2int(sample);
                    if(!map.containsKey(key)){
                        Set<Integer> set = new HashSet<Integer>();
                        set.add(intSample);
                        map.put(key,set);
                    }else{
                        Set<Integer> set = map.get(key);
                        if(set.contains(intSample)){
                            if(!visited.contains(sample))
                                visited.add(sample);
                        }else{
                            set.add(intSample);
                            //map.put(key,set);
                        }
                    }
                    appear[dna2int(s.charAt(i-k+1))]--;
                }
            }
            list.addAll(visited);
            return list;
        }
        
        private int dna2int(char a){
            switch(a){
                case 'A':return 0;
                case 'C':return 1;
                case 'G':return 2;
                case 'T':return 3;
                default: return 0;
            }
        }
        
        private int byte2int(byte[] b){
            int x = b[0] << 24 | b[1] << 16 | b[2] << 8 | b[3];
            return x;
        }
        
        private int seq2int(String s){
            int x = 0;
            for(int i = 0;i < s.length();i++){
                x <<= 2;
                switch(s.charAt(i)){
                    case 'A':
                        x |= 0;
                        break;
                    case 'C':
                        x |= 1;
                        break;
                    case 'G':
                        x |= 2;
                        break;
                    case 'T':
                        x |= 3;
                        break;
                    default:
                        break;
                }
            }
            return x;
        }
        
    }

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.