# Java beats 97%: moving window and REAL bits manipulation

• use 'rolling hash' and moving window.

Since DNA only contains ACGT, we can use only 2 bits to represent one char.

Define:
A = 00, C = 01, G = 10, T = 11

Then a 10-char DNA sequence can be represented by a unique 20-bit number.

Since we only consider 10-char DNA sequence, we can use 'moving window' strategy. each time we move one step forward the window, remove the last char and append the new char to it:

`code = (code & mask) << 2 | map[s.charAt(i) - 'A']`

where :

• `& mask` cause a removal of the top char
• `<< 2` moves the window one step
• `| map[s.charAt(i) - 'A']` append the new char to the end.
``````public class Solution {
public List<String> findRepeatedDnaSequences(String s) {
List<String> res = new ArrayList<>();
if (s.length() < 11) return res;
char[] map = new char[26];
map['C' - 'A'] = 1;
map['G' - 'A'] = 2;
map['T' - 'A'] = 3;
// 0011 1111 1111 1111 1111
Set<Integer> seen = new HashSet<>();
Set<Integer> repeat = new HashSet<>();

// init
int code = 0;
for (int i = 0; i < 9; i ++) {
code <<= 2;
code |= map[s.charAt(i) - 'A'];
}

for (int i = 9; i < s.length(); i ++) {
code = (code & mask) << 2 | map[s.charAt(i) - 'A'];
res.add(s.substring(i - 9, i + 1));
}
}

return res;
}
}
``````

• This solution is so brilliant!!

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.