# Easy to understand java solution with well commented code.

• This solution is inspired. I worked to understand it myself and commented the code.

``````public List<String> findRepeatedDnaSequences(String s) {
Set<Integer> firstTime = new HashSet<Integer>();
Set<Integer> secondTime = new HashSet<Integer>();
List<String> list = new ArrayList<String>();

char[] map = new char[26];
int len = s.length();

// Hashing function. We have only 4 letters which we can represent by 2 bits.
map['A' - 'A'] = 0; // A = 00
map['C' - 'A'] = 1; // B = 01
map['G' - 'A'] = 2; // C = 10
map['T' - 'A'] = 3; // D = 11

for(int i=0; i<= len - 10; i++)
{
int sequence = 0;
for(int j=i; j< i+10; j++)
{
// Shift existing sequence by two to make space for the new character coming
sequence = sequence << 2;

// Copy the character from the map and paste those two bits in the newly created space. Read bit wise OR.
sequence = sequence | map[s.charAt(j) - 'A'];
}

// For this number to be added in the list, this should be the second time this number is appearing
// For this if condition to be true, firstTime.add() should be false.
// firstTime.add() will be false when there is already the same number present.
// How it will behave?
// First time - firstTime.add(sequence) will return T
// secondTime.add(sequence) will NOT be executed

// First time - firstTime.add(sequence) will return F
{
}
}

return list;
}``````

``````public class Solution {
public List<String> findRepeatedDnaSequences(String s) {
Set<Integer> firstTime = new HashSet<Integer>();
Set<Integer> secondTime = new HashSet<Integer>();
List<String> list = new ArrayList<String>();

for(int i=0; i<= s.length() - 10; i++)
{
int sequence = encoding(s.substring(i, i+10));

{
}
}

return list;
}

public int encoding(String s){
char[] map = new char[26];
int r = 0; // 32 bit int can represent 16 encoded chars
if(s.length() != 10) return -1;
map['A' - 'A'] = 0; // A = 00
map['C' - 'A'] = 1; // C = 01
map['G' - 'A'] = 2; // G = 10
map['T' - 'A'] = 3; // T = 11
for (int i=0; i<s.length(); i++){
int temp = map[s.charAt(i) - 'A'];
r = r<<2;
r = r|temp;
}
return r;
}
}``````

• Building on top of san89kalp's solution, but without using a nested for:

``````public List<String> findRepeatedDnaSequences(String s) {
Set<Integer> firstTime = new HashSet<Integer>();
Set<Integer> secondTime = new HashSet<Integer>();
List<String> list = new ArrayList<String>();

char[] map = new char['Z'];
int len = s.length();
if (len < 10)
return list;

// Hashing function. We have only 4 letters which we can represent by 2 bits.
map['A'] = 0; // A = 00
map['C'] = 1; // B = 01
map['G'] = 2; // C = 10
map['T'] = 3; // D = 11

int sequence = 0;

// Let's compute the hash for the first 10
for(int j=0; j< 10; j++)
{
// Shift existing sequence by two to make space for the new character coming
sequence = sequence << 2;

// Copy the character from the map and paste those two bits in the newly created space. Read bit wise OR.
sequence = sequence | map[s.charAt(j)];
}

for(int i=10; i< len; i++)
{
sequence = sequence & ((1 << 18) - 1); // nillify the 2 MSBs
sequence = sequence << 2; // shift to make room
sequence = sequence | map[s.charAt(i)]; // set the 2 LSBs

String target = s.substring(i-10 + 1, i + 1);