We know that the number of permutations is n!. Hence, the maximum number of elements we can handle without overflowing the integer is 12 (12! < Integer.MAX_VALUE). So instead of using a boolean array to store the information about the elements that we have already used, we can just set/unset bits of our integer.

For those who are not too familiar with bit manipulation:

```
(set & (1 << i)) == 0 --------------------- checks if the bit at index i is unset.
alternatively (set & (1 << i)) > 0 -------- checks if the bit at index i is set.
set |= (1 << i) --------------------------- sets the bit at index i to 1.
set &= ~(1 << i) -------------------------- clears the bit at index i to 0.
```

In this case set is a 32-bit integer. One might think that somehow a short (16-bit) can be more efficient than a 32-bit integer, since we only need 12 bits. This is not true, most CPUs are 32-bit based, so having a short of length 16 will simply mask the remaining 16 bits, which doesn't give us any extra value.

```
private List<List<Integer>> res = new ArrayList<>();
public List<List<Integer>> permute(int[] nums) {
int set = 0;
List<Integer> perm = new ArrayList<>(nums.length);
permute(nums, set, perm);
return res;
}
private void permute(int[] nums, int set, List<Integer> perm) {
if (perm.size() == nums.length) {
res.add(new ArrayList<Integer>(perm));
return;
}
for (int i = 0; i < nums.length; i++) {
if ((set & (1 << i)) == 0) {
set |= (1 << i);
perm.add(nums[i]);
permute(nums, set, perm);
set &= ~(1 << i);
perm.remove(perm.size()-1);
}
}
}
```