The idea is use an array instead of a list to store the trace, which will improve the performance a lot.

```
public List<List<Integer>> permute(int[] nums) {
List<List<Integer>> ans = new ArrayList<>();
if (nums.length > 0)
solve(ans, new int[nums.length], 0, nums, new boolean[nums.length]);
return ans;
}
private void solve(List<List<Integer>> ans, int[] cur, int cnt, int[] nums, boolean[] used) {
if (cnt == nums.length) {
List<Integer> list = new ArrayList<>();
ans.add(list);
for (int i: cur) list.add(i);
return;
}
for (int i = 0; i < nums.length; i++) {
if (!used[i]) {
cur[cnt++] = nums[i];
used[i] = true;
solve(ans, cur, cnt, nums, used);
--cnt;
used[i] = false;
}
}
}
```