```
glob_min = [999999999]
def dfs(n, i):
if n == 0:
glob_min[0] = min(glob_min[0], i)
return
for r in range(int(n**0.5), 0, -1):
if i+1 < glob_min[0]:
dfs(n%(r**2), i + n/(r**2))
return
dfs(n, 0)
return glob_min[0]
```

Initially ran into TLE error because search space was too large. Then, came up with major efficiency improvement reducing size of search space greatly by using `n%(r**2)`

and `i + n/(r**2)`

rather than `n-r**2`

and `i + 1`

. Note that `i`

is the number of squares needed to reduce the original input number to `n`

.