# Static DP, C++ 12 ms, Python 172 ms, Ruby 384 ms

• There are so many "large" test cases that it's worthwhile to keep data between test cases rather than recomputing from scratch all the time. At least in the slower languages. My `dp` tells the numbers of squares needed for the first integers, and when asked about a new `n`, I extend `dp` just as much as necessary.

C++ ... 28 ms

``````int numSquares(int n) {
static vector<int> dp {0};
while (dp.size() <= n) {
int m = dp.size(), squares = INT_MAX;
for (int i=1; i*i<=m; ++i)
squares = min(squares, dp[m-i*i] + 1);
dp.push_back(squares);
}
return dp[n];
}
``````

C++ ... 12 ms

Switching the loops makes it less nice but faster:

``````int numSquares(int n) {
static vector<int> dp {0};
int m = dp.size();
dp.resize(max(m, n+1), INT_MAX);
for (int i=1, i2; (i2 = i*i)<=n; ++i)
for (int j=max(m, i2); j<=n; ++j)
if (dp[j] > dp[j-i2] + 1)
dp[j] = dp[j-i2] + 1;
return dp[n];
}
``````

Python ... 172 ms

``````class Solution(object):
_dp = [0]
def numSquares(self, n):
dp = self._dp
while len(dp) <= n:
dp += min(dp[-i*i] for i in range(1, int(len(dp)**0.5+1))) + 1,
return dp[n]
``````

Ruby ... 384 ms

``````\$dp = [0]
def num_squares(n)
\$dp << (1..\$dp.size**0.5).map { |i| \$dp[-i*i] }.min + 1 until \$dp[n]
\$dp[n]
end
``````

There's probably a cleaner way than using a global variable, but I'm new to Ruby and don't know one.

• Hi, Stefan. I run your Python code on some examples but am still a little bit confused by the `dp[-i*i]`. Could you give some explanations of it?

• It's the same algorithm as the first C++ solution. I compute the next value to append to `dp`, i.e., how many squares I need for the number `len(dp)`. Python supports indexing from the end with negative indexes, so the already computed `dp[-1]` tells me how many squares I need for the previous number (1 smaller than the number I'm handling now), `dp[-4]` tells me how many squares I need for the number that's 4 smaller than the number I'm handling now, etc.

Does this make it clear?

• Hi Stefan, so what your "static" here does is keeping some data between test cases, right?
Just confused why static keyword can do this.

• Hi, @Stefan. Thank you for your nice explanations. The indexing is really nice. I finally got it :-)

• @wyc25013 Yes, the `static` keyword makes the variable keep its state between calls of the function (and it's only initialized once).

• Hi Stefan.
I want to ask a question. Why it costs 300+ ms when I write "vector<int> dp" not "static vector<int> dp". I don't know the reason.

• It sounds reasonable to use a static variable to improve performance in this particular case. But, what if the scales of test cases are not in a monotonic increasing order?

I have tried to declare dp as local variable, and I received a time limit exceed at 8000+.

• Hi, fanzonezy. Well, looking at Stefan's C++ code, if `dp` has already stored the value for the coming `n`, then the condition of the second `for` loop will always be `false` and so after the first `for` loop stops (without doing anything), the stored result is returned.

• Hi, paladin. Stefan has answered this question in the above comment, cited below:

@wyc25013 Yes, the `static` keyword makes the variable keep its state between calls of the function (and it's only initialized once).

• Oh, get it. Thank you

• I had considered giving that one an `if (m < n+1)` test to avoid the unnecessary outer loop run, but decided against it because it didn't make it faster and added two lines and an indentation level. But maybe it would be better because it's clearer? Like this:

``````int numSquares(int n) {
static vector<int> dp {0};
int m = dp.size();
if (m < n+1) {
dp.resize(n+1, INT_MAX);
for (int i=1, i2; (i2 = i*i)<=n; ++i)
for (int j=max(m, i2); j<=n; ++j)
if (dp[j] > dp[j-i2] + 1)
dp[j] = dp[j-i2] + 1;
}
return dp[n];
}
``````

• ha, using static here was a brilliant idea :)

• Hi, I met some problems
Why my code run so slow when using DP? It costs almost 500ms.
That will be great if you can answer my question.
Thank you :)

``````int numSquares(int n) {
vector<int> res(n+1,0);
for(int i=1;i<=n;i++)
{
res[i]=i;
for(int j=1;j*j<=i;j++)
{
res[i]=min(res[i-j*j]+1,res[i]);
}
}
return res[n];
}``````

• Hi, @wei_fy. If you do not use `static` variables, your running time is just what it should be :-) For the advantage of `static` variables, Stefan has given enough explanations in the above comments.

• Get it! Thank you so much!

• oh, @Stefan originally I thought my dp methods was just too slow, so I got the TLE. I didn't even think of reusing the dp vectors! So happy to meet `static` again in this case! I also read your mathematical solutions, thanks for keeping posting what you know and sharing what you think!

• I think this is kind of cheating.

• but it's nice to point it out in an interview

• Hi Stefan,

I came up with the same algorithm
Can we estimate the runtime of the algorithm? It looks like that at each iteration we compute sqrt(i) operations... So it would be sum i*sqrt(i), i=1 to n.

Solved it by using https://www.wolframalpha.com/input/?i=sum+i*sqrt(i),+i%3D1+to+n

It seems like the sum converges to the n-th harmonic number elevated to -(3/2). What does it tell us about the overall complexity? What are your thoughts on this? Thanks. You're the best

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.