```
vector<vector<string>> partition(string s) {
vector<vector<string>> result;
int size = s.size();
if(!size)
return result;
for(int i = size-1; i >= 0;i--)
{
if(s[i] == s[size-1])
{
int p1 = size-2;
int p2 = i+1;
while(p1>p2)
{
if(s[p1]!=s[p2])
break;
--p1,++p2;
}
if(p1>p2) continue;
else {
vector<vector<string>> subResult = partition(s.substr(0,i));
int subSize = subResult.size();
if(subSize)
for(int j=0;j<subSize;j++)
{
subResult[j].push_back(s.substr(i,size-i));
result.push_back(subResult[j]);
}
else
result.push_back(vector<string>(1,s.substr(i,size-i)));
}
}
}
return result;
}
```

The code above is an accepted one, but costs more than 120ms to execute. I don't know why it would take that much time comparing to the highest-voted back-tacking C++ solution. Can anyone help?

My basic idea is to start from the rear of the string, if I find a char at index i that s[i] == s[size-1], and s.substr(i, size-i) is a palindrome, I'll find all possible palindrome combinations in s.substr(0,i) and push s.substr(i, size-i) at the back of each combination.

I'm a near greenhand and may have asked silly questions, please don't mind. Thanks very much!