1. Problem
The problem is
Given an array of n integers where n > 1, nums, return an array output such that output[i] is equal to the product of all the elements of nums except nums[i].
2. Thinking process
We assume the input array is {a(1), a(2),..., a(n)} (n ≥ 1), and the output array is {d(1), d(2),..., d(n)}.
2.1 Trivial approach

A variable X is used for saving the multiplication result. At the beginning , X = 1.

In loop i (1 ≤ i ≤ n), the program visit a(k)(1 ≤ k ≤ n, k ≠ i), and save X × a(k) to X.
In total, the program will visit n(n  1) times! That's too much.
2.2 New idea
If we focus on the output array : d(1), d(2),..., d(n) (assuming n > 2), we can get:
d(i) = [1] × [a(2) × ... × a(n)] (i = 1)
d(i) = [a(1) × ... × a(n  1)] × [1] (i = n)
d(i) = [a(1) × ... × a(i  1)] × [a(i + 1) × ... × a(n)] (1 < i < n)
If n > 2, d(i) can be divided in to 2 parts, the left (called L(i)), and the right (called R(i)). We have
d(i) = L(i) × R(i)
The left
L(i) = 1 (i = 1)
L(i) = a(1) × ... × a(i  1) (1 < i ≤ n)
The right
R(i) = a(i + 1) × ... × a(n) (1 ≤ i < n)
R(i) = 1 (i = n)
Here, I use an array P to save L, let
P(i) = L(i + 1) (1 ≤ i < n) (P(n) is not used)
Also, I use an array Q to save R, let
Q(i + 1) = R(i) (1 ≤ i < n) (Q(1) is not used)
For example,
if the input is [a, b, c, d]
then
P is [a, ab, abc, abcd]
Q is [abcd, bcd, cd, d] (abcd in both P and Q are not used)
The output
D(i) = Q(i + 1) (i = 1)
D(i) = P(i  1) × Q(i + 1) (1 < i < n)
D(i) = P(i  1) (i = n)
2.3 Time Complexity
Here, we assume i is from 0 to n  1. The algorithm can be divided into 3 steps.
Step 1. Setting P(i) = a(i), Q(i) = a(i), and do iteration from i = 0 to n  1. (1 pass, 2n visits)
Step 2. Using P(i) = P(i) × P(i  1), Q(n  i  1) = Q(n  i  1) × Q(n  i), and do iteration from i = 0 to n  1, P and Q are finished. (1 pass, 2n visits)
Step 3. Setting D(i). (1 pass, 3n visits)
The time complexity is 2n + 2n + 3n = 7n = O(n).
The algorithm meets the time complexity requirement for the problem, but P and Q are extra space.
4. Optimization
Here, I try to optimize the extra space usage. Assume the output array is 'ans'.
For Step 1 in 2.3,
Since the input of the productExceptSelf function is a vector reference, the input array nums can be used to save P. (no need to initialize P)
Since the output array does not count as extra space for the purpose of space complexity analysis, the output array 'ans' can be used to save Q.
Before saving P in nums, we use nums to initialize the output array ans (Step 1 in 2.3 is finished).
Then, do Step 2 in 2.3.
Now , P[i] is in nums[i], and Q[i] is in ans[i].
For Step 3 in 2.3, since the output
D(i) = Q(i + 1) (i = 1)
D(i) = P(i  1) × Q(i + 1) (1 < i < n)
D(i) = P(i  1) (i = n)
Notice that
Q(k) (k ≤ i) is NEVER USED in calculating D(i).
If we do iteration by increasing i from 0 to n  1,
ans[i] can be used to save D(i).
Without any extra space, the time complexity keeps the same.
Since 'ans' is the output array, after the iteration, the result is generated automatically.
5. Code
class Solution {
public:
vector<int> productExceptSelf(vector<int>& nums) {
int size = nums.size();
//initialize Q
vector<int> ans = vector<int>(nums);
if(size == 0) return ans;
if(size == 1) return nums;
if(size == 2)
{
int temp = nums[0];
nums[0] = nums[1];
nums[1] = temp;
return nums;
}else{
//generate P and Q
for(int i = 1; i < size; i++)
{
nums[i] *= nums[i  1];
ans[size  i  1] *= ans[size  i];
}
//calculate and save D(i) in ans[i]
for(int i = 0; i < size; i++)
{
if(i == 0) ans[i] = ans[i + 1];
if(i == size  1) ans[i] = nums[i  1];
if(i > 0 && i < size  1) ans[i] = nums[i  1] * ans[i + 1];
}
return ans;
}
}
};