# Share my O(n) C++ accumulation solution with thinking process and explanation

• ## 1. Problem

Given a binary array, find the maximum length of a contiguous subarray with equal number of 0 and 1.

## 2. Thinking process

#### 2.1 Calculating number 1 in a contiguous subarray

Before introducing the way to calculate number of 0 and 1 in a contiguous subarray, we focus on the number of 0 and 1 in the whole array.

Since the array is a binary array, suppose the array is

#### a(1), a(2), ..., a(n), a(k) = 0 or 1, 1 ≤ k ≤ n.

It can be inferred that

#### The number of 1 in a(n) is S(n) = a(1) + a(2) + ... + a(n).

which has a recursion formula

#### S(n) = a(n) + S(n - 1), n > 1

Now we focus on a contiguous subarray from a(i) to a(j) in a(n).

Suppose the summation of the subarray is

#### T(i, j) = a(i) + a(i + 1) + ... + a(j - 1) + a(j), 1≤ i < j ≤ n.

It can be inferred that

#### T(i, j) = S(j) - S(i - 1), i > 1.

Since the array is a binary array, it can be inferred that

#### 2.2 Calculating number 0 in a contiguous subarray

Suppose U(n) is the number of 0 in the array

#### a(1), a(2), ..., a(n), a(k) = 0 or 1, 1 ≤ k ≤ n.

Then, the number of 0 in a contiguous subarray from a(i) to a(j) in a(n) is

This means

#### T(i, j) = V(i, j)

It can be inferred that

#### S(j) - S(i - 1) = U(j) - U(i - 1), i > 1

The first formula - when i = 1, the subarray starts from the beginning of the array.

We only need to check whether

#### U(j) - S(j) = 0.

The second formula - when i > 1.

We get

#### S(j) - S(i - 1) = U(j) - U(i - 1)

If we change the formula, we get

#### U(j) - S(j) = U(i - 1) - S(i - 1)

If we can find 1≤ i < j ≤ n meets the formulas above, the subarray from a(i) to a(j) will meet the requirement.

#### 2.4 The key point and calculating the maximum length of a required subarray

Till now, we find the key point is to calculate

#### Key(j) = U(j) - S(j) = j - S(j) - S(j) = j - 2 × S(j).

If we do iteration from j = 1 to j = n, and calculate Key(j) and save it to a hash table as key.

There are 2 situations.

#### The maxmium subarray's length is L(Key(j)) - F(Key(j)) + 1 - from a(F(Key(j))) to a(L(Key(j))).

The total maxmium length is

## 3. Algorithm

#### 3.1 Special cases

If the size of array < 2, there will be no subarray that meets the requirement.

• The function returns 0.

#### 3.2 Normal situation

In order to save memory and make the algothrim simple, when doing iteration from j = 1 to j = n,

If

## 4. Complexity analysis

#### 4.1 Time complexity

Since the algothrim is one-pass from 1 to n.

As we know

#### Key(j) = U(j) - S(j) = j - 2 × S(j), 0 ≤ S(j) ≤ j, 0 ≤ j ≤ n.

We can infer that

So

#### - n ≤ Key(j) ≤ n.

That is to say

## 5. Runtime optimization

By using `std::map` as hash table, the runtime is

#### 162ms (beats 41.56% of cpp submissions).

In order to optimize runtime, I use an interger array instead.

Since Key(j) can be negative integer, and negative integer can't be an index of an array,

I can

and

#### The size of the interger array is 2n + 1.

After optimization, the runtime is

## 6. Code

#### 6.1 Using `std::map` [Before Optimization]

``````class Solution {
public:
int findMaxLength(vector<int>& nums) {
if(nums.size() < 2) return 0;
int maxLength = 0;
map<int, int> lowIndex;

//sum-up
int i = 1;
while(true)
{
//number of 1s when <= i: nums[i - 1]
//number of 0s and 1s <= i: i
int diff = i - 2 * nums[i - 1];
if(diff == 0)
{
if(maxLength < i) maxLength = i;
}else{
if(lowIndex.find(diff) == lowIndex.end())
{
lowIndex[diff] = i - 1;
}else{
if(i - 1 - lowIndex[diff] > maxLength) maxLength = i - 1 - lowIndex[diff];
}
}
if(i == nums.size()) return maxLength;
nums[i] += nums[i - 1];
i++;
}
}
};
``````

#### 6.2 Using integer array [After Optimization]

``````class Solution {
public:
int findMaxLength(vector<int>& nums) {
if(nums.size() < 2) return 0;
int maxLength = 0;
int *lowIndex = new int[2 * nums.size() + 1];
for(int i = 0; i < 2 * nums.size() + 1; i++) lowIndex[i] = -1;

//sum-up
int i = 1;
while(true)
{
//number of 1s when <= i: nums[i - 1]
//number of 0s and 1s <= i: i
int diff = i - 2 * nums[i - 1];
if(diff == 0)
{
if(maxLength < i) maxLength = i;
}else{
if(lowIndex[diff + nums.size()] == -1)
{
lowIndex[diff + nums.size()] = i - 1;
}else{
if(i - 1 - lowIndex[diff + nums.size()] > maxLength)
maxLength = i - 1 - lowIndex[diff + nums.size()];
}
}
if(i == nums.size()) return maxLength;
nums[i] += nums[i - 1];
i++;
}
}
};
``````

• Great, thanks! You had me scratching my head for a while at 2.2 when you did not explicitly say U(j) = j - S(j) (because all the elements that are 0 means all the elements all together, except those that are 1.)

I didn't speed up the O(n) of your solution, but I tightened up, In particular there is no need to check for the special case nums,size() < 2, because it will fall out. Also there is no need to treat the array from nums[0] to nums[i] as special, so we can save a conditional. I put a few repeated expressions in const int or int. Especially the size() method on an array may be implemented as a function call.
I don't understand why you are storing the tally (key, imbalance) at each position given that this is single pass, so I store it as scalar variable.
For style I prefer a for loop to a while(true) with a break, and I named some things more explicitly. All together I might be running in half the time as your optimized array solution.

``````    int findMaxLength(vector<int>& nums) {
const int nums_size = nums.size();
const int firsts_size = 2 * nums_size + 2;
//This will fall out if(nums_size < 2) return 0;
int maxLength = 0;
// This array holds the first i where we find a given Key
// The Key is the imbalance. Negative is how many more 0s, Positive is how many more 1s.
// This is normalized to run from 0 to 2 * nums.size() rather than from -nums.size() to nums.size().
// Since we are starting at nums[0], any time Key is 0 the subArray from 0 to i is balanced,
//  and it must be the longest so far because any partial, or any so-far, must be shorter.
//  But that is also lowIndex[0] so no need to special case it
// Index of firstTimeWeSaw is nums_size + actual i where it happened
// Before we add nums[0] is position -1
// cout << "nums.size() = " << nums_size << "  map firsts size = " << firsts_size << endl;
int *firstTimeWeSaw = new int[firsts_size];
for(int i = 0; i < firsts_size; i++)
firstTimeWeSaw[i] = -2;
int i = -1;
int tally = 0; // KJer stores this in nums[i-1]
int diff = 2 * tally - i; // 1 means balanced because of how we count from -1
int key = diff + nums_size;
firstTimeWeSaw[key] = i; // We are balanced before we start
// cout << "Before nums[0]  tally = " << tally << " calculated imbalance (1 + 1s - 0s) is " << diff << " key = " << key << endl;
int length;

for (i = 0; i < nums_size; i++)
{

//number of 1s in nums[] from 0 to i: tally
//number of 0s or 1s in nums[] from 0 to i: i
tally += nums[i];
// diff = 2 * tally - i; // 1 means balanced because of how we count from -1
// cout << "i = "<< i << " tally (total 1s) including nums[" << i << "] (" << nums[i] << ") = "
//              << tally << " imbalance (1 + 1s - 0s) = " << diff  << endl;
key = 2 * tally - i + nums_size;
// Have we seen this key before?
if(firstTimeWeSaw[key] == -2)
{
// cout << "    This is the first time we saw diff " << diff << " (key = " << key << ")\n";
// This is the first time we saw the key
firstTimeWeSaw[key] = i;
}
else
{
// This key looks familiar
// How long since we last saw it?
length = i - firstTimeWeSaw[key];
// cout << "    First saw diff " << diff << " was when i was " << firstTimeWeSaw[key] << "  length = " << length << endl;
if(length > maxLength)
{
// cout << "        New longest length = " << length << " > longest was " << maxLength << endl;
maxLength = length;
}
}
}
return maxLength;
}
``````

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.