I got a TLE for the Python code below, because the time cost of bisect.insort is O(n) for a built-in list.
The code was rejudged as accepted just now, but very slow... 1800ms+
class Solution(object): def maxSumSubmatrix(self, matrix, k): """ :type matrix: List[List[int]] :type k: int :rtype: int """ m = len(matrix) n = len(matrix) if m else 0 M = max(m, n) N = min(m, n) ans = None for x in range(N): sums =  * M for y in range(x, N): slist, num = , 0 for z in range(M): sums[z] += matrix[z][y] if m > n else matrix[y][z] num += sums[z] if num <= k: ans = max(ans, num) i = bisect.bisect_left(slist, num - k) if i != len(slist): ans = max(ans, num - slist[i]) bisect.insort(slist, num) return ans or 0
Could anybody share a more efficient Python solution? Thank you :D
Sorry for that. Because the test cases are quite large, even faster languages like c++ runs slow. So if someone could come up with a better solution, e.g. O(n^3), the codes will run faster.
There is an O(Kn^3) algorithm using pseudo-polynomial subset sum algorithm that may be faster when K is small.
It runs 1506ms. Slightly faster than your implementation but cannot adapt to unbalanced number of row and column. I just assumed the number of columns is much less than that of rows.
class Solution(object): def maxSumSubmatrix(self, matrix, k): maxSum = -9999999 horizontalSum = [[0 for j in xrange(0, len(matrix) + 1)] for i in xrange(0, len(matrix))] for i in xrange(0, len(matrix)): for j in xrange(0, len(matrix)): horizontalSum[i][j] = horizontalSum[i][j - 1] + matrix[i][j] for cola in xrange(0, len(matrix)): for colb in xrange(cola, len(matrix)): bilist, vsum = , 0 for i in xrange(0, len(matrix)): vsumj = horizontalSum[i][colb] - horizontalSum[i][cola - 1] vsum += vsumj i = bisect.bisect_left(bilist, vsum - k) if i < len(bilist): maxSum = max(maxSum, vsum - bilist[i]) bisect.insort(bilist, vsum) return maxSum
Just another implementation of the same algorithm. It's a bit faster (around 1400ms) and maybe a bit clearer (because I separate the 1D-subproblem out):
def maxSumSubmatrix(self, matrix, k): def maxSumSublist(vals): maxSum = float('-inf') prefixSum = 0 prefixSums = [float('inf')] for val in vals: bisect.insort(prefixSums, prefixSum) prefixSum += val i = bisect.bisect_left(prefixSums, prefixSum - k) maxSum = max(maxSum, prefixSum - prefixSums[i]) return maxSum maxSum = float('-inf') columns = zip(*matrix) for left in range(len(columns)): rowSums =  * len(matrix) for column in columns[left:]: rowSums = map(int.__add__, rowSums, column) maxSum = max(maxSum, maxSumSublist(rowSums)) return maxSum
Could handle "wide" matrices with this, though the problem's "Note" rather suggests "tall" matrices so I kept the simpler columns = zip(*matrix).
columns, matrix = sorted((matrix, zip(*matrix)), key=len)
@jedihy nice one, though it took me a while to realized why initially 0 is inserted into the bisect list. thanks for sharing.
Why do we even bother to do bisect.bisect_left and then bisect.bisect_insort if the combo is O(n)? Can't we just use the plain old prefix sum array method to find out the 1-D subarray sum closest to k? Or is that much slower somehow?
I met this question during Google phone interview about one year ago. I think it is crazy to ask this sort of question in any interview.
@jedihy this is still O(n^4) right? the bisect.insort() should be O(n)
@areshand agree with you
I used the same algorithm, but I compared the number of rows and number of columns first, if row > col, I simply transposed the matrix. Runtime is 1348ms.
This is my merge sort version but sadly got TLE. Can anybody help me... TAT
class Solution(object): def maxSumSubmatrix(self, matrix, k): def sort(sums, i, j): if j - i == 1: return ~0x7fffffff mid = i + (j - i) / 2 res = max(sort(sums, i, mid), sort(sums, mid, j)) for r in range(mid, j): l = bisect.bisect_left(sums, sums[r] - k, lo=i, hi=mid) if l < mid: res = max(res, sums[r] - sums[l]) else: break sums[i:j] = sorted(sums[i:j]) return res M, N = len(matrix), len(matrix) if M < N: M, N = N, M res = ~0x7fffffff for L in range(N): sumBelt =  * M for j in range(L, N): accs =  for i in range(M): sumBelt[i] += matrix[i][j] if len(matrix) > len(matrix) else matrix[j][i] accs.append(accs[-1] + sumBelt[i]) res = max(res, sort(accs, 0, M + 1)) return res
@StefanPochmann I'm just looking into the maxSumSublist function and see that it looks like a O(n^2) function because insort is O(n). Doesn't that defeat the whole purpose of using binary search to optimize the largest sum less than k problem and allowing a O(nlogn) solution? Is it because bisect uses a C function that is more optimized? Either way, I can't see any other way to do in Python. https://www.quora.com/Given-an-array-of-integers-A-and-an-integer-k-find-a-subarray-that-contains-the-largest-sum-subject-to-a-constraint-that-the-sum-is-less-than-k
@StefanPochmann I can always learn something from your code :D Thanks for sharing!
Btw, I implemented red-black tree myself, so my time complexity should be like O(n^3 log(n)) but still get TLE with 23/27 passes (I already transpose the matrix when it is tall). Any idea of resolving this issue?
Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.