# ACC O(n^3) dynamic programming solution

• Started the contest late so wasn't able to submit during contest time limit, but here's what I was able to submit (with comments) about 15 minutes after the end. Maybe there's a faster way but haven't thought of one yet.

https://github.com/andreimaximov/algorithms/blob/master/leetcode/algorithms/encode-string-with-shortest-length/solution.py

Solution #3 Inspired by @StefanPochmann 's solution here. The main challenge with solution #2 and #3 is efficiently finding a repeating prefix.

``````class Solution(object):
def encode(self, s):
"""Encodes a string with the shortest possible length in O(n^3) time.

See problem description for encoding rules. Credit to Stefan Pochmann
for finding optimal repeating prefix substring. See his original
solution at https://discuss.leetcode.com/topic/71442/short-python.
"""
n = len(s)

# Each dp[x] represents the shortest encoding for a substring x of s.
dp = dict()
return self.encodeUtil(s, dp)

def encodeUtil(self, s, dp):
"""Calculates the shortest encoding for s.

There are O(n^2) subproblems but each subproblem considers O(n) smaller
subproblems so the overall runtime is O(n^3)."""
n = len(s)
if n < 5:
# Cannot compress anything shorter than 5 characters.
return s
elif s in dp:
return dp[s]

# Encode the string raw by default.
encoding = s

# Encode the string as the shortest repeating substring if possible.
i = (s + s).find(s, 1)

# Check if a repeating substring exists and it is not s.
if i != -1 and i < n:
repeat = s[:i]
count = n / len(repeat)
encoding = '%d[%s]' % (count, self.encodeUtil(repeat, dp))

# Now consider all possible splits between chars 0 through n - 1.
for i in range(1, n):
left = self.encodeUtil(s[:i], dp)
right = self.encodeUtil(s[i:], dp)
if len(left) + len(right) < len(encoding):
encoding = left + right

dp[s] = encoding
return encoding
``````

Solution #2 (Test case `abaababaab` is bugged)

``````class Solution(object):
def encode(self, s):
"""Encodes a string with the shortest possible length in O(n^3) time.

See problem description for encoding rules.
"""
n = len(s)

# Each dp[i][j] reperesents the shortest encoding for s[i:j + 1].
dp = [[None for _ in range(n)] for _ in range(n)]
return self.encodeUtil(s, 0, n - 1, dp)

def encodeUtil(self, s, i, j, dp):
"""Calculates the shortest encoding for s[i:j + 1].

There are O(n^2) subproblems but each subproblem considers O(n) smaller
subproblems so the overall runtime is O(n^3)."""
n = j - i + 1
if n < 5:
# Cannot compress anything shorter than 5 characters.
return s[i:j + 1]
elif dp[i][j] is not None:
return dp[i][j]

# Default encoding is just the raw string.
encoding = s[i:j + 1]

# Consider all repeating prefix substrings.
k = i
while k < j:
# Count how many times x occurs back to back in s[i:j + 1].
x = s[i:k + 1]
y = self.encodeUtil(s, i, k, dp)
count = 0
start = i

while s[start:start + len(x)] == x:
count += 1
start += len(x)

# Compress the prefix.
prefix = y if count == 1 else '%d[%s]' % (count, y)

# Compress the suffix.
suffix = self.encodeUtil(s, start, j, dp)

# Apply if encoding with x as prefix is shorter.
if len(prefix) + len(suffix) < len(encoding):
encoding = prefix + suffix

# NOTE: The following comments are wrong but I have left them to
# demonstrate  my thought process at the time. This can be fixed with
# k+= 1 but will result in TLE. Main challenge is efficiently finding a
# repeating prefix.
#
# This is critical for speedup! Essentially if we have a prefix x
# that repeats p times, we do not need to consider any prefixes
# formed by repeating x p times. Why? Because without this we will
# be performing unecessary (repeat) calls to get optimal suffix.
k = i + len(x) * count

dp[i][j] = encoding
return encoding
``````

Solution #1 (Test case `aaaaaaaaaabbbaaaaabbb` is bugged)

``````class Solution(object):
def encode(self, s):
"""Encodes a string with the shortest possible length in O(n^3) time.
See problem description for encoding rules.
"""
n = len(s)

# Create an index where all O(n^2) possible substrings begin.
index = dict()
for i in range(n):
for j in range(i, n):
x = s[i:j + 1]
if x not in index:
index[x] = set()

# Each dp[i][j] reperesents the shortest encoding for s[i:j + 1].
dp = [[None for _ in range(n)] for _ in range(n)]
return self.encodeUtil(s, 0, n - 1, index, dp)

def encodeUtil(self, s, i, j, index, dp):
"""Calculates the shortest encoding for s[i:j + 1].
There are O(n^2) subproblems but each subproblem considers O(n) smaller
subproblems so the overall runtime is O(n^3)."""
n = j - i + 1
if n < 5:
# Cannot compress anything shorter than 5 characters.
return s[i:j + 1]
elif dp[i][j] is not None:
return dp[i][j]

# Default encoding is just the raw string.
encoding = s[i:j + 1]

# Consider all repeating prefix substrings.
for k in range(i, j):
# Count how many times x occurs back to back in s[i:j + 1].
x = s[i:k + 1]
count = 1
start = i + len(x)
while start in index[x]:
count += 1
start += len(x)

# Compress the prefix.
x = self.encodeUtil(s, i, k, index, dp)
if count > 1:
prefix = '%d[%s]' % (count, x)
else:
prefix = x

# Compress the suffix.
suffix = self.encodeUtil(s, start, j, index, dp)

# Apply prefix and suffix if encoding with x as prefix is shorter.
if len(prefix) + len(suffix) < len(encoding):
encoding = prefix + suffix

dp[i][j] = encoding
return encoding
``````

• @amaximov said in ACC O(n^3) dynamic programming solution:
It gives a false answer for
"aaaaaaaaaabbbaaaaabbb".

I believe it's because a total optimal do not necessarily have an optimal prefix

• @70664914 I get an output of "10[a]bbb5[a]bbb" for that input. Perhaps you were expecting output of 5[a]2[[5[a]bbb]] which is 16 characters long and thus sub-optimal? Another valid encoding would be 5[a]2[aaaaabbb] which is also 15 characters and thus no better.

I can't see any better way to encode it with less than 15 characters.

• @amaximov the answer given has a square non necessary, optimal is 14, I believe

• has a square non necessary

I'm not sure what you mean by this. Please give an example of what you expect a valid 14 character encoding to look like.

• @amaximov `"5[a]2[5[a]bbb]"`, as is given by the machine of leetcode

• @70664914 You're right! Sorry should have checked expected output.

• @70664914 fixed and updated original post.

• Thanks! good solution and clear explanation.

• try test case "abaababaab"

• @roc571 Good catch. The issue is my failure at optimizing the `k = ?` increment step in the while loop. You can replace this with `k += 1` for correctness but then it will TLE.

The issue stems from trying to find the optimal repeating prefix substring. However @StefanPochmann has a great solution here that you can check out.