We have #394 Decode String, people were often asked the shortest encode as a followup.


  • 2
    C

    Today I saw some people were asked the encoding as a follow up for leetcode #394 decode string. Most of the them only gave a brief direction during the interview and didn't write any code. But I would like to ask for more thoughts.

    I may think about using dynamic programming. For string s, the shortest encoding from the ith character to the jth character is denoted by dp[i][j], then we have:

    • If i+2 > j, dp[i][j] = s[i-1:j ], i.e. just keep those characters.
    • else If s[i-1:j ] can be represented as repeating patterns, then encode it as repeating_times[repeating_pattern], using the finest repeating pattern.
    • otherwise dp[i][j] = min_length(dp[i][k] + dp[k+1][j]) where i < k < j

    It runs in O(n3) time however, where n is the length of the string to be encoded.
    Does this work? What do you think about it?


  • 0
    C
    class Solution{
    	public static boolean checkRepeating(String s, int l, int r, int start, int end){
    		if( (end-start) % (r-l) != 0 ){
    			return false;
    		}
    		int len = r-l;
    		String pattern = s.substring(l, r);
    		for(int i = start; i +len <= end; i+=len){
    			if(!pattern.equals(s.substring(i, i+len))){
    				return false;
    			}
    		}
    		return true;
    	}
    
    	public static int getLength(int plen, int slen){
    		return (int)(Math.log10(slen/plen)+1);
    	}
    
    	public static String shortestEncodeString(String s){
    		int len = s.length();
    		int[][] dp = new int[len+1][len+1];
    		for(int i = 1; i <= len; ++i){
    			for(int j = 0; j < i; ++j){
    				dp[j][i] = i-j;
    			}
    		}
    
    		Map<String, String> dpMap = new HashMap<>();
    
    		for(int i = 1; i <= len; ++i){
    			for(int j = i-1; j >= 0; --j){
    				String temp = s.substring(j, i);
    				if(dpMap.containsKey(temp)){
    					dp[j][i] = dpMap.get(temp).length();
    					continue;
    				}
    				String ans = temp;
    				for(int k = j+1; k <= i; ++k){
    					String leftStr = s.substring(j, k);
    					String rightStr = s.substring(k, i);
    					if(dp[j][i] > dp[j][k] + dp[k][i]){
    						dp[j][i] = dp[j][k] + dp[k][i];
    						ans = dpMap.get(leftStr) + dpMap.get(rightStr);
    					}
    					if( checkRepeating(s, j, k, k, i) 
    						&& ( 2 + getLength(k-j, i-j) + dp[j][k] < dp[j][i]) ){
    						dp[j][i] = 2 + getLength(k-j, i-j) + dp[j][k];
    						ans = String.valueOf((i-j)/(k-j)) +"["+ dpMap.get(leftStr) +"]";
    					}
    				}
    				dpMap.put(temp,ans);
    			}
    		}
    		return dpMap.get(s);
    	}
    
    	public static void main(String[] arg){
    		System.out.println( shortestEncodeString(arg[0]) );
    	}
    }
    

    A java version for implementation. notice that I use different index for string.
    I'm still trying to put helpful comment on it.
    The idea is refer from:
    https://discuss.leetcode.com/topic/68275/can-we-do-the-encode-way-i-post-one-the-return-is-the-shortest-length-of-encoded-string


  • 0
    C
    This post is deleted!

  • 0
    C

    @catinclay Thanks! I ran your code and it seems to work.


  • 0
    C

    @catinclay I wrote a similar top-down approach combining KMP.

    def encode(s):
        def check_repeating(s):
            sz = len(s)
            lps = [0] * sz
            for i in range(1, sz):
                l = lps[i-1]
                while l and s[l] != s[i]:
                    l = lps[l-1]
                lps[i] = l + (s[l] == s[i])
            if lps[-1] and sz % (sz-lps[-1]) == 0:
                p = sz - lps[-1]
                return sz/p, s[:p]
            return 1, s
    
        memo = {}
    
        def _encode(s):
            if s not in memo:
                if len(s) <= 2:
                    return s
                can = s
                for m in range(1, len(s)):
                    l_cnt, l_str = check_repeating(s[:m])
                    r_cnt, r_str = check_repeating(s[m:])
                    if l_str == r_str:
                        comb = '{}[{}]'.format(l_cnt+r_cnt, _encode(l_str))
                    else:
                        comb = _encode(s[:m])+_encode(s[m:])
                    can = min(can, comb, key=len)
                memo[s] = can
            return memo[s]
    
        return _encode(s)
    
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.