Serialize a list of strings into a single string and deserialize it back to the original list of strings


  • 0
    R

    [Phone Interview] You are given a list of strings, say {"a","","bc,","d"}. Write a function to serialize the list into a single string and a function to deserialize that string back to the original list of strings {"a","","bc,","d"}. The list may contain empty strings.


  • 0
    L

    Here is a sample solution for this question:

    import java.io.*;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    
    /**
     * Created by anliu on 10/22/16.
     */
    public class SerializeListOfStrings {
    
        public static void main(String[] args) {
            List<String> writeList = new ArrayList<String>();
            writeList.add("This");
            writeList.add("");
            writeList.add("is");
            writeList.add("a");
            writeList.add("book!");
            serialize(writeList);
    
            List<String> readList = deserialize();
    
            Assert.assertTrue("Serialize/Deserialize", writeList.toString(), readList.toString());
    /**
    Serialize/Deserialize
       Output: [This, , is, a, book!]
     Expected: [This, , is, a, book!]
       Result: true
    */
        }
    
        public static void serialize(List<String> list) {
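            if(list == null || list.size() == 0) return;
    
            // Join the strings with a single space. Note: this round-trips only
            // when no string contains a space and the list has no trailing
            // empty strings, because deserialize() uses split(" ").
            StringBuilder sb = new StringBuilder();
            for( int i=0; i< list.size() -1 ; i++) {
                sb.append(list.get(i)).append(" ");
            }
            sb.append(list.get(list.size() - 1));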
    
            try {
                FileOutputStream fileOut = new FileOutputStream("./test.txt");
                ObjectOutputStream out = new ObjectOutputStream(fileOut);
                out.writeObject(sb.toString());
                out.close();
                fileOut.close();
                System.out.printf("Serialized data is saved in ./test.txt file");
            } catch (IOException i) {
                i.printStackTrace();
            }
        }
    
        public static List<String> deserialize() {
            String str = null;
            try {
                FileInputStream fileIn = new FileInputStream("./test.txt");
                ObjectInputStream in = new ObjectInputStream(fileIn);
                str = (String) in.readObject();
                in.close();
                fileIn.close();
            } catch (IOException i) {
                i.printStackTrace();
            } catch (ClassNotFoundException c) {
                System.out.println("String class not found");
                c.printStackTrace();
            }
    
            List<String> list = new ArrayList<String>();
    
            if( str == null) {
                return list;
            }
    
            list = Arrays.asList( str.split(" "));
            return list;
        }
    }
    
    public class Assert {
        public static void assertTrue( String input, String output, String expected)
        {
            String msg = "Input:%n%s%n   Output: %s%n Expected: %s%n   Result: %b%n%n";
            System.out.printf(msg, input, output, expected, isEqual(output, expected));
        }
        private static boolean isEqual(String output, String expected) {
            return (output == expected)
                    || (output == null && expected == null)
                    || (output!= null && output.equals(expected))
                    || (expected!= null && expected.equals(output)) ;
        }
    
    }

  • 0

    How about this?

    package Amazon;
    
    import java.util.ArrayList;
    import java.util.List;
    
    public class SerializeString {
    	private String serialize(List<String> input) {
    		StringBuilder sb = new StringBuilder();
    		for(String inp : input) {
    			int len = inp.length();
    			sb.append(len);
    			sb.append(inp);
    		}
    		
    		return sb.toString();
    	}
    
    	private String read(String input, int start, int len) {
    		return input.substring(start, start+len);
    	}
    	
    	private List<String> deSerialize(String input) {
    		List<String> result = new ArrayList<>();
    		
    		char inp[] = input.toCharArray();
    		int inp_len = inp.length;
    		int start = 0;
    		
    		while(start < inp_len) {
    			int len = Character.getNumericValue(inp[start]);
    			
    			result.add(read(input, ++start, len));
    			
    			start = start + len;
    		}
    		
    		return result;
    	}
    
    	public static void main(String[] args) {
    		// TODO Auto-generated method stub
    		SerializeString ss = new SerializeString();
    		List<String> input = new ArrayList<>();
    		input.add("this");
    		input.add(" ");
    		input.add("is");
    		input.add("4");
    		input.add("-");
    		
    		String ser = ss.serialize(input);
    		
    		System.out.println("Serialized : "+ser);
    		System.out.println("De-Serialized : "+ss.deSerialize(ser));
    		
    	}
    
    }
    
    

  • 0
    R

    Your idea of prepending the length is good, but what if a digit is part of the input string? Then the deserialize logic would go for a toss.

    Below is the approach I suggested during the interview.

    Create a StringBuilder object to build the serialized string.
    For each string in the input list, escape it, append it to the StringBuilder, and then append a delimiter. Escaping means: wherever the delimiter itself occurs in the string, put an escape character in front of it. The same check is needed for the escape character itself.

    So let's say I take '#' as the delimiter and the input list is as below:

    Input : ["a","b#c/","","d"]
    Consider character '/' as escape character

    For the above input list, my serialize function would return "a#b/#c//##d#".

    In the deserialize logic, iterate through the serialized string. If you find an unescaped '#', that is the end of the current string, so add it to the result list. If you find the escape character '/', skip it and append the next character to the current string as a literal. Continue until the next unescaped '#'.
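
    A minimal Java sketch of the escaping approach described above. The class name EscapedCodec is made up for illustration, and the sketch assumes well-formed input and no null strings:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; '#' and '/' follow the example in the post above.
final class EscapedCodec {
    private static final char DELIMITER = '#';
    private static final char ESCAPE = '/';

    static String serialize(List<String> input) {
        StringBuilder sb = new StringBuilder();
        for (String s : input) {
            for (char c : s.toCharArray()) {
                // Prefix any literal delimiter or escape character with the escape.
                if (c == DELIMITER || c == ESCAPE) {
                    sb.append(ESCAPE);
                }
                sb.append(c);
            }
            sb.append(DELIMITER); // terminates every string, including empty ones
        }
        return sb.toString();
    }

    static List<String> deserialize(String data) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == ESCAPE && i + 1 < data.length()) {
                current.append(data.charAt(++i)); // take the next character literally
            } else if (c == DELIMITER) {
                result.add(current.toString());   // unescaped delimiter ends a string
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        return result;
    }
}
```

    Because every string ends with a delimiter, an empty string shows up as two delimiters in a row and survives the round trip.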


  • 0

    @ratnanireshma said in Serialize a list of strings into a single string and Deserialize it back to original list of strings:

    "... then the deserialize logic would go for a toss."

    Hi @ratnanireshma, thanks for pointing that out, but my test input also has a digit among the strings and the logic still seems to work fine. I would really appreciate it if you could share a counterexample where my code fails.


  • 0
    L

    @ratnanireshma

    The code I wrote earlier works for integers without any changes.

    I modified the input list as below and then ran the program.

            List<String> writeList = new ArrayList<String>();
            writeList.add("Th\\i25s");
            writeList.add("");
            writeList.add("is");
            writeList.add("a");
            writeList.add("bo0ok!");
            writeList.add("1");
            writeList.add("-1");
            writeList.add("65");
    

    The output is below. It works, right?

    Serialized data is saved in ./test.txt fileInput:
    Serialize/Deserialize
       Output: [Th\i25s, , is, a, bo0ok!, 1, -1, 65]
     Expected: [Th\i25s, , is, a, bo0ok!, 1, -1, 65]
       Result: true
    

  • 0
    Y

    @laqxs I just tried this input and got a StringIndexOutOfBoundsException.
    Your code does not handle multi-digit lengths: the first string below is 24 characters long, but the deserializer reads only a single digit as the length.

    input.add("thisierwgtewugtergterite");
    input.add(" ");
    input.add("is");
    input.add("4");
    input.add("-");
    

  • 0
    E

    @ratnanireshma

    This version is written in Python.

    #!/usr/bin/env python3
    
    def serializeString(lists):
      # Terminate every item with "&"; assumes no item contains "&".
      serialized = ""
      for item in lists:
        serialized += item + "&"
      return serialized
    
    def deserializeString(string):
      # Drop only the final terminator so trailing empty strings survive.
      if not string:
        return []
      return string[:-1].split("&")
    
    lists = ["a", " ", "bc,", "d"]
    serialized = serializeString(lists)
    print(serialized.replace("&", ""))
    deserialized = deserializeString(serialized)
    print(deserialized)
    

  • 2

    Relying on the string length is usually a good idea when serializing. The reason is that you depend only on a property of the input string, not on its actual content, so the format stays robust no matter which characters the string contains. However, it has a problem too.

    @ramanpreetSinghKhinda If we have the input string 123456789123, then prepending the length as you suggested would make it 12123456789123. Your deserialize method won't be able to tell where the length ends and the string begins.
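
    To make the ambiguity concrete, here is a small Java sketch of the bare length-then-content format. The class name BareLengthDemo and the two sample lists are made up for illustration. Both lists produce the same serialized string, so no deserializer could tell them apart:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: length followed directly by content, no separator.
final class BareLengthDemo {
    static String serialize(List<String> input) {
        StringBuilder sb = new StringBuilder();
        for (String s : input) {
            sb.append(s.length()).append(s); // nothing marks where the length ends
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("09abcdefghi");          // one 11-character string
        List<String> three = Arrays.asList("1", "", "abcdefghi"); // lengths 1, 0 and 9
        System.out.println(serialize(one));   // 1109abcdefghi
        System.out.println(serialize(three)); // 1109abcdefghi
    }
}
```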

    @ratnanireshma's idea to use a delimiter (a special character) to separate strings is also good. The example was:
    input : ["a","b#c/","","d"]
    Consider character '/' as escape character
    result: "a#b/#c//##d#"

    In theory this works as long as the deserialize method is implemented carefully. What if the input has #'s in it? We can't simply assume that every # marks the end of a string; the deserializer has to track the escape character. For example, the input string ### serializes to /#/#/## with escaping, whereas without escaping it would be the ambiguous ####.

    This method also has a shortcoming: it cannot represent null strings.

    The solution is to combine both ideas. Each input string is serialized in the following way:

    For input string S => Len(S)#S

    For example:

    leetcode     => 8#leetcode
    123456789123 => 12#123456789123
    #            => 1##
    #1           => 2##1
    ""           => 0#
    null could be treated in a special way: either -1# (a length of -1 indicates null) or 0##
    

    To deserialize, read digits until you hit the first #; that gives you the length of the next string. Then read exactly that many characters to construct the string, and repeat until the input is exhausted.
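
    A minimal Java sketch of this combined format. The class name LengthPrefixedCodec is made up for illustration; the sketch assumes well-formed input and picks the -1# option to represent null:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name; each string is written as <length>#<content>, null as -1#.
final class LengthPrefixedCodec {
    static String serialize(List<String> input) {
        StringBuilder sb = new StringBuilder();
        for (String s : input) {
            if (s == null) {
                sb.append("-1#");
            } else {
                sb.append(s.length()).append('#').append(s);
            }
        }
        return sb.toString();
    }

    static List<String> deserialize(String data) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        while (pos < data.length()) {
            // The first '#' at or after pos always ends the length field,
            // because the length itself never contains '#'.
            int hash = data.indexOf('#', pos);
            int len = Integer.parseInt(data.substring(pos, hash));
            pos = hash + 1;
            if (len < 0) {
                result.add(null);
            } else {
                result.add(data.substring(pos, pos + len)); // read exactly len characters
                pos += len;
            }
        }
        return result;
    }
}
```

    The content is never scanned for special characters, so digits and #'s inside the strings need no escaping.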

