Solution by ccwei


  • 1
    C

    Approach #1 Sort string [Accepted]

    Intuition

    You get an anagram by rearranging the letters of the original string to produce a string, using all the original letters exactly once. That means the strings also have the same length. For example: "ate", "eat", "tea" are anagrams because they are formed by using three letters 'a', 'e' and 't', and all have string length == 3.

    One way to check if two strings are anagram is by sorting the letters aphabetically in the string and compare the two strings, if they are the same, they are anagrams. For example, if we sort the three strings mentioned above, they all becomes "aet". We can use this property to come up with our first algorithm. By using a dictionary/map, we could use the sorted result as the key, and store the anagrams as a list as the value.

    Algorithm

    Iterate through the input array and sort the letter in the string alphabetically, and then store them in the dictionary/map as anagrams group.

    C#

    public class Solution {
        public IList<IList<string>> GroupAnagrams(string[] strs) {
            // The dictionary to store anagrams group, key is the sorted string and value is the anagram group
            var result = new Dictionary<string, List<string>>();
            for(int i = 0; i < strs.Length; i++) {
                // Sort the string alphabetically, we are using LINQ here in C# to help sorting
                string sortedString = new String(strs[i].OrderBy(c => c).ToArray());
                
                if(result.ContainsKey(sortedString)) {
                    // if the dictionary already have value for the sorted string, add this string into the anagram group
                    result[sortedString].Add(strs[i]);
                } else {
                    // otherwise create new anagram group in the dictionary using sorted string as the key
                    result[sortedString] = new List<string> { strs[i] };
                }
            }
            
            // Convert the dictionary to the correct output format
            IList<IList<string>> r = new List<IList<string>>();
            foreach(var key in result.Keys) {
                r.Add(result[key]);
            }
            
            return r;
        }
    }
    

    Complexity Analysis

    • Time complexity : $$O(n* klogk)$$ where n is the number of strings in the input, and k is the max length of the strings.

      We loop through the input array which take $$O(n)$$ time, for each iteration of the loop, we sort the string, assuming the string length is k, a regular sorting algorithm takes $$O(klogk)$$ time to sort. So the total time for the loop is $$O(n * klogk)$$.

      At the end we use $$O(n)$$ time to convert the dictionary to list.

      So the total time for the algorithm is $$O(n * klogk) + O(n) = O(n * klogk)$$.

    • Space complexity : $$O(n * k)$$. We create a dictionary to store all the strings which takes $$O(n * k)$$ space.

    Approach #2 Counting Sort string [Accepted]

    Algorithm

    Same concept as Approach#1, but we can do better in the sorting step. Normal comparison based sorting algorithms take $$O(nlogn)$$ time, but if we can use counting sort, we could reduce the sorting time to $$O(nlogn)$$. In this case inputs contain only lower case letters, which is perfect for using counting sort.

    For counting sort, we can simply count the number of times the letter occurs, and generate the sorted string by repeating the letter in the alphabetic order.

    C#

    public class Solution {
        public IList<IList<string>> GroupAnagrams(string[] strs) {
            var result = new Dictionary<string, List<string>>();
            for(int i = 0; i < strs.Length; i++) {
                // Sort the string alphabetically, this time using counting sort
                string sortedString = CountingSort(strs[i]);
                
                if(result.ContainsKey(sortedString)) {
                    // if the dictionary already have value for the sorted string, add this string into the anagram group
                    result[sortedString].Add(strs[i]);
                } else {
                    // otherwise create new anagram group in the dictionary using sorted string as the key
                    result[sortedString] = new List<string> { strs[i] };
                }
            }
            
            // Convert the dictionary to the correct output format
            IList<IList<string>> r = new List<IList<string>>();
            foreach(var key in result.Keys) {
                r.Add(result[key]);
            }
            
            return r;
        }
        
        private string CountingSort(string str) {
            // Array to keep count for alphabet 'a' to 'z'
            int[] count = new int[26];
            
            // Loop through the string to count the occurance for the alphabet
            for(int i = 0; i < str.Length; i++)
            {
                count[(int)(str[i]-'a')]++;
            }
            
            var result = new StringBuilder();
            // Loop through 'a' -> 'z' so they are in sort order
            for(int i = 0; i < 26; i++)
            {
                // Repeat the alphabet count[i] times
                for(int j = 0; j < count[i]; j++)
                {
                    result.Append((char)('a' + i));
                }
            }
            return result.ToString();
        }
    }
    

    Complexity Analysis

    • Time complexity : $$O(n* k)$$ where n is the number of strings in the input, and k is the max length of the strings.

      Same as approach #1, we loop through the input array which take $$O(n)$$ time, for each iteration of the loop, we sort the string, assuming the string length is k, this time using counting sort, it only takes $$O(k)$$ time to sort. So the total time for the loop is $$O(n * k)$$.

      At the end we use $$O(n)$$ time to convert the dictionary to list.

      So the total time for the algorithm is $$O(n * k) + O(n) = O(n * k)$$.

    • Space complexity : $$O(n * k)$$. We create a dictionary to store all the strings which takes $$O(n * k)$$ space.

    Approach #3 Encoding string [Accepted]

    Algorithm

    The concept is the similar concept as Approach#1 and #2, but instead of sorting the string alphabetically, if we can somehow convert the string into a representation that can be used to check if they are anagrams, then we don't need to sort the string.

    How do we want to convert the string? The important attributes to check if two strings are anagrams are the letter and the number of times the letter occurs in the string. If they are the same in two strings, then they are anagrams. For example: aabbc and cbaba are anagrams because they both have two a, two b and one c. So if we convert both strings to something like a2b2c1, we can easily see that both strings would get the same converted string.

    The convert process is actually very similar to the Counting Sort step in approach#2, but we don't have to repeat the alphabet count[i] times, we simply print the count.

    C#

    public class Solution {
        public IList<IList<string>> GroupAnagrams(string[] strs) {
            var result = new Dictionary<string, List<string>>();
            for(int i = 0; i < strs.Length; i++) {
                // Convert the string to a new string representation
                string sortedString = ConvertString(strs[i]);
                
                if(result.ContainsKey(sortedString)) {
                    // if the dictionary already have value for the sorted string, add this string into the anagram group
                    result[sortedString].Add(strs[i]);
                } else {
                    // otherwise create new anagram group in the dictionary using sorted string as the key
                    result[sortedString] = new List<string> { strs[i] };
                }
            }
            
            // Convert the dictionary to the correct output format
            IList<IList<string>> r = new List<IList<string>>();
            foreach(var key in result.Keys) {
                r.Add(result[key]);
            }
            
            return r;
        }
        
        private string ConvertString(string str) {
            // Array to keep count for alphabet 'a' to 'z'
            int[] count = new int[26];
            
            // Loop through the string to count the occurance for the alphabet
            for(int i = 0; i < str.Length; i++)
            {
                count[(int)(str[i]-'a')]++;
            }
            
            // Convert the string to alphabet + alphabet count
            var result = new StringBuilder();
            for(int i = 0; i < 26; i++) {
                result.Append((char)('a' + i));
                result.Append(count[i].ToString());
            }
            
            return result.ToString();
        }
    }
    

    Complexity Analysis

    • Time complexity : $$O(n* k)$$ where n is the number of strings in the input, and k is the max length of the strings.

    • Space complexity : $$O(n * k)$$. We create a dictionary to store all the strings which takes $$O(n * k)$$ space.

    The analysis is the same as Approach #2.


  • 0
    M

    Edit: Now it's probably O(k), since it was changed to use StringBuilder.

    Approach #2 is the worst, because your counting sort isn't O(k) but O(k2).


  • 0
    C

    @ManuelP Can you elaborate more?
    The time complexity for the counting sort is O(k), the first loop goes through the string length which is O(k), and the second loop takes O(26 * k) because the total count equals the string length k, therefore O(k) + O(26 * k) = O(k).


  • 1
    M

    @ccwei It's not O(k). Because result += (char)('a' + i); for your string result is not an O(1) operation, not even amortized.

    It's weird that you do use StringBuilder in your third approach, where it doesn't really matter, but don't use it in your second approach, where it does matter.


  • 0
    C

    @ManuelP Ah got it, thanks for pointing it out. I wasn't thinking about the string concat time complexity too much because the implementation could differ for different langues. Changed it to use string builder so it's O(1) operation.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.