Super simple O(n) Bucket Sort based Java solution (11 ms). No fancy Data structure needed. Beats 96%.


  • 14
    P

    Could not find a simpler way to do this. I see people are using HashMap/TreeMap which are not at all required. If you know bucket sort then following solution will be easy to understand!

    public String frequencySort(String s) {
            if(s.length() < 3)
                return s;
            int max = 0;
            int[] map = new int[256];
            for(char ch : s.toCharArray()) {
                map[ch]++;
                max = Math.max(max,map[ch]);
            }
            String[] buckets = new String[max + 1]; // create max buckets
            for(int i = 0 ; i < 256; i++) { // join chars in the same bucket
                String str = buckets[map[i]];
                if(map[i] > 0)
                    buckets[map[i]] = (str == null) ? "" + (char)i : (str + (char) i);
            }
            StringBuilder strb = new StringBuilder();
            for(int i = max; i >= 0; i--) { // create string for each bucket.
                if(buckets[i] != null)
                    for(char ch : buckets[i].toCharArray())
                        for(int j = 0; j < i; j++)
                            strb.append(ch);
            }
            return strb.toString();
        }
    

  • 1
    M

    @piyush121 The following loop is unnecessary.

    for(int num : map) // find max.
                if(num > max)
                    max = num;
    

    max can be calculated in the previous loop as follows :

    for(char ch : s.toCharArray()) 
                map[ch]++;
                max = Math.max(max,map[ch]);
    }
    
    

  • 1
    N

    Looks like it is not in O(n) in the worst case. For example, consider input cases with heavy repetition, such as: "teeeeeeeeeeee... e. till n times". your algorithm runs O(nkn) [from last loop] which is roughly O(n^2), since k is constant or significantly less in these type of input cases. please correct me if I am wrong.


  • 1
    P

    @mohan.s.upadhyay Yes, You're right. Just updated the code.


  • 0
    K

    I think an array of size 128 is enough, as I used an array of that size and got approved.


  • 0
    P

    @Kongo Yes, I think you're right.


  • 7

    @piyush121 another thing to consider here is the space complexity when your frequency distribution is almost bimodal; as @nikprashanth pointed out earlier, if you have a string with a single character with a high frequency (ex: 1 million) and the rest of the characters in the single digits, you're instantiating an array (for your buckets) w/ size 1 million. It is in situations like these that the heap-based solutions are space-bound by the number of unique characters (as opposed to their counts which arguably has a higher upper bound). Naturally, the bucket-sort solution only works well when the range of characters is constrained (ex: 7-8bit ASCII) but croaks when supplied UTF-8/UTF-16 strings.


  • 0
    P

    @schumpeter Totally agree.


  • 0
    B

    Another thing which is non linear is when you append the characters in the frequency array. String concatenation with + takes quadratic time. It's not too bad here because the number of distinct characters is not too high (128, or maybe 26).


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.