Improve time complexity by mapping each of the 26 English characters to a unique prime number


  • 1

    If we use sort function to check anagram, the time complexity would be O(mnlog(n)) (m means m strings in str and n means n characters in every string)

    So we can check two strings are anagram by mapping them to a product of unique primes, if they are anagram, the product would be the same. Then the time complexity would become O(m*n)

    But there is a problem with this solution. The hash code may be overflow. The best answer post by @StefanPochmann talked about this =)

    The code below is not so clean, just for reference - -

        public class Solution {
        
        private static int[] primes = getPrimes(26);
    
    	public List<List<String>> groupAnagrams(String[] strs) {
    
    		List<List<String>> res = new ArrayList<>();
    
    		if (strs == null || strs.length == 0) {
    			return res;
    		}
    
    		Arrays.sort(strs);
    
    		Map<Long, List<String>> map = new HashMap<>();
    
    		for (String str : strs) {
    
    			long hashCode = getHashcode(str);
    
    			if (map.containsKey(hashCode)) {
    				map.get(hashCode).add(str);
    			} else {
    				map.put(hashCode, new ArrayList<String>());
    				map.get(hashCode).add(str);
    			}
    
    		}
    
    		for (List<String> list : map.values()) {
    			res.add(list);
    		}
    
    		return res;
    
    	}
    	
    	private long getHashcode(String s) {
    
    		long res = 1;
    
    		for (int i = 0; i < s.length(); i++) {
    			res *= primes[s.charAt(i) - 'a'];
    		}
    
    		return res;
    
    	}
    
    	private static int[] getPrimes(int n) {
    
    		int[] res = new int[n];
    
    		boolean[] notPrime = new boolean[n * n];
    
    		int count = 0;
    
    		for (int i = 2; count < n && i < n * n; i++) {
    
    			if (!notPrime[i]) {
    			    
    				res[count] = i;
    
    				count++;
    
    				for (int j = 2; i * j < n * n; j++) {
    					notPrime[i * j] = true;
    				}
    			}
    		}
    
    		return res;
    
    	}
    }

  • 0
    This post is deleted!

  • 1

    It's wrong, for example you fail this input:

    ["aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa","notananagramofthefirststringaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"]
    

    Your code falsely claims they're anagrams.


  • 0

    Yes, you are right, because it is overflowed, I updated my answer with this. However it passed all the test cases - - Thank you!


  • 0

    Finding a counter-example would btw be a lot harder if you didn't include the number 2. I've tried a few things but haven't found one yet.


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.