Maybe buggy but generating special hashcode may reduce the time complexity to O(n)


  • 1
    X

    The following program used special hash function to make sure strings with same characters having the same hash. It could avoid sorting characters in a string, thus yield to O(n) but I am concerned the hash function may be weak to be broken if string is super long.

    public class Solution {
        
        public List<String> anagrams(String[] strs) {
            if (strs == null || strs.length == 0) {
                return new ArrayList<String>();
            }
            
            if (strs.length == 1) {
                return new ArrayList<String>();
            }
            List<String> found = new ArrayList<String>();
            Set<Integer> valid = new HashSet<Integer>();
            Map<Integer, String> index = new HashMap<Integer, String>();
            Map<Character, Integer> dict = getDict();
            for (String s : strs) {
                int hash = specialHash(s, dict);
                if (index.containsKey(hash)) {
                    valid.add(hash);
                    found.add(s);
                } else {
                    index.put(hash, s);
                }
            }
            
            for (Integer t : index.keySet()) {
                if (valid.contains(t)) {
                    found.add(((String)index.get(t)));
                }
            }
            
            return found;
        }
        
        private Map<Character, Integer> getDict() {
            Map<Character, Integer> dict = new HashMap<Character, Integer>();
            for (char c = 'a'; c <='z'; c++) {
                dict.put(c, java.util.UUID.randomUUID().hashCode());
            }
            return dict;
        }
        
        private int specialHash(String s, Map<Character, Integer> dict) {
            int hash = 0;
            char[] charArray = s.toCharArray();
            for (int i = 0; i < charArray.length; i++){
                Character c = new Character(charArray[i]);
                if (dict.containsKey(c)) {
                    hash += 31*((int)dict.get(c));
                } else {
                    hash += 31;
                }
            }
            return hash;
        }
    }

  • 0
    S

    Pay attention to "Writing code? Select all code then click on the {} button to preserve code formatting.” above text editor.


  • 0

    I'm using similar idea. Commutative operations like bit operations, addition and multiplications can be applied. But as you said, collisions can happen and hashing-based code will generate wrong answer. I doubt one can use such kind of algorithm during interview.

    btw, here is my code. I use a map to store mapping between (int)hashkey and (vector of int)indices of strings that share the same hash code. Multimap must be a better choice though.

    int hashstr(string s) {
        int add = 0;
        int mult = 1;
        for(int i = 0; i < s.size(); ++i) {
            add = add + ((int)s[i])*((1<<17)-1); mult*=s[i];
        }
        return add^mult;
    }
    
    vector<string> anagrams(vector<string> &strs) {
        if(!strs.size()) return vector<string>();
        unordered_map<int, vector<int>> um;
        for(int i = 0; i < strs.size(); i++) 
            um[hashstr(strs[i])].push_back(i);
        vector<int> ret;
        for(auto p:um) {
            if(p.second.size() <= 1) continue;
            ret.insert(ret.end(), p.second.begin(), p.second.end());
        }
        vector<string> ret2;
        for(auto i:ret) 
            ret2.push_back(strs[i]);
        return ret2;
    }

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.