Java solution with hash codes


    /**
    Follow-up questions and answers:

    1. If the file content is very large (GB level), how will you modify your solution?
       --> Hash the content (e.g. MD5 or SHA-256) and use the hash as the map key
           instead of the raw content, so each key is constant-size regardless of
           how large the file is.

    2. If you can only read the file 1 KB at a time, how will you modify your solution?
       --> Feed each 1 KB chunk into an incremental digest and use the final digest
           as the key; the whole file never needs to be in memory at once.

    3. What is the time complexity of your modified solution? What is the most
       time-consuming part and memory-consuming part of it? How to optimize?
       --> Hashing is O(total bytes read) and dominates the running time; the
           hash -> paths map dominates memory. Optimize by first grouping files by
           size and only hashing files whose sizes collide.

    4. How to make sure the duplicated files you find are not false positives?
       --> Hashes can collide, so confirm each candidate group with a byte-by-byte
           comparison of the files that share the same hash.
    **/
    
    class Solution {
        public List<List<String>> findDuplicate(String[] paths) {
            // Map file content -> all full paths whose file has that content.
            Map<String, List<String>> contentToPaths = new HashMap<>();

            for (String path : paths) {
                String[] parts = path.split(" ");
                String dir = parts[0];

                // Each remaining token has the form "name.txt(content)".
                for (int i = 1; i < parts.length; i++) {
                    int open = parts[i].indexOf('(');
                    if (open == -1) continue;
                    int close = parts[i].indexOf(')', open + 1);
                    String content = parts[i].substring(open + 1, close);
                    String fileName = parts[i].substring(0, open);
                    contentToPaths.computeIfAbsent(content, k -> new ArrayList<>())
                                  .add(dir + "/" + fileName);
                }
            }

            // Only groups containing more than one path are duplicates.
            List<List<String>> result = new ArrayList<>();
            for (List<String> group : contentToPaths.values()) {
                if (group.size() > 1) result.add(group);
            }

            return result;
        }
    }
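
The 1 KB follow-up can be sketched as below. This is a hypothetical helper (the class and method names `ChunkedHasher` / `hashInChunks` are not part of the solution above): each 1 KB chunk is fed into an incremental `java.security.MessageDigest`, so memory use stays constant no matter how large the file is, and the resulting hex digest would replace the raw content as the map key.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChunkedHasher {
    // Read the stream 1 KB at a time and fold each chunk into an
    // incremental SHA-256 digest; only 1 KB is ever held in memory.
    public static String hashInChunks(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[1024];          // the 1 KB read limit
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);      // incremental update per chunk
        }
        // Render the 32-byte digest as a 64-character hex string.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello world".getBytes("UTF-8");
        // Same content produces the same digest regardless of chunking,
        // so the digest can stand in for the content as the map key.
        String h1 = hashInChunks(new ByteArrayInputStream(data));
        String h2 = hashInChunks(new ByteArrayInputStream(data));
        System.out.println(h1.equals(h2) && h1.length() == 64);
    }
}
```

Note that two files with equal digests should still be confirmed byte-by-byte before being reported, since a hash match alone leaves a (tiny) chance of a false positive.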
    
