Sentence Similarity II https://leetcode.com/problems/sentence-similarity-ii/description/
Use DFS to identify similar words
- Build a graph of words. Model graph as a dictionary with key as word and value as a set of neighbors. Build the graph by iterating over the list of pairs and adding both (a,b) and (b,a) into the graph for every pair [a,b].
- To create a cluster of similar words, initialize a w_map dictionary where key is a word and value is an integer which is the cluster id. Randomly pick any word as the root, and do DFS using this word as the root. Mark all the words visited in this traversal as members of the same component (i.e same component id) and add the mapping word:component_id to the w_map dictionary. Continue performing DFS for every node/word in the graph as long as it has not been marked already (i.e. not in w_map).
- To test similarity, 2 words w1 and w2 are similar if they both are in the same component.
from collections import defaultdict class Solution: def build_graph(self, pairs): graph = defaultdict(set) for pair in pairs: first, second = pair, pair graph[first].add(second) graph[second].add(first) return graph def dfs(self, graph, root, cid, w_map): w_map[root] = cid for nbr in graph[root]: if nbr not in w_map: self.dfs(graph, nbr, cid, w_map) return def cluster_words(self, graph): w_map = defaultdict(int) cid = 1 for word in graph: if word not in w_map: self.dfs(graph, word, cid, w_map) cid += 1 return w_map def areSentencesSimilarTwo(self, words1, words2, pairs): """ :type words1: List[str] :type words2: List[str] :type pairs: List[List[str]] :rtype: bool """ graph = self.build_graph(pairs) w_map = self.cluster_words(graph) if len(words1) != len(words2): return False for w1, w2 in zip(words1, words2): if w1 != w2 and (w1 not in w_map or w2 not in w_map or w_map[w1] != w_map[w2]): break else: return True return False