难度: Hard
原题连接
内容描述
Two strings X and Y are similar if we can swap two letters (in different positions) of X, so that it equals Y.
For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".
Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list A of strings. Every string in A is an anagram of every other string in A. How many groups are there?
Example 1:
Input: ["tars","rats","arts","star"]
Output: 2
Note:
A.length <= 2000
A[i].length <= 1000
A.length * A[i].length <= 20000
All words in A consist of lowercase letters only.
All words in A have the same length and are anagrams of each other.
The judging time limit has been increased for this question.
思路 1 - 时间复杂度: O(N^2 * W)- 空间复杂度: O(N^2)******
开始没有看到这句话All words in A have the same length and are anagrams of each other.
, 导致isSimilar函数写的很繁琐
普通的union-find,超时,原因是当我们的A很长,但是word都很短的时候,我们没有必要两两word比较一下,因为A.length * A[i].length <= 20000
,
假设说我们的A长度为10000,可能就会超时了,因为双重循环里面还要check similar,然后可能还要union
时间复杂度中的W为A中每个word的长度
class UnionFind(object):
def __init__(self, n): # 初始化uf数组和组数目
self.count = n
self.uf = [i for i in range(n)]
def find(self, x): # 判断节点所属于的组
while x != self.uf[x]:
self.uf[x] = self.uf[self.uf[x]]
x = self.uf[x]
return self.uf[x]
def union(self, x, y): # 连接两个节点
x_root = self.find(x)
y_root = self.find(y)
if x_root == y_root:
return
self.uf[y_root] = x_root
self.count -= 1
def getCount(self): # 返回所有组的数目
return self.count
class Solution:
def numSimilarGroups(self, A):
"""
:type A: List[str]
:rtype: int
"""
uf = UnionFind(len(A))
def isSimilar(s1, s2):
tmp = []
for i in range(len(s1)):
if s1[i] != s2[i]:
tmp.append(s1[i])
tmp.append(s2[i])
if len(tmp) != 4:
return False
return tmp[0] == tmp[-1] and tmp[1] == tmp[2]
for (i1, word1), (i2, word2) in \
itertools.combinations(enumerate(A), 2):
if isSimilar(word1, word2):
uf.union(i1, i2)
return uf.getCount()
思路 2 - 时间复杂度: O(N^2 * W) or O(N * W^3)- 空间复杂度: O(N^2)******
吸取了上面思路的教训,我们分两种方法来解决,
- 当
len(A) < len(A[0]) ** 2
的时候,即A比较短,采用brute force on every two word. O(N^2 * W) - check all possible similar words. O(N * W^3)
这个解法在python2可以过,python3过不了,无语。。
class UnionFind(object):
def __init__(self, n): # 初始化uf数组和组数目
self.count = n
self.uf = [i for i in range(n)]
def find(self, x): # 判断节点所属于的组
while x != self.uf[x]:
self.uf[x] = self.uf[self.uf[x]]
x = self.uf[x]
return self.uf[x]
def union(self, x, y): # 连接两个节点
x_root = self.find(x)
y_root = self.find(y)
if x_root == y_root:
return
self.uf[y_root] = x_root
self.count -= 1
def getCount(self): # 返回所有组的数目
return self.count
class Solution:
def numSimilarGroups(self, A):
"""
:type A: List[str]
:rtype: int
"""
uf = UnionFind(len(A))
def isSimilar(s1, s2):
return sum(i != j for i, j in zip(s1, s2)) == 2
lookup = {word: idx for idx, word in enumerate(A)}
if len(A) < len(A[0]) ** 2: # short A
for (i, w1), (j, w2) in itertools.combinations(enumerate(A), 2):
if isSimilar(w1, w2):
uf.union(i, j)
else: # short word
for idx, w1 in enumerate(A):
for i, j in itertools.combinations(range(len(A[0])), 2):
w2 = w1[:i] + w1[j] + w1[i+1:j] + w1[i] + w1[j+1:]
if w2 in lookup:
uf.union(idx, lookup[w2])
return uf.getCount()
思路 3 - 时间复杂度: O(N^2 * W) or O(N * W^3)- 空间复杂度: O(N^2)******
我们换一种union-find, 用字典实现的,然后我们直接union的就是word本身,参考寒神
这里有一个很奇怪的case,第59个test case是这样的:["aaaaaaa", "aaaaaaa", "aaaaaaa"..."aaaaaaa", "aaaaaaa", "aaaaaaa"],
虽然我觉得很困惑,因为anagram的定义是:
An anagram is a word or phrase formed by rearranging the letters of a different word or phrase,
typically using all the original letters exactly once.
也就是说所有的word都必须different才行,
但是这里为了AC没办法了,
- 只能在第一行加上一个
A = list(set(A))
, - 或者不加
A = list(set(A))
也可以,但是要把self.count = len(A)
改成self.count = len(uf)
beats 24.32%
class Solution:
def numSimilarGroups(self, A):
"""
:type A: List[str]
:rtype: int
"""
A = list(set(A))
uf = {word: word for word in A}
self.count = len(A)
def find(x):
if x != uf[x]:
uf[x] = find(uf[x])
return uf[x]
def union(x, y):
x_root = find(x)
y_root = find(y)
if x_root == y_root:
return
uf[x_root] = y_root
self.count -= 1
def isSimilar(s1, s2):
return sum(i != j for i, j in zip(s1, s2)) == 2
if len(A) < len(A[0]): # short A
for w1, w2 in itertools.combinations(A, 2):
if isSimilar(w1, w2):
union(w1, w2)
else: # short word
for w1 in A:
for i, j in itertools.combinations(range(len(A[0])), 2):
w2 = w1[:i] + w1[j] + w1[i+1:j] + w1[i] + w1[j+1:]
if w2 in uf:
union(w1, w2)
return self.count