I. Problem Description
Two strings X and Y are similar if we can swap two letters (in different positions) of X, so that it equals Y.
For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".
Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list A of strings. Every string in A is an anagram of every other string in A. How many groups are there?
Example 1:
Input: ["tars","rats","arts","star"]
Output: 2
Note:
- A.length <= 2000
- A[i].length <= 1000
- A.length * A[i].length <= 20000
- All words in A consist of lowercase letters only.
- All words in A have the same length and are anagrams of each other.
- The judging time limit has been increased for this question.
II. Analysis
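The problem asks us to partition the given strings into groups and report how many groups result. This naturally suggests the union-find (disjoint-set) data structure, which gives a clear way to approach the problem.
By the problem's definition, if two strings are similar they belong to the same group, which corresponds to the Union operation of union-find. In pseudocode:
IF A is similar to B
    Union( A, B )
So all we have to do on top of union-find is run this check for every string in the input. Because similarity is symmetric, each string only needs to be compared with the strings that come after it, which avoids redundant comparisons.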
III. Implementation
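As described above, we need to implement the union-find tree structure and a function that tests whether two strings are similar.
- Union-find has two operations, Find and Union. The implementation below optimizes them with path compression (in its path-halving form) and weighted union (the smaller tree is attached under the larger one).
- To test similarity, count the positions at which the two strings differ. If there are no more than two such positions, the strings are similar. This relies on the guarantee that all strings are anagrams of one another: the count can then never be exactly one, and a count of exactly two means the two mismatched characters are swapped.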
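The counting test above leans on the anagram guarantee. As a hedged illustration of what it would take without that guarantee, here is a minimal sketch of a stricter check. The name `isSimilarStrict` is hypothetical and not part of the solution, and treating identical strings as similar is an assumption made to match the behavior of the counting test.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the solution in this post): returns true
// when `a` and `b` are identical or differ by exactly one swap of two
// positions. It does not assume the inputs are anagrams of each other.
bool isSimilarStrict(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;

    std::vector<std::size_t> diff;  // positions where a and b disagree
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) {
            diff.push_back(i);
            if (diff.size() > 2)
                return false;       // more than one swap would be needed
        }
    }
    if (diff.empty())
        return true;                // assumption: identical strings count as similar
    if (diff.size() != 2)
        return false;               // a single mismatch cannot be fixed by a swap
    // The two mismatched characters must be each other's counterparts.
    return a[diff[0]] == b[diff[1]] && a[diff[1]] == b[diff[0]];
}
```

For the words in Example 1, this sketch accepts the pairs ("tars", "rats") and ("rats", "arts") and rejects ("tars", "arts") and ("star", "tars"). For anagram inputs it agrees with the simpler counting test, which is why the solution can get away with counting alone.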
The time complexity is O(N^2 · L), where N is the number of strings in the input and L is the length of each string.
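As a rough sanity check against the limits listed in the Note (a back-of-the-envelope bound, not a measured figure), comparing every pair of N strings of length L costs at most the following number of character comparisons, ignoring the near-constant union-find overhead:

```latex
\binom{N}{2} \cdot L
  = \frac{N(N-1)}{2} \cdot L
  \le \frac{N \cdot (N L)}{2}
  \le \frac{2000 \cdot 20000}{2}
  = 2 \times 10^{7}
```

The last step uses N <= 2000 and N · L <= 20000, so the pairwise approach stays within a few tens of millions of basic operations in the worst case.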
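#include <string>
#include <vector>

using namespace std;

class Solution
{
public:
    int numSimilarGroups( vector<string> &A )
    {
        int n = A.size();
        num = n;    // every string starts out in its own group
        sizes = new int[n];
        roots = new int[n];
        for ( int i = 0; i < n; ++i ) {
            roots[i] = i;
            sizes[i] = 1;
        }
        // Similarity is symmetric, so compare each string only with later ones.
        for ( int i = 0; i < n; ++i ) {
            for ( int j = i + 1; j < n; ++j ) {
                if ( IsSimilar( A.at( i ), A.at( j ) ) ) {
                    Union( i, j );
                }
            }
        }
        delete [] sizes;
        delete [] roots;
        return num;
    }

private:
    int num;                 // current number of groups
    int *sizes = nullptr;    // sizes[r]: number of nodes in the tree rooted at r
    int *roots = nullptr;    // roots[i]: parent of node i (a root is its own parent)

    // Find with path halving: each visited node is re-pointed at its grandparent.
    int FindRoot( int node )
    {
        while ( roots[node] != node ) {
            roots[node] = roots[roots[node]];
            node = roots[node];
        }
        return node;
    }

    // Weighted union: attach the smaller tree under the root of the larger one.
    void Union( int node_1, int node_2 )
    {
        int root_1 = FindRoot( node_1 ),
            root_2 = FindRoot( node_2 );
        if ( root_1 == root_2 )
            return;
        if ( sizes[root_1] > sizes[root_2] ) {
            roots[root_2] = root_1;
            sizes[root_1] += sizes[root_2];
        } else {
            roots[root_1] = root_2;
            sizes[root_2] += sizes[root_1];
        }
        --num;    // two groups have merged into one
    }

    // A and B are anagrams of equal length, so at most two mismatched
    // positions means they are equal or differ by a single swap.
    bool IsSimilar( const string &A, const string &B )
    {
        int len = A.size(), cnt = 0;
        for ( int i = 0; i < len; ++i ) {
            if ( A.at( i ) != B.at( i ) )
                ++cnt;
            if ( cnt > 2 )
                return false;
        }
        return true;
    }
};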
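For comparison, the same idea can be sketched with `std::vector` in place of manual `new[]`/`delete[]`, so the storage is released automatically. This is only an illustrative variant under the problem's guarantees (equal lengths, all anagrams); the names `DisjointSet` and `countSimilarGroups` are hypothetical and do not come from the solution above.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical disjoint-set helper; the names here are illustrative only.
struct DisjointSet {
    std::vector<int> parent;
    std::vector<int> sz;
    int groups;

    explicit DisjointSet(int n) : parent(n), sz(n, 1), groups(n)
    {
        std::iota(parent.begin(), parent.end(), 0);  // each node is its own root
    }

    int find(int x)
    {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    void unite(int a, int b)
    {
        a = find(a);
        b = find(b);
        if (a == b)
            return;
        if (sz[a] < sz[b])
            std::swap(a, b);   // attach the smaller tree under the larger one
        parent[b] = a;
        sz[a] += sz[b];
        --groups;
    }
};

// Assumes all words have equal length and are anagrams of one another,
// as the problem statement guarantees.
int countSimilarGroups(const std::vector<std::string> &words)
{
    const int n = static_cast<int>(words.size());
    DisjointSet ds(n);
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            int mismatches = 0;
            // Stop scanning as soon as a third mismatch is seen.
            for (std::size_t k = 0; k < words[i].size() && mismatches <= 2; ++k)
                mismatches += (words[i][k] != words[j][k]);
            if (mismatches <= 2)
                ds.unite(i, j);
        }
    }
    return ds.groups;
}
```

Calling `countSimilarGroups({"tars", "rats", "arts", "star"})` gives 2, matching Example 1: "tars" joins "rats", "rats" joins "arts", and "star" stays on its own.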