LeetCode 839. Similar String Groups

1. Problem Description

Two strings X and Y are similar if we can swap two letters (in different positions) of X, so that it equals Y.

For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".

Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.

We are given a list A of strings. Every string in A is an anagram of every other string in A. How many groups are there?

Example 1:

Input: ["tars","rats","arts","star"]
Output: 2

Note:

  1. A.length <= 2000
  2. A[i].length <= 1000
  3. A.length * A[i].length <= 20000
  4. All words in A consist of lowercase letters only.
  5. All words in A have the same length and are anagrams of each other.
  6. The judging time limit has been increased for this question.

2. Analysis

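  The problem asks us to partition the given strings into groups and report how many groups there are. That naturally suggests a disjoint-set (union-find) data structure, which gives a clear way to approach the problem.
  Following the problem statement, whenever two strings are found to be similar they belong to the same group, which corresponds to the Union operation of a disjoint set. In pseudocode: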

IF A is similar to B
	Union( A, B )

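So, on top of the disjoint set, all we need to do is test every pair of strings in the input. Since similarity is symmetric, for each string it is enough to compare it only with the strings that come after it, which halves the number of checks.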


3. Implementation

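  As described above, we need to implement the disjoint set (a forest of trees) and a function that tests whether two strings are similar.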

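  • A disjoint set has two operations, Find and Union. The implementation here uses path compression (in its path-halving form) and union by size as optimizations.
  • To test similarity, count the positions at which the two strings differ. If there are more than two such positions they are not similar; otherwise they are. Because all strings are anagrams of one another, the count can never be exactly 1, so a count of at most 2 means the strings are identical or one swap apart.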
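
The similarity test described above can be sketched as a standalone function. This is a minimal illustration rather than the exact code used in the full solution; the name `isSimilarSketch` is hypothetical, and it assumes both inputs are equal-length anagrams, as the problem guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true when a and b differ in at most two positions.
// Assumes a and b are anagrams of equal length (guaranteed by the problem),
// so the mismatch count can only be 0, 2, or more, never exactly 1.
bool isSimilarSketch( const std::string &a, const std::string &b )
{
    assert( a.size() == b.size() );

    int mismatches = 0;
    for ( std::size_t i = 0; i < a.size(); ++i ) {
        if ( a[i] != b[i] && ++mismatches > 2 )
            return false;  // more than one swap would be needed
    }
    return true;
}
```

With the words from the example, "tars" and "rats" differ only at positions 0 and 2, so the function returns true for them; "tars" and "arts" differ at three positions, so it returns false.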

The time complexity is O(N^2 L), where N is the number of strings in the input and L is the length of each string.
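
To see why a quadratic pairwise scan is affordable here, the constraints from the Note section (N <= 2000 and N * L <= 20000) can be plugged into the pair count. This is a rough worst-case estimate of character comparisons, not a measured figure:

```latex
\frac{N(N-1)}{2} \cdot L \;\le\; \frac{N \cdot (N L)}{2} \;\le\; \frac{2000 \cdot 20000}{2} = 2 \times 10^{7}
```

The early exit in the similarity test (stopping once a third mismatch is seen) usually keeps the actual work below this bound.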

#include <string>
#include <vector>

using std::string;
using std::vector;

class Solution
{
  public:
    int numSimilarGroups( vector<string> &A )
    {
        int n = A.size();
        
        num = n;
        sizes = new int[n];
        roots = new int[n];
        
        for ( int i = 0; i < n; ++i ) {
            roots[i] = i;
            sizes[i] = 1;
        }

        // Compare each pair once (j > i) and merge the groups of similar strings.
        for ( int i = 0; i < n; ++i ) {
            for ( int j = i + 1; j < n; ++j ) {
                if ( IsSimilar( A.at( i ), A.at( j ) ) ) {
                    Union( i, j );
                }
            }
        }

        delete [] sizes;
        delete [] roots;

        return num;
    }

  private:

    int num;
    int *sizes = nullptr;
    int *roots = nullptr;

    int FindRoot( int node )
    {
        // Path halving: point each visited node at its grandparent while walking up.
        while ( roots[node] != node ) {
            roots[node] = roots[roots[node]];
            node = roots[node];
        }

        return node;
    }

    void Union( int node_1, int node_2 )
    {
        int root_1 = FindRoot( node_1 ),
            root_2 = FindRoot( node_2 );

        if ( root_1 == root_2 )
            return;

        // Union by size: attach the smaller tree under the root of the larger one.
        if ( sizes[root_1] > sizes[root_2] ) {
            roots[root_2] = root_1;
            sizes[root_1] += sizes[root_2];
        } else {
            roots[root_1] = root_2;
            sizes[root_2] += sizes[root_1];
        }

        // Two groups were merged into one, so the group count drops by one.
        --num;
    }

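    // Anagrams never differ in exactly one position, so at most two
    // mismatches means the strings are identical or one swap apart.
    bool IsSimilar( const string &A, const string &B )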
    {
        int len = A.size(), cnt = 0;
        for ( int i = 0; i < len; ++i ) {
            if ( A.at( i ) != B.at( i ) )
                ++cnt;

            if ( cnt > 2 )
                return false;
        }

        return true;
    }
};
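
For readers who want to run the approach outside the LeetCode harness, here is a self-contained sketch of the same idea that uses `std::vector` in place of the raw arrays. It is an illustrative variant, not the original author's code: the names `DisjointSet`, `atMostTwoMismatches`, and `countSimilarGroups` are hypothetical, and it assumes the input satisfies the problem's guarantee that all strings are equal-length anagrams.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical disjoint set with path halving and union by size.
struct DisjointSet {
    std::vector<int> parent, sizes;
    int groups;

    explicit DisjointSet( int n ) : parent( n ), sizes( n, 1 ), groups( n )
    {
        std::iota( parent.begin(), parent.end(), 0 );  // every node starts as its own root
    }

    int find( int x )
    {
        while ( parent[x] != x ) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    void unite( int a, int b )
    {
        a = find( a );
        b = find( b );
        if ( a == b )
            return;
        if ( sizes[a] < sizes[b] )
            std::swap( a, b );
        parent[b] = a;  // smaller tree goes under the larger root
        sizes[a] += sizes[b];
        --groups;
    }
};

// Assumes a and b are equal-length anagrams, so the count is 0, 2, or more.
bool atMostTwoMismatches( const std::string &a, const std::string &b )
{
    assert( a.size() == b.size() );
    int mismatches = 0;
    for ( std::size_t i = 0; i < a.size(); ++i ) {
        if ( a[i] != b[i] && ++mismatches > 2 )
            return false;
    }
    return true;
}

// Tests every pair once and returns the number of remaining groups.
int countSimilarGroups( const std::vector<std::string> &words )
{
    const int n = static_cast<int>( words.size() );
    DisjointSet dsu( n );

    for ( int i = 0; i < n; ++i ) {
        for ( int j = i + 1; j < n; ++j ) {
            if ( atMostTwoMismatches( words[i], words[j] ) )
                dsu.unite( i, j );
        }
    }
    return dsu.groups;
}
```

Called on the example input {"tars", "rats", "arts", "star"}, this sketch merges "tars" with "rats" and "rats" with "arts", leaving 2 groups. Using vectors avoids the manual `new[]`/`delete[]` pairing, at the cost of diverging slightly from the layout of the solution above.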

Reposted from blog.csdn.net/AzureoSky/article/details/82959661