Junk-Mail Filter@HDU 2473

Recognizing junk mails is a tough task. The method used here consists of two steps:
1) Extract the common characteristics from the incoming email.
2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.

We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:

a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.

b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.

Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.

Input

There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each in one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.

Output

For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.

Sample Input
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0
Sample Output
Case #1: 3
Case #2: 2

Analysis: this is a union-find (disjoint-set) problem with node deletion.

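Deleting a node from a union-find structure requires virtual parent nodes. When removing a node from a set, the most obvious approach is to change that node's parent directly, but this is not reliable. Suppose every element of the set {1, 2, 3} has 1 as its parent. When 1 is deleted, changing only 1's parent achieves nothing: 2 and 3 still point at 1, so path compression hangs the whole set under whatever new parent 1 was given. The root cause is that node 1 is a real node that appears in the problem and that other nodes reference, so touching it affects the entire set.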
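The failure described above can be reproduced with a minimal sketch. `NaiveDsu` and `naiveDeleteLeaksMembers` are hypothetical names introduced only for this illustration; they are not part of the solution further down, and the spare id 5 is an arbitrary choice.

```cpp
#include <numeric>
#include <vector>

// Plain union-find in which real nodes themselves act as roots
// (no virtual parents). Illustration only.
struct NaiveDsu {
    std::vector<int> parent;
    explicit NaiveDsu(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);  // every node is its own root
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra != rb) parent[rb] = ra;
    }
};

// "Deletes" node 1 by re-pointing only its own parent entry, then reports
// whether 2 and 3 are still in the same set as 1.
bool naiveDeleteLeaksMembers() {
    NaiveDsu d(6);       // ids 1..3 are the real nodes, 5 is a spare id
    d.unite(1, 2);
    d.unite(1, 3);       // {1, 2, 3}, with 2 and 3 pointing at root 1
    d.parent[1] = 5;     // naive deletion: move 1 under a fresh id
    return d.find(2) == d.find(1) && d.find(3) == d.find(1);
}
```

Here `naiveDeleteLeaksMembers()` returns true: nodes 2 and 3 still reach 1 first and follow it to the new root, so nothing was actually separated.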

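A virtual parent should therefore be a node that never appears in the problem's own node set. For example, if the problem uses nodes 1 <= i <= 3 and we initialize the parent of 1 to 4 from the start, then the set {1, 2, 3} ends up rooted at 4. To delete 1, we only need to change 1's parent to another unused virtual value (say 5), which avoids the problem described above.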
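The virtual-parent idea can be sketched as a small standalone structure. The names `DeletableDsu`, `remove`, `countSets`, and `sampleCaseOne` are assumptions made for this sketch, as is the choice to number nodes from 0 and to give every real node its own virtual root at `n + i`; the sketch replays the first sample case from the problem statement.

```cpp
#include <vector>

// Union-find with deletion via virtual roots. Illustration only.
struct DeletableDsu {
    int n;                    // number of real nodes, ids 0..n-1
    int nextVirtual;          // next unused virtual id
    std::vector<int> parent;

    // maxDeletes bounds how many fresh virtual ids remove() may hand out.
    DeletableDsu(int n_, int maxDeletes)
        : n(n_), nextVirtual(2 * n_), parent(2 * n_ + maxDeletes) {
        for (int i = 0; i < n; ++i) parent[i] = n + i;  // real node -> own virtual root
        for (int i = n; i < (int)parent.size(); ++i) parent[i] = i;  // virtual ids are roots
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra != rb) parent[rb] = ra;      // only virtual roots are ever linked
    }
    // Real nodes are always leaves, so re-pointing x affects no other node.
    void remove(int x) { parent[x] = nextVirtual++; }

    // Number of distinct roots among the real nodes.
    int countSets() {
        std::vector<char> seen(parent.size(), 0);
        int count = 0;
        for (int i = 0; i < n; ++i) {
            int r = find(i);
            if (!seen[r]) { seen[r] = 1; ++count; }
        }
        return count;
    }
};

// Replays sample case 1: N = 5 with the six operations listed in the input.
int sampleCaseOne() {
    DeletableDsu d(5, 6);
    d.unite(0, 1);
    d.unite(1, 2);
    d.unite(1, 3);
    d.remove(1);
    d.unite(1, 2);
    d.remove(3);
    return d.countSets();
}
```

`sampleCaseOne()` returns 3, matching "Case #1: 3": the final sets are {0, 1, 2}, {3}, and {4}. Because no real node is ever another node's parent, a deletion is a single assignment.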

The exact setup still has to be worked out for each specific problem.


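#include <cstdio>
#include <set>
// Ids 0..n-1 are the real emails, n..2n-1 their initial virtual roots, and
// ids from 2n upward are spare virtual roots handed out by Del.
// The largest id touched is 2n+m <= 1.2e6, so MAXN is large enough.
#define MAXN 2000003
using namespace std;

int f[MAXN];   // f[x] is the parent of x
int vf;        // next unused virtual root id
set<int> s;    // distinct roots collected when counting the answer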

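// Find the root of a, then compress the path so every visited node points at it.
int getf(int a){
    int x = a;
    while(f[x] != x){
        x = f[x];
    }
    while(f[a] != x){
        int j = a;
        a = f[a];
        f[j] = x;
    }
    return x;
}

// Merge the sets containing a and b. Roots are always virtual ids.
void Union(int a,int b){
    int fa = getf(a);
    int fb = getf(b);
    if(fa != fb) f[fb] = fa;
}

// Detach a from its set by giving it a fresh virtual root.
// Real nodes are always leaves, so no other node is affected.
void Del(int a){
    f[a] = vf++;
}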

int main(){
    int n,m;
    int cases = 1;
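    while(scanf("%d%d",&n,&m) == 2 && (n+m)){
        getchar();
        vf = 2*n;                              // first spare virtual root
        for(int i=0;i<n;++i) f[i] = n+i;       // each email hangs off its own virtual root
        for(int i=n;i<=2*n+m;++i) f[i] = i;    // virtual ids start out as roots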
        s.clear();
        char op;
        for(int i=0;i<m;++i){
            scanf(" %c",&op);   // leading space skips stray whitespace such as '\r'
            if(op == 'M'){
                int a,b;
                scanf("%d%d",&a,&b);
                getchar();
                Union(a,b);
            }
            else{
                int a;
                scanf("%d",&a);
                getchar();
                Del(a);
            }
        }
        // Count the distinct roots among the n real emails.
        for(int i=0;i<n;++i){
            s.insert(getf(i));
        }
        printf("Case #%d: %d\n",cases++,(int)s.size());
    }
    return 0;
}
