[GNN+Anomaly Detection] ANEMONE: Graph Anomaly Detection with Multi-Scale Contrastive Learning

Introduction to the paper

Original title : ANEMONE: Graph Anomaly Detection with Multi-Scale Contrastive Learning
Chinese title (translated) : Graph Anomaly Detection Based on Multi-Scale Contrastive Learning
Publication conference : CIKM
Publication date : 2021-10-26
First author : Ming Jin
LaTeX citation :

@inproceedings{jin2021anemone,
  title={Anemone: Graph anomaly detection with multi-scale contrastive learning},
  author={Jin, Ming and Liu, Yixin and Zheng, Yu and Chi, Lianhua and Li, Yuan-Fang and Pan, Shirui},
  booktitle={Proceedings of the 30th ACM International Conference on Information \& Knowledge Management},
  pages={3122--3126},
  year={2021}
}

Summary

Anomaly detection on graphs plays an important role in network security, e-commerce, financial fraud detection, and other fields. However, existing graph anomaly detection methods usually consider only a single-scale view of the graph, which limits their ability to capture anomalous patterns from different perspectives. To this end, we introduce a new graph anomaly detection framework, ANEMONE, that identifies anomalies at multiple graph scales simultaneously. Specifically, ANEMONE first uses a graph neural network backbone encoder with a multi-scale contrastive learning objective to capture the pattern distribution of graph data by simultaneously learning the agreement between instances at the patch level and the context level. The method then employs a statistical anomaly estimator to score each node according to its degree of agreement across multiple sampled views. Experiments on three benchmark datasets demonstrate the superiority of this method.

Problems

Existing methods mainly detect anomalies from a single-scale perspective, ignoring the fact that node anomalies in graphs often manifest at different scales.

Paper contributions

  1. A multi-scale contrastive learning framework, ANEMONE, is proposed for graph anomaly detection, which can capture anomalous patterns at different scales.
  2. A new statistics-based algorithm is designed to estimate node anomaly scores from the learned contrastive patterns.
  3. Extensive experiments on three benchmark datasets demonstrate the superiority of ANEMONE in detecting node-level anomalies on graphs.

Explanation

1. ANEMONE framework

[Figure: overview of the ANEMONE framework]
For a selected target node, ANEMONE computes its anomaly score with two main components:

  • Multi-scale contrastive learning model : Two GNN-based contrastive networks learn patch-level (i.e., node-to-node) agreement and context-level (i.e., node-to-ego-network) agreement, respectively.
  • Statistical anomaly estimator : Aggregates the patch-level and context-level scores obtained from multiple sampled ego-networks and computes the final anomaly score of the target node through statistical estimation. These two components are introduced in the following sections.
  • Multi-scale contrastive learning model

    Preparation :

    1. Given the input graph $G$, select a target node $v_i$.
    2. With the target node as the center, use a random-walk method to collect two ego-networks (intuitively, two sub-networks traversed outward from the target node), denoted $G_p$ and $G_c$ ($p$ and $c$ stand for patch-level and context-level, respectively).
    3. Place the target (center) node first in the node lists of $G_p$ and $G_c$.
    4. To prevent information leakage in the subsequent contrastive learning step, a preprocessing step called target-node masking is applied to each ego-network before it enters the contrastive networks: the attribute vector of the target node is replaced with a zero vector (see the sketch below).
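A minimal sketch of this preparation step, assuming an adjacency-list graph; the function names, walk length, and restart probability are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def sample_ego_network(adj, target, walk_length=8, restart_prob=0.5, rng=None):
    """Collect an ego-network around `target` with a restart-style random walk.

    `adj` is a dict {node: list of neighbors}. The target node is kept at
    position 0 of the returned node list, as the preparation step requires.
    """
    rng = rng or np.random.default_rng()
    nodes, current = [target], target
    for _ in range(walk_length):
        if rng.random() < restart_prob or not adj[current]:
            current = target                    # restart at the target node
        else:
            current = rng.choice(adj[current])  # hop to a random neighbor
        if current not in nodes:
            nodes.append(current)
    return nodes

def mask_target(features, nodes):
    """Target-node masking: replace the target's attributes with zeros."""
    sub = features[nodes].copy()  # features: (num_nodes, attr_dim) array
    sub[0] = 0.0                  # nodes[0] is the target node
    return sub
```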

    Patch-level contrastive network :

    1. Learn the agreement between the embedding of $G_p$ and the embedding of the target node $v_i$. The ego-network embedding is obtained with a GCN, denoted $H_p$; the target-node embedding is obtained with an MLP, denoted $z_p$:
      $H_p = \sigma\big(\hat{D}_p^{-\frac{1}{2}} \hat{A}_p \hat{D}_p^{-\frac{1}{2}} X_p \Theta\big)$
      $z_p^{(i)} = \sigma\big(x_i \Theta\big)$
    • Note that the GCN and the MLP share the same weights $\Theta$ here, so that the target-node embedding and the ego-network embedding are mapped into the same latent space.
    • Note that when computing the ego-network embedding with the GCN, the target node's attribute vector is the (masked) zero vector; when computing the node embedding with the MLP, the original attribute vector of the target node is used.
    2. Use a bilinear layer to compute their similarity score:
      $s_p^{(i)} = \mathrm{Bilinear}\big(h_p^{(i)}, z_p^{(i)}\big) = \sigma\big(h_p^{(i)} W_p\, z_p^{(i)\top}\big)$
    3. Train with a negative sampling strategy, where the target embedding $z_p^{(i)}$ is paired with the ego-network embedding of a different node $\tilde{i}$:
      $\tilde{s}_p^{(i)} = \sigma\big(h_p^{(\tilde{i})} W_p\, z_p^{(i)\top}\big)$
      $\mathcal{L}_p = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\log s_p^{(i)} + \log\big(1 - \tilde{s}_p^{(i)}\big)\Big)$
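To make the patch-level network concrete, here is a minimal PyTorch sketch. The single-layer GCN, the module names, and the shapes are simplifying assumptions (the paper's encoder may stack multiple layers):

```python
import torch
import torch.nn as nn

class PatchLevelNet(nn.Module):
    """Patch-level contrastive network: a GCN and an MLP sharing weights."""

    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, hid_dim, bias=False)  # shared Theta
        self.bilinear = nn.Bilinear(hid_dim, hid_dim, 1)     # W_p scoring
        self.act = nn.PReLU()

    def forward(self, a_hat, x_masked, x_target):
        # GCN on the masked ego-network: H_p = sigma(A_hat X_p Theta),
        # where a_hat is the normalized adjacency of the ego-network.
        h = self.act(a_hat @ self.theta(x_masked))
        h_target = h[0]                      # row 0 holds the masked target
        # MLP on the raw target attributes: z_p = sigma(x_i Theta)
        z = self.act(self.theta(x_target))
        # s_p = sigma(h_p W_p z_p^T)
        s = torch.sigmoid(self.bilinear(h_target.unsqueeze(0), z.unsqueeze(0)))
        return s.squeeze()
```

For a negative pair, the same target embedding $z_p^{(i)}$ is scored against the ego-network embedding of a different node, and the loss above pushes positive scores toward 1 and negative scores toward 0.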

    Context-level contrastive network :

    1. Learn the agreement between the embedding of $G_c$ and the embedding of the target node $v_i$. The GCN produces node embeddings $H_c$ for $G_c$, which a readout function (average pooling) aggregates into a context-level embedding $e_c$; the target-node embedding is obtained with the MLP, denoted $z_c$:
      $H_c = \sigma\big(\hat{D}_c^{-\frac{1}{2}} \hat{A}_c \hat{D}_c^{-\frac{1}{2}} X_c \Theta_c\big)$
      $e_c^{(i)} = \mathrm{Readout}(H_c) = \frac{1}{n_c}\sum_{k=1}^{n_c}(H_c)_k$
      $z_c$ is computed in the same way as in the patch-level network.
    2. The similarity score is computed in the same way as in the patch-level network (a bilinear layer).
    3. Train with a negative sampling strategy:
      $\mathcal{L}_c = -\frac{1}{2N}\sum_{i=1}^{N}\Big(\log s_c^{(i)} + \log\big(1 - \tilde{s}_c^{(i)}\big)\Big)$

    The two networks are trained jointly; the overall loss is a weighted combination of the two contrastive losses:
    $\mathcal{L} = \alpha\,\mathcal{L}_p + (1-\alpha)\,\mathcal{L}_c$
    where $\alpha$ is a trade-off hyperparameter balancing the two scales.
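A sketch of the context-level scoring and the joint objective under the same assumptions, reusing the patch-level module from above for brevity (the paper trains a separate network per scale, and the readout here is assumed to be average pooling):

```python
import torch

def context_score(net, a_hat, x_masked, x_target):
    """Context-level agreement: pool H_c with an average readout, then score."""
    h = net.act(a_hat @ net.theta(x_masked))  # H_c: ego-network embeddings
    e_c = h.mean(dim=0)                       # readout: average pooling
    z = net.act(net.theta(x_target))          # z_c from the raw attributes
    s = torch.sigmoid(net.bilinear(e_c.unsqueeze(0), z.unsqueeze(0)))
    return s.squeeze()

def joint_loss(loss_patch, loss_context, alpha=0.7):
    """Weighted combination of the two scales; alpha is illustrative."""
    return alpha * loss_patch + (1 - alpha) * loss_context
```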

  • Statistical anomaly estimator

    After the above computation, each node $i$ has $4R$ scores in total (from $R$ rounds of ego-network sampling):
    $\big[\,s_{p,1}^{(i)},\dots,s_{p,R}^{(i)},\; s_{c,1}^{(i)},\dots,s_{c,R}^{(i)},\; \tilde{s}_{p,1}^{(i)},\dots,\tilde{s}_{p,R}^{(i)},\; \tilde{s}_{c,1}^{(i)},\dots,\tilde{s}_{c,R}^{(i)}\,\big]$

    Anomalous nodes are assumed to have less agreement with their surrounding structure and context. Therefore, the base score is defined as the difference between the negative and positive scores:
    $b_{v,j}^{(i)} = \tilde{s}_{v,j}^{(i)} - s_{v,j}^{(i)}, \quad j = 1,\dots,R$
    where the view $v$ can be $p$ (patch) or $c$ (context).

    Statistical methods for anomaly estimation:

    1. Anomalous nodes have relatively large base scores. This is because an anomalous node's embedding usually agrees less with the corresponding ego-network embedding, so $s_{v,j}^{(i)}$ is small while $\tilde{s}_{v,j}^{(i)}$ is large, which makes the base score large.
    2. The base scores of anomalous nodes are unstable across the multiple ego-network samplings.
    3. Therefore, the anomaly scores $y_p^{(i)}$ and $y_c^{(i)}$ are defined as the sum of the mean and the standard deviation of the base scores:
      $y_v^{(i)} = \mathrm{mean}\big(\{b_{v,j}^{(i)}\}_{j=1}^{R}\big) + \mathrm{std}\big(\{b_{v,j}^{(i)}\}_{j=1}^{R}\big), \quad v \in \{p, c\}$
      The final anomaly score of node $i$ combines the patch-level and context-level scores.
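The estimator itself reduces to a few lines; a sketch assuming the $R$ positive and negative scores of one node per view are collected into arrays (the equal view weighting at the end is an illustrative choice):

```python
import numpy as np

def view_anomaly_score(s_pos, s_neg):
    """Anomaly score of one node under one view (patch or context).

    s_pos, s_neg: shape (R,) arrays with the positive and negative scores
    from R rounds of ego-network sampling. Base score b_j = s~_j - s_j;
    the node score mean(b) + std(b) ranks nodes whose agreement is both
    low and unstable as more anomalous.
    """
    base = np.asarray(s_neg) - np.asarray(s_pos)
    return base.mean() + base.std()

# Toy usage: R = 4 sampling rounds for one node (numbers are made up).
y_p = view_anomaly_score([0.9, 0.8, 0.85, 0.9], [0.2, 0.3, 0.25, 0.2])
y_c = view_anomaly_score([0.7, 0.9, 0.6, 0.8], [0.4, 0.1, 0.5, 0.3])
y = 0.5 * y_p + 0.5 * y_c  # view weighting here is an illustrative choice
```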

2. Experiments

  • Datasets

    We conduct extensive experiments on three well-known citation network datasets, namely Cora, CiteSeer and PubMed.

  • Experimental results

    [Table: anomaly detection performance of ANEMONE and baselines on the three datasets]

Takeaways

Paper content

  1. What I learned

    How to write a paper:

    • introduction -> problem statement -> …


Source: blog.csdn.net/Dajian1040556534/article/details/132587546