Federated Learning Paper (+1): Threats to Federated Learning: A Survey

Threats to Federated Learning: A Survey

This article summarizes a survey paper published on arXiv in 2020 that introduces the privacy and security threats to federated learning. It is a good review paper and very helpful for understanding poisoning attacks and inference attacks. What follows are my personal notes on the key points of the paper and may contain misunderstandings; reading the original text is recommended!

1. Background and introduction

As the abstract points out, federated learning may be a solution to the serious privacy challenges faced by traditional centralized training of machine learning models. However, existing federated learning protocols have been shown to contain vulnerabilities that attackers can exploit to compromise data privacy, so the privacy implications of future federated learning algorithm designs are crucial. This article first introduces the concept of federated learning and then provides a unique taxonomy that covers threat models and the two main attacks on federated learning: poisoning attacks and inference attacks. Future research directions are discussed at the end.

According to data characteristics and the distribution of data samples among participants, federated learning is divided into horizontal federated learning (HFL), vertical federated learning (VFL) and federated transfer learning (FTL). Further, according to the number of participants, participation level and technical ability of federated learning training, HFL is further divided into HFL to business (H2B) and HFL to consumer (H2C).

  • The H2B setting has a small number of participants, which are selected frequently during training, and the participants have significant computing power and sophisticated technical abilities.
  • The H2C setting has thousands of participants, only a subset of which is selected in each round of training, and their computing power is generally limited.

1.2 Privacy leakage in federated learning

  • Communicating model updates during training can still reveal sensitive information.
  • Even a small fraction of the original gradient may reveal information about the local data.
  • A malicious attacker can completely steal the training data from the shared gradients within just a few iterations.

The design of the federated learning protocol may contain two vulnerabilities:

  • (1) A potentially malicious server that can observe individual updates over time, tamper with the training process, and control participants' views of the global parameters;
  • (2) Any participant who can observe the global parameters and control its own parameter uploads. For example, a malicious participant may intentionally alter its inputs or introduce a stealthy backdoor into the global model.

This paper investigates recent developments in threats to federated learning, focusing only on two specific threats to federated learning systems from within:

  • (1) Poisoning attacks that attempt to prevent a model from being learned at all, or to bias the model toward producing inferences that favor the adversary;
  • (2) Inference attacks that target the privacy of the participants.

2. Threat Model

2.1 Internal and external

  • Insider attacks can be launched by the federated learning server or by the federated learning participants.
  • Outsider attacks include attacks launched by eavesdroppers on the communication channel between participants and the server, and attacks launched by users of the final federated learning model when it is deployed as a service.

An insider attack is generally stronger than an outsider attack, as it strictly enhances the adversary's capabilities. This article focuses on insider attacks, which can take one of three forms:

  • 1) Single attack: a single, non-colluding malicious participant aims to cause the model to misclassify a set of chosen inputs with high confidence;
  • 2) Byzantine attack: Byzantine malicious participants may behave completely arbitrarily and adapt their outputs to have a distribution similar to correct model updates, making them difficult to detect;
  • 3) Sybil attack: the adversary can impersonate participant accounts, or select previously compromised participants, to launch more powerful attacks on federated learning.

2.2 Semi-honest and malicious

  • In the semi-honest setting, adversaries are considered passive or honest-but-curious: they try to learn the private states of other parties without deviating from the federated learning protocol. It is assumed that a passive adversary only observes the aggregated or averaged gradients, not the training data or gradients of the other honest participants.
  • In the malicious setting, an active or malicious adversary tries to learn the private states of honest participants and deviates arbitrarily from the federated learning protocol by modifying, replaying, or removing messages. This strong adversary model allows the adversary to conduct particularly devastating attacks.

2.3 Training Phase and Inference Phase

  • Attacks in the training phase attempt to learn, influence, or corrupt the FL model itself. During training, an attacker can run data poisoning attacks to compromise the integrity of the training dataset, or model poisoning attacks to compromise the integrity of the learning process. An attacker can also launch a range of inference attacks on the update of a single participant or on the aggregate of updates from all participants.
  • Attacks in the inference phase, known as evasion/exploratory attacks, generally do not tamper with the target model; instead they cause it to produce incorrect outputs (targeted or untargeted) or collect evidence about the model's characteristics. The effectiveness of such attacks depends heavily on the information about the model that is available to the adversary. Inference-phase attacks are divided into white-box attacks (full access to the FL model) and black-box attacks (the FL model can only be queried).

3. Poisoning attack

According to the target of the attack, poisoning attacks can be divided into random attacks and targeted attacks.

  • Random attacks: Reduce the accuracy of federated learning models.
  • Targeted attack: Induce the federated learning model to output the target label specified by the adversary.

Targeted attacks are more difficult than random attacks because the attacker has a specific goal to achieve. During the training phase, poisoning attacks can be performed on the data and model.

  • 1) Data poisoning: data poisoning attacks take place during local data collection and fall into two categories (a minimal sketch follows this list):

    • clean-label: it is assumed that the adversary cannot change the label of any training data, since a certification process ensures that each sample belongs to the correct class; the poisoning of data samples therefore has to be imperceptible.
    • dirty-label: the adversary can introduce into the training set a number of data samples, carrying the target label it desires, that it wants the model to misclassify. A common example is the label-flipping attack: the labels of honest training examples of one class are flipped to another class while the features of the data are kept unchanged.
      Any FL participant can conduct a data poisoning attack. The impact on the FL model depends on how many participants in the system take part in the attack and on the amount of poisoned training data.
  • 2) Model poisoning: model poisoning attacks take place during local model training. Their purpose is to poison local model updates before they are sent to the server, or to insert a hidden backdoor into the global model. In targeted model poisoning, the adversary's goal is to cause the FL model to misclassify a set of chosen inputs with high confidence. Model poisoning has been shown to be more effective than data poisoning in the FL setting through the analysis of a targeted model poisoning attack in which a single, non-colluding malicious participant causes the model to misclassify a set of chosen inputs with high confidence (a schematic of this attack is sketched after the next paragraph).
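As a concrete illustration of the dirty-label case above, here is a minimal sketch of a label-flipping attack on a participant's local dataset. It assumes a simple (source class, target class) flip on NumPy label arrays; the function and variable names are only for illustration and do not come from the paper.

```python
import numpy as np

def flip_labels(y_local: np.ndarray, source_class: int, target_class: int,
                poison_fraction: float = 1.0, seed: int = 0) -> np.ndarray:
    """Label-flipping data poisoning: relabel (a fraction of) the samples of
    `source_class` as `target_class`, leaving the features untouched."""
    rng = np.random.default_rng(seed)
    y_poisoned = y_local.copy()
    idx = np.where(y_local == source_class)[0]
    n_flip = int(len(idx) * poison_fraction)
    flip_idx = rng.choice(idx, size=n_flip, replace=False)
    y_poisoned[flip_idx] = target_class
    return y_poisoned

# Usage: a malicious participant flips all of its "1" labels to "7" before
# local training; the honest features are left unchanged.
y_local = np.array([0, 1, 1, 7, 1, 3])
print(flip_labels(y_local, source_class=1, target_class=7))  # [0 7 7 7 7 3]
```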

Both of these poisoning attacks attempt to modify the behavior of the target model in some undesirable way. If adversaries can compromise FL servers, they can easily perform targeted and untargeted poisoning attacks on trained models.
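The targeted model poisoning described above is often formulated in the literature as explicit boosting: the malicious participant computes an update toward its poisoned objective and scales it so that it survives the server's averaging step. The sketch below is only a schematic of that idea under simple federated averaging, with hypothetical names, not the exact algorithm from any specific paper.

```python
import numpy as np

def malicious_update(global_params: np.ndarray,
                     poisoned_params: np.ndarray,
                     num_participants: int) -> np.ndarray:
    """Explicit boosting (schematic): scale the malicious delta by the number
    of participants so that, after the server averages all updates, the
    aggregate still moves toward the adversary's poisoned model."""
    delta = poisoned_params - global_params
    return num_participants * delta

def server_average(global_params: np.ndarray, updates: list) -> np.ndarray:
    # Plain federated averaging of parameter deltas.
    return global_params + np.mean(updates, axis=0)

# Usage: with 10 participants, 9 honest (near-zero) updates plus one boosted
# malicious update pull the averaged model close to the poisoned parameters.
rng = np.random.default_rng(0)
global_params = np.zeros(4)
poisoned_params = np.array([1.0, -2.0, 0.5, 3.0])
honest = [rng.normal(scale=0.01, size=4) for _ in range(9)]
attacker = malicious_update(global_params, poisoned_params, num_participants=10)
print(server_average(global_params, honest + [attacker]))  # close to poisoned_params
```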

4. Inference attacks

Exchanging gradients during FL training can lead to serious privacy leakage, and model updates may reveal unintended information about features of participants' training data to adversarial participants.
The adversary can also save snapshots of the FL model parameters and perform property inference using the difference between consecutive snapshots, which equals the aggregated updates of all participants minus the adversary's own update (see the sketch below).
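A minimal sketch of the snapshot-difference observation above, assuming the adversary is itself a participant, knows the update it submitted, and that the server simply adds the participants' updates to the global parameters; all names and numbers are illustrative.

```python
import numpy as np

# Global parameters observed by the adversary before and after one round.
theta_before = np.array([0.10, -0.30, 0.50])
theta_after  = np.array([0.05, -0.10, 0.65])

# The update the adversary itself submitted in that round.
my_update = np.array([0.01, 0.02, 0.03])

# Difference between consecutive snapshots = sum of all participants' updates,
# so subtracting the adversary's own update isolates the others' contribution.
others_aggregate = (theta_after - theta_before) - my_update
print(others_aggregate)  # fed to a property-inference classifier
```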

The fundamental reason is that the gradients are derived from the participants' private data: for a fully connected layer, the gradient of the weights is the product of the layer's error and its input features, and similarly, for convolutional layers, the gradient of the weights is the convolution of the error and the features from the previous layer. Observing model updates can therefore be used to infer a wealth of private information, such as class representatives, membership, and properties associated with subsets of the training data. Worse still, an attacker can infer labels from the shared gradients and recover the original training samples without any prior knowledge about the training data (a minimal sketch follows).
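To make the statement about gradients concrete, the following sketch shows that for a single fully connected layer trained with softmax cross-entropy, the weight gradient is the outer product of the prediction error and the input, so every non-zero row of the shared gradient is a scaled copy of the private input. This is a standard fact used to build the illustration, not code from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(6)                     # a participant's private input feature vector
y = np.array([0.0, 0.0, 1.0])         # one-hot private label
W = rng.normal(size=(3, 6))           # weights of a single fully connected layer

# Forward pass with softmax cross-entropy.
logits = W @ x
p = np.exp(logits - logits.max())
p /= p.sum()

# Gradient shared in FL: dL/dW = (p - y) x^T  (outer product of error and input).
grad_W = np.outer(p - y, x)

# Each row of grad_W is proportional to x, so x is recoverable up to a scale factor.
row = grad_W[0]
print(np.allclose(row / row[0], x / x[0]))  # True
```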

4.1 Inferring class representatives

Hitaj et al. designed an active inference attack based on Generative Adversarial Networks (GANs) against deep FL models, in which a malicious participant can intentionally compromise any other participant. The GAN attack exploits the real-time nature of the FL training process: the adversary trains a GAN to generate prototypical samples of the targeted training data, which are supposed to be private. The generated samples appear to come from the same distribution as the training data, so the goal of the GAN attack is not to reconstruct the actual training inputs but the class representatives. Note that the GAN attack assumes the entire training corpus for a given class comes from a single participant, so the GAN-constructed representatives resemble the training data only in the special case where all class members are similar. The GAN attack is also not well suited to H2C scenarios, as it requires substantial computing resources.

4.2 Inferring membership

The purpose of a membership inference attack is to determine whether a particular data sample was used to train the model. For example, an attacker could infer whether a specific patient profile was used to train a classifier associated with a disease. In FL, the adversary's goal is to infer whether a particular sample belongs to the private training data of a single party (if the targeted update is from a single party) or of any party (if the targeted update is the aggregate).
Attackers in FL systems can conduct both active and passive membership inference attacks.

  • In the passive case, the attacker only observes the updated model parameters and performs the inference without changing anything in the local or global collaborative training procedure.
  • In the active case, the attacker can tamper with the FL training protocol and conduct a more powerful attack against other participants. Specifically, the attacker shares malicious updates and forces the FL model to reveal more information about the local data of the participants the attacker is interested in. This attack, called a gradient ascent attack, exploits the fact that SGD optimization updates the model parameters in the direction opposite to the gradient of the loss (see the sketch after this list).
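A schematic of the gradient ascent idea, assuming a PyTorch model; the helper names and the simple loss-drop score are illustrative assumptions, not the exact attack procedure from the survey.

```python
import torch
import torch.nn.functional as F

def gradient_ascent_update(model: torch.nn.Module,
                           target_x: torch.Tensor,
                           target_y: torch.Tensor,
                           lr: float = 0.1):
    """Attacker's crafted contribution: move the parameters in the direction
    that INCREASES the loss on the target record (the opposite of SGD)."""
    loss = F.cross_entropy(model(target_x), target_y)
    grads = torch.autograd.grad(loss, model.parameters())
    return [lr * g for g in grads]   # honest SGD would apply -lr * g instead

def membership_score(loss_before: float, loss_after: float) -> float:
    """If the target record is in some participant's training data, its local
    SGD will sharply reduce the loss the attacker just inflated; a large drop
    is therefore evidence of membership."""
    return loss_before - loss_after
```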

4.3 Inferring properties

Adversaries can launch passive and active property inference attacks to infer properties of other participants' training data that are independent of the features characterizing the classes of the FL model. Property inference attacks assume that the adversary has auxiliary training data correctly labeled with the property it wants to infer. A passive adversary can only observe or eavesdrop on the updates and performs the inference by training a binary property classifier. An adversarial participant can even infer when a property appears and disappears in the data during training.

4.4 Inferring training inputs and labels

The recent work "Deep Leakage of Gradients" (DLG) by Zhu2019 et al. proposes an optimization algorithm that can obtain training inputs and labels in just a few generations of iterations. This attack is much stronger than previous methods. It recovers pixel-accurate raw images and marker-level matched raw text. Inference attacks usually assume that the adversary possesses sophisticated technical capabilities and large computational resources. Furthermore, adversaries must be selected for multiple FL training sessions. Therefore, it is not suitable for H2C scenario, but more likely in H2B scenario. Such attacks also highlight the need to protect shared gradients during FL training, possibly through mechanisms such as homomorphic encryption.

5. Discussion and future research directions

To improve the robustness of FL systems, there are still some potential vulnerabilities that need to be addressed. Below, the authors outline research directions that they consider promising.

5.1 The curse of dimensionality

Large models with high-dimensional parameter vectors are particularly vulnerable to privacy and security attacks. Most FL algorithms require local model parameters to be overwritten by the global model, which makes them susceptible to poisoning and backdoor attacks, because an adversary can make small but damaging changes to a high-dimensional model without being detected. To address these fundamental shortcomings of FL, it is worth exploring whether sharing model updates is necessary at all. Instead, sharing less sensitive information (such as signSGD) or sharing only model predictions in a black-box manner may provide stronger privacy protection in FL (see the sketch below).
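As an example of sharing less sensitive information, here is a minimal sketch of sign-only updates in the spirit of signSGD: each participant transmits only the sign of its local gradient, and the server aggregates by an element-wise majority vote. The function names and learning rate are illustrative assumptions.

```python
import numpy as np

def client_update(local_gradient: np.ndarray) -> np.ndarray:
    # Only one bit of information per parameter leaves the client.
    return np.sign(local_gradient)

def server_aggregate(sign_updates: list, lr: float = 0.01) -> np.ndarray:
    votes = np.sum(sign_updates, axis=0)   # element-wise vote count
    return -lr * np.sign(votes)            # majority-vote descent step

# Usage: three simulated clients on a 5-parameter model.
rng = np.random.default_rng(0)
grads = [rng.normal(size=5) for _ in range(3)]
step = server_aggregate([client_update(g) for g in grads])
print(step)
```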

5.2 Threats to the VFL

In VFL, only one party may hold the labels for a given learning task. It is unclear whether all parties are equally capable of attacking the FL model, and whether the threats identified for HFL also apply to VFL. Most current threat research still focuses on HFL, so threats to VFL are worth exploring.

5.3 FL with Heterogeneous Architectures

Shared model updates are usually limited to homogeneous FL architectures, where the same model is shared with all participants. It will be interesting to investigate how FL can be extended to collaboratively trained models for heterogeneous architectures, and whether existing attack and privacy techniques can be adapted to this paradigm.

5.4 Decentralized Federated Learning

There is ongoing work on decentralized federated learning, which does not require a single central server. This is a potential learning framework for collaboration among businesses that do not trust any third party. In this setting, each participant may be elected as the server in turn. It may also introduce new attack problems; for example, the party most recently elected as the server could choose to insert a backdoor, which may contaminate the entire model.

5.5 Current Defense Vulnerability

Federated learning with secure aggregation is particularly vulnerable to poisoning attacks because individual updates cannot be inspected. It is also unclear whether adversarial training is suitable for federated learning: adversarial training was developed mainly for IID data, and how it performs in non-IID settings remains a challenging open problem.

5.6 Optimizing the Deployment of Defense Mechanisms

When a defense mechanism is deployed to detect whether an adversary is attacking the federated learning system, the federated learning server incurs additional computational overhead. Studying how to optimize the deployment of defense mechanisms is therefore of great significance.
