What is federated learning? Introduction to federated learning

1 Background of federated learning

In many cases, data is scattered across different companies. Each company would like to join forces with the others, using the data each one holds, to train a model without disclosing its own data. Such a model can help the enterprises obtain greater benefits.

The traditional way to bring scattered data together is to build a data center and train the model there. However, with growing legal restrictions and data owners' reluctance to disclose their data, this approach has gradually stopped working.

2 Introduction to federated learning

Definition: Federated learning, short for federated machine learning, is a method proposed to solve the privacy problem that arises when models are trained jointly. Each enterprise trains the model on its own data. After training, each enterprise uploads its model parameters to a central server (a peer-to-peer arrangement is also possible). The central server combines what the enterprises upload (either gradients or their updated parameters) into new parameters, for example through a weighted average; this step is called federated aggregation. The new parameters are distributed back to each enterprise, which deploys them to its model and continues training. This process is iterated until the model converges or some other stopping condition is met.
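As a rough illustration of the federated aggregation step, here is a minimal sketch of a weighted average in plain Python. The function name `federated_average`, the three enterprises, and the choice of weighting by local sample count are assumptions made for this example; the definition above only says that a weighted average is one possible way to combine the parameters.

```python
def federated_average(client_params, client_sizes):
    """Combine client parameter vectors into one vector.

    Each client's parameters are weighted by its number of local samples
    (an assumed weighting scheme; other weights are possible).
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[j] * size for params, size in zip(client_params, client_sizes)) / total
        for j in range(n_params)
    ]


# Three hypothetical enterprises each upload a model with two parameters.
params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]  # number of local training samples per enterprise

new_params = federated_average(params, sizes)
print(new_params)
```

The enterprise with the most samples pulls the aggregated parameters closest to its own values, which is the intended effect of weighting by data size.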
A model trained with federated learning generally performs somewhat worse than one trained directly on the pooled data. This loss is an acceptable price for privacy protection. In practice it may not even be a loss: the missing information can act like regularization, so the result may improve.

3 Classification of Federated Learning

In order to better understand the classification of federated learning, first define the data:

The data owned by each enterprise can be regarded as a table. Each row is a sample, and each column is a feature or a label. For example, the figure below shows house price data from various places compiled by one enterprise:
[Figure: sample table of house price data held by one enterprise]
Horizontal federated learning, vertical federated learning, and federated transfer learning are distinguished by how similar the participants' data are, whereas federated reinforcement learning focuses on each party making decisions (taking actions) based on its own environment.

3.1 Horizontal federated learning

Horizontal federated learning applies when the parties' data have essentially the same features and each party holds its own labels. If all parties' data were pooled into one central table, each party would hold different samples (rows) of that table; "horizontal" refers to splitting the central table along its rows.
The process of horizontal federated learning (with a central server) is summarized in the figure below:
[Figure: workflow of horizontal federated learning with a central server]
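The round-based workflow from the definition in section 2 (local training, upload, aggregation, redistribution) can be sketched for the horizontal case as follows. This is a toy illustration under stated assumptions, not a production protocol: the model is a one-feature linear regression, the two clients and their data are made up, and the functions `local_train` and `aggregate` are hypothetical names.

```python
def local_train(w, b, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's private data."""
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b


def aggregate(updates, sizes):
    """Server side: average the uploaded parameters, weighted by sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


# Two hypothetical clients: same feature x, different samples of y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]

w, b = 0.0, 0.0  # initial global model
for _ in range(300):
    # Each client trains locally, starting from the current global model.
    updates = [local_train(w, b, data) for data in clients]
    # Only parameters are uploaded; the raw samples never leave the client.
    w, b = aggregate(updates, [len(data) for data in clients])

print(round(w, 3), round(b, 3))
```

Because both clients' samples come from the same underlying line, the aggregated model settles near w = 2 and b = 1 even though neither client ever sees the other's rows.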

3.2 Vertical Federated Learning

Vertical federated learning applies when the parties' data have largely different features but cover many of the same individuals (for example, the same person's records at a bank and at an insurance company). If the data were pooled into one central table, each party would hold different attributes (columns) of the same samples; "vertical" refers to splitting the central table along its columns.
In vertical federated learning, typically only one party holds the labels, and training is carried out on the samples that the parties have in common.
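To make the difference between the two partitions concrete, the sketch below splits one toy table both ways. The table contents, the column names, and the helper `select` are invented for illustration only.

```python
# Hypothetical house-price table: each row is a sample, "price" is the label.
columns = ["id", "area", "rooms", "price"]
table = [
    ["U1", 80, 3, 300],
    ["U2", 55, 2, 210],
    ["U3", 120, 4, 450],
    ["U4", 65, 2, 240],
]

# Horizontal partition: both parties have the same columns but different rows.
horizontal_a = table[:2]
horizontal_b = table[2:]


def select(rows, names):
    """Keep only the named columns of every row."""
    idx = [columns.index(name) for name in names]
    return [[row[i] for i in idx] for row in rows]


# Vertical partition: both parties cover the same ids but different columns.
# Here only party A is assumed to hold the label column.
vertical_a = select(table, ["id", "area", "price"])
vertical_b = select(table, ["id", "rooms"])

print(horizontal_a, horizontal_b)
print(vertical_a, vertical_b)
```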

The process of vertical federated learning is slightly more complicated. The first step is to align the data. Since the data cannot be leaked, the alignment is performed on encrypted data, as shown in the figure below:
[Figure: encrypted alignment of the two parties' data]
The aligned data is roughly as follows (the alignment only yields the samples that appear in both data sets, namely U1 and U2 in the figure below):
[Figure: aligned data containing the shared samples U1 and U2]
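The idea of aligning samples without exchanging raw identifiers can be sketched as below. This is a deliberately simplified stand-in: it compares salted SHA-256 digests of the IDs, which is an assumption made for this example and is weaker than the encrypted alignment (private set intersection) protocols used in real systems. The salt, the ID lists, and the helper `blind` are all hypothetical.

```python
import hashlib


def blind(ids, salt):
    """Map each salted digest back to the raw ID; the map stays with its owner."""
    return {hashlib.sha256((salt + i).encode()).hexdigest(): i for i in ids}


shared_salt = "hypothetical-shared-secret"  # agreed on by both parties beforehand
bank_ids = ["U1", "U2", "U3"]
insurer_ids = ["U1", "U2", "U4", "U5"]

bank_blinded = blind(bank_ids, shared_salt)
insurer_blinded = blind(insurer_ids, shared_salt)

# Only the digests are compared; IDs outside the intersection are not revealed in clear.
common = bank_blinded.keys() & insurer_blinded.keys()
aligned = sorted(bank_blinded[h] for h in common)

print(aligned)
```

With these inputs the intersection consists of U1 and U2, matching the aligned samples named in the text.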
Since only one party has the labels but both parties (taking two participants as an example) produce predictions, the loss function is redefined as follows, where $y^{(i)}$ is the label of sample $i$:
$$L=\frac{1}{2}\sum_i\left(y_A^{(i)}+y_B^{(i)}-y^{(i)}\right)^2$$
Each party's model only has parameters for its own features. In the example above:
$$y_A^{(U_1)}=w_3X_3+w_4X_4+w_5X_5,\qquad y_B^{(U_1)}=w_1X_1+w_2X_2$$
This is just a simple example. In practice the computation is done in matrix form and the model contains multiple hidden layers.

This means that during both training and prediction, the participants need to coordinate and exchange data: each participant computes its own predictions and gradients and sends the results to the central server for aggregation.
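The interplay between the two partial predictions and the shared loss can be sketched as follows. The sketch assumes two parties with the linear models from the example, assumes party A holds the labels, and exchanges the residuals in plaintext for readability; a real deployment would protect these intermediate values (for example with encryption), which is omitted here. All data values and function names are made up.

```python
def partial_prediction(weights, rows):
    """Each party predicts using only its own features and weights."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def loss(residuals):
    """L = 1/2 * sum over samples of (y_A + y_B - y)^2."""
    return 0.5 * sum(d * d for d in residuals)


def gradient(residuals, rows):
    """Gradient of L with respect to one party's own weights."""
    return [
        sum(d * row[j] for d, row in zip(residuals, rows))
        for j in range(len(rows[0]))
    ]


# Aligned samples: party A holds X3, X4, X5; party B holds X1, X2.
XA = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
XB = [[1.0, 2.0], [3.0, 1.0]]
y = [3.0, 2.0]  # labels, assumed to be held by party A only

wA = [0.0, 0.0, 0.0]
wB = [0.0, 0.0]
lr = 0.02
losses = []
for _ in range(50):
    yA = partial_prediction(wA, XA)
    yB = partial_prediction(wB, XB)
    # The shared residual is the only quantity both parties need to see.
    residuals = [a + b - t for a, b, t in zip(yA, yB, y)]
    losses.append(loss(residuals))
    gA = gradient(residuals, XA)
    gB = gradient(residuals, XB)
    wA = [w - lr * g for w, g in zip(wA, gA)]
    wB = [w - lr * g for w, g in zip(wB, gB)]

print(losses[0], losses[-1])
```

Each party updates only the weights of its own features, yet the loss it reduces depends on both partial predictions, which is why coordination is needed at every step.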

3.3 Federated transfer learning

Federated transfer learning refers to the situation where the participants' data overlap very little: the individuals behind the data differ, and so do the features. If the data sets were pooled into one central table, many positions would be empty, and each participant would own a roughly independent block of that table.
[Figure: data layout for federated transfer learning, with a red box and a blue box marking the regions defined below]
We define the following parameters:

  • $\phi$: classification function, acting on $u_i^B$;
  • $D_c$: the red box in the figure;
  • $D_{AB}$: the blue box in the figure.

In federated transfer learning, only one party has a label. Through federated transfer learning, the intersecting data can be used to label the data of the unlabeled party.

A simple model is shown in the figure below:
[Figure: a simple federated transfer learning model]
The loss function of the model (without considering regularization):
$$\begin{aligned} &L=L_1+\gamma L_2\\ &L_1=\sum_i^{N_c}\log\left(1+e^{-y_i\phi(u_i^B)}\right)\\ &L_2=\sum_i^{N_{AB}}\left\lVert u_i^A-u_i^B\right\rVert_F^2 \end{aligned}$$
We want to minimize both terms of the loss above (taking binary classification as an example, with label values -1 and 1):

  • $L_1$: measures closeness to the true label. When $y_i=1$, the term shrinks as $\phi(u_i^B)$ grows, so minimizing it pushes the prediction toward the label $1$; when $y_i=-1$, it pushes the prediction toward $-1$.
  • $L_2$: measures the similarity of the two models' representations. Because the samples used to train the two models share the same labels, we want the two feature representations to be as similar as possible.
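The two terms of the loss can be evaluated on a tiny example as follows. The sketch assumes that $\phi$ is a plain linear function with weights `w`, that the sum in $L_1$ runs over the labelled samples in $D_c$, and that the sum in $L_2$ runs over the overlapping samples in $D_{AB}$. All numbers and function names are made up for illustration.

```python
import math


def phi(u, w):
    """Hypothetical linear classifier acting on a representation u_i^B."""
    return sum(wi * ui for wi, ui in zip(w, u))


def loss_l1(labels, reps_b, w):
    """L1: logistic loss between the labels y_i and phi(u_i^B)."""
    return sum(
        math.log(1.0 + math.exp(-y * phi(u, w))) for y, u in zip(labels, reps_b)
    )


def loss_l2(reps_a, reps_b):
    """L2: squared distance between the two parties' representations."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(ua, ub)) for ua, ub in zip(reps_a, reps_b)
    )


w = [1.0, -1.0]
labels_c = [1, -1]                    # labels of the samples in D_c
reps_b_c = [[2.0, 0.0], [0.0, 2.0]]   # u_i^B for the samples in D_c
reps_a_ab = [[1.0, 1.0]]              # u_i^A for the samples in D_AB
reps_b_ab = [[1.0, 0.0]]              # u_i^B for the samples in D_AB
gamma = 0.5

l1 = loss_l1(labels_c, reps_b_c, w)
l2 = loss_l2(reps_a_ab, reps_b_ab)
total = l1 + gamma * l2

# Flipping the labels makes phi disagree with them, so L1 becomes larger.
l1_flipped = loss_l1([-1, 1], reps_b_c, w)

print(l1, l2, total, l1_flipped)
```

The comparison with the flipped labels illustrates the role of $L_1$: the term is small when the sign of $\phi(u_i^B)$ agrees with the label and large when it does not.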

3.4 Federated reinforcement learning

Federated reinforcement learning combines reinforcement learning with federated learning. It comes in two types, horizontal and vertical, defined analogously to the horizontal and vertical settings above. A simple model of horizontal federated reinforcement learning is shown below:
[Figure: a simple model of horizontal federated reinforcement learning]
In the figure above, each participant trains in its own environment. The trained models are uploaded to the central server for aggregation, and the server then sends the aggregated model back to the participants to continue training.
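A very reduced sketch of this loop is given below. It assumes each participant is a tabular agent facing its own two-action environment with deterministic rewards, and that the server simply averages the value tables. These are simplifications chosen for this example; the environments, reward values, and function names are hypothetical.

```python
def local_update(q, rewards, alpha=0.5, steps=10):
    """Update action values on one agent's own (deterministic) environment."""
    q = list(q)  # work on a copy of the global table
    for _ in range(steps):
        for action, reward in enumerate(rewards):
            q[action] += alpha * (reward - q[action])
    return q


def average(tables):
    """Server side: element-wise mean of the uploaded value tables."""
    return [sum(column) / len(tables) for column in zip(*tables)]


# Two hypothetical agents; each row lists the reward of action 0 and action 1.
env_rewards = [[1.0, 0.0], [0.8, 0.2]]

global_q = [0.0, 0.0]
for _ in range(5):
    # Each agent trains in its own environment, starting from the global table.
    local_qs = [local_update(global_q, rewards) for rewards in env_rewards]
    # The server aggregates and sends the result back for the next round.
    global_q = average(local_qs)

best_action = max(range(len(global_q)), key=global_q.__getitem__)
print(global_q, best_action)
```

The aggregated table reflects the experience of both environments, so the shared policy prefers the action that is better on average across the agents.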

A simple model of vertical federated reinforcement learning is shown below (dotted lines in the figure indicate components that may not exist):
[Figure: a simple model of vertical federated reinforcement learning]
The process in the figure above is similar to that of horizontal federated reinforcement learning.

4 Federated Learning and Distributed Machine Learning

Personally, I think federated learning is a variant of distributed machine learning. Traditional distributed machine learning (also called scalability-oriented distributed machine learning) focuses on how to use a distributed cluster to train a huge model when the hardware resources of a single machine are insufficient. In federated learning, the data already resides on the individual nodes, and because of privacy protection a method similar to distributed learning has to be used. (Traditional distributed machine learning appears to perform better than federated learning, because it has access to all the data.) Privacy-preserving distributed machine learning, which was proposed later, is something of a prototype of federated learning: the participants hold different features of the same samples and want to train a model while preserving privacy, which closely resembles vertical federated learning. Federated learning, which came later still, extends privacy-preserving distributed machine learning.

The Venn diagram of the two is roughly as follows:
[Figure: Venn diagram of federated learning and distributed machine learning]

5 Reference

"Federated Learning" by Yang Qiang et al.

Origin blog.csdn.net/qq_45523675/article/details/127861115