[Paper Notes] "Blockchained On-Device Federated Learning" Intensive Reading Notes

Basic information of the paper:

DOI: 10.1109/LCOMM.2019.2921755

Table of Contents

1. INTRODUCTION

2. ARCHITECTURE AND OPERATION

2.1 One-Epoch BlockFL Operation

2.2 FL operation in BlockFL

3. END-TO-END LATENCY ANALYSIS

3.1 One-Epoch BlockFL Latency Model

3.2 Latency Optimal Block Generation Rate

4. NUMERICAL RESULTS AND DISCUSSION

5. Thoughts


 

1. INTRODUCTION

Traditional federated learning (FL) has the following limitations:

(1) It relies on a single central server, so it is vulnerable to server failure;

(2) There is no suitable reward mechanism to incentivize users to contribute data, train locally, and upload model parameters.

To address this, the authors propose blockchained federated learning (BlockFL):

(1) The central server is replaced by a blockchain network, which enables devices to exchange their local model updates;

(2) Local model updates are verified, and a corresponding reward mechanism is provided.

Once a blockchain is introduced, latency must be considered, because higher latency leads to more frequent forking. The main sources of delay are:

Computation delay: training the local model and computing the global model update locally;

Communication delay: uploading the local model, downloading the model, block propagation delay, and block verification delay (which can be ignored);

Block generation: the block generation (PoW) delay.

Accordingly, the authors analyze the delay introduced by BlockFL's blockchain network, study the end-to-end learning completion latency, and minimize it by tuning the block generation rate, i.e., the PoW difficulty, thereby improving the system's practicality.
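To make the delay sources concrete, here is a minimal sketch (my own illustration, not the paper's latency model) that simply adds up the per-epoch components listed above; all parameter names and values are assumptions.

```python
# Illustrative decomposition of the per-epoch delay sources listed above.
# Not the paper's latency model; all names and numbers are assumptions.

def one_epoch_delay(t_local_train, t_global_update, model_bits,
                    uplink_bps, downlink_bps, t_block_propagation, block_rate):
    computation = t_local_train + t_global_update          # local training + local global-model update
    communication = (model_bits / uplink_bps               # local model upload
                     + model_bits / downlink_bps           # block (global model) download
                     + t_block_propagation)                 # block propagation; verification neglected
    block_generation = 1.0 / block_rate                     # mean PoW time at block generation rate lambda
    return computation + communication + block_generation

print(one_epoch_delay(t_local_train=0.5, t_global_update=0.05, model_bits=1e6,
                      uplink_bps=2e6, downlink_bps=10e6,
                      t_block_propagation=0.2, block_rate=0.5))   # -> 3.35 s
```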

 

2. ARCHITECTURE AND OPERATION

2.1 One-Epoch BlockFL Operation

Fig.1.a is the architecture of traditional FL, so I won’t repeat it.

Fig.1.b is the architecture of BlockFL. Logically, BlockFL consists of devices and miners. Physically, miners are either randomly selected devices or separate nodes.

Each epoch of BlockFL training can be divided into 7 steps (a toy code sketch of one epoch follows the list):

1) Local model update: Device D_{i} uses its local samples to train the local model. [Eq. (1) in the paper]

2) Local model upload: Device D_{i} uploads its local model update and local computation time to its associated miner, and receives a data reward from that miner.

3) Cross-verification: Miners broadcast the local model updates and verify them. Verified updates are recorded in each miner's candidate block (until the block size or the maximum waiting time is reached).

4) Block generation: Each miner runs PoW until it either finds a nonce or receives a generated block from another miner.

5) Block propagation: The candidate block of the miner who first finds a nonce becomes the new block and is propagated to the other miners; that miner receives a mining reward from the blockchain network. To avoid forks, every miner sends an ACK upon receiving the new block, indicating whether a fork has occurred. If a fork occurs, the operation restarts from step 1). The miner that generated the new block waits for a predefined maximum block ACK waiting time.

6) Global model download: Device D_{j} downloads the new block from its associated miner.

7) Global model update: Device D_{j} computes the global model update locally.

Termination condition: \left| w^{L} - w^{L-1} \right| < \varepsilon.
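Below is the toy, runnable sketch of the seven steps referenced above, using scalar models. The fork handling, the trivial verification check, and the plain averaging used as the global model are my simplifications, not the paper's exact rules (rewards and ACKs are only hinted at in comments).

```python
# Toy, runnable sketch of one BlockFL epoch with scalar models (illustrative only;
# PoW, verification, rewards, and the global aggregation rule are drastically simplified).
import random

def train_local(samples, w, lr=0.1):
    # Step 1: one pass of gradient steps on a scalar model y ~ w * x.
    for x, y in samples:
        w -= lr * (w * x - y) * x
    return w

def blockfl_epoch(device_samples, w_global, n_miners=3, fork_prob=0.1):
    # Steps 1-2: each device trains locally and uploads its update to its associated miner.
    updates = [train_local(s, w_global) for s in device_samples]

    # Step 3: miners cross-verify the updates and fill their candidate blocks.
    candidate_block = [u for u in updates if abs(u) < 1e6]   # trivial sanity check

    # Step 4: PoW race; the miner with the smallest random "mining time" wins.
    winner = min(range(n_miners), key=lambda m: random.expovariate(1.0))

    # Step 5: block propagation; if a fork is detected, the epoch restarts from step 1.
    if random.random() < fork_prob:
        return blockfl_epoch(device_samples, w_global, n_miners, fork_prob)

    # Steps 6-7: devices download the new block and compute the global model locally
    # (plain averaging here stands in for the paper's aggregation rule).
    return sum(candidate_block) / len(candidate_block)

random.seed(0)
samples = [[(1.0, 2.0), (2.0, 4.1)], [(1.5, 3.0), (3.0, 6.2)]]
w = 0.0
for _ in range(20):                       # loop over epochs until |w^L - w^{L-1}| < epsilon
    w_prev, w = w, blockfl_epoch(samples, w)
    if abs(w - w_prev) < 1e-3:
        break
print(round(w, 3))                        # should settle near the true slope (~2)
```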

 

2.2 FL operation in BlockFL

(1) Set of devices: D = \left\{ 1, 2, \ldots, N_{D} \right\}, with \left| D \right| = N_{D}. Device D_{i} holds the data samples S_{i}.

(2) FL model: The paper solves a regression problem in a distributed manner.

Data samples of all devices: S = \bigcup_{i} S_{i}, with \left| S \right| = N_{s}.

Each sample s_{k} = \left\{ x_{k}, y_{k} \right\}, where x_{k} is a d-dimensional column vector and y_{k} \in \mathbb{R}.

Objective: minimize the loss function f(w) = \frac{1}{N_{s}} \sum_{i=1}^{N_{D}} \sum_{s_{k} \in S_{i}} f_{k}(w).

     

As in traditional FL, each device trains with the stochastic variance reduced gradient (SVRG) algorithm, and the local model updates of all devices are aggregated using a distributed approximate Newton method.
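A minimal sketch of what an SVRG local update looks like for a squared-error regression loss, plus a stand-in aggregation step. The per-sample loss form, the learning rate, and the plain averaging (instead of the paper's distributed approximate Newton aggregation) are all my assumptions.

```python
# Minimal SVRG sketch for a squared-error regression loss (illustrative only;
# loss form and step size are assumptions, and plain averaging replaces the
# paper's distributed aggregation rule).
import numpy as np

def grad_fk(w, x, y):
    # Gradient of f_k(w) = 0.5 * (x^T w - y)^2 with respect to w.
    return (x @ w - y) * x

def svrg_local_update(w, X, Y, lr=0.05, inner_steps=50, rng=np.random.default_rng(0)):
    """One SVRG pass on a device's local samples (X: N x d, Y: N)."""
    w_snap = w.copy()
    mu = np.mean([grad_fk(w_snap, x, y) for x, y in zip(X, Y)], axis=0)  # full local gradient
    for _ in range(inner_steps):
        k = rng.integers(len(Y))
        # Variance-reduced stochastic gradient step.
        w = w - lr * (grad_fk(w, X[k], Y[k]) - grad_fk(w_snap, X[k], Y[k]) + mu)
    return w

# Toy usage: two devices, d = 2, true weights [1, 2]; the global model here is a plain average.
rng = np.random.default_rng(1)
w_true = np.array([1.0, 2.0])
devices = []
for _ in range(2):
    X = rng.normal(size=(20, 2))
    devices.append((X, X @ w_true + 0.01 * rng.normal(size=20)))

w = np.zeros(2)
for epoch in range(10):
    local_updates = [svrg_local_update(w, X, Y) for X, Y in devices]
    w = np.mean(local_updates, axis=0)   # stand-in for the paper's aggregation rule
print(np.round(w, 2))                    # should be close to [1, 2]
```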

 

 

3. END-TO-END LATENCY ANALYSIS

3.1 One-Epoch BlockFL Latency Model

The delay analysis was already outlined in the introduction, so it is omitted here.

 

3.2 Latency Optimal Block Generation Rate

The latency-optimal block generation rate \lambda is derived. (The derivation is omitted here.)

Conclusion: if the block generation rate \lambda is too large, forking occurs more frequently, which in turn increases the learning completion delay.

Conversely, if the block generation rate \lambda is too small, the block generation delay grows, so the learning completion delay also increases.
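A back-of-the-envelope illustration (not the paper's derivation) of why the delay is convex in \lambda: block generation takes about 1/\lambda on average, while the chance of a fork, and hence of a restart, grows with \lambda times the propagation delay. All constants below are assumptions.

```python
# Back-of-the-envelope illustration of the trade-off in the block generation rate
# (my own toy model, not the paper's derivation; all constants are assumptions).
import math

def expected_epoch_delay(lam, t_other=2.0, t_bp=0.5):
    """lam: block generation rate; t_other: computation + communication; t_bp: propagation."""
    fork_prob = 1.0 - math.exp(-lam * t_bp)   # forks get likelier when blocks appear
                                              # faster than they can propagate
    one_attempt = t_other + 1.0 / lam + t_bp  # 1/lam: mean PoW block generation time
    return one_attempt / (1.0 - fork_prob)    # each fork forces a restart of the epoch

for lam in (0.1, 0.3, 1.0, 3.0, 10.0):
    print(f"lambda = {lam:5.1f}  expected delay = {expected_epoch_delay(lam):8.2f}")
# Small lambda -> long block generation time; large lambda -> frequent forks.
# The expected delay is minimized at an intermediate block generation rate.
```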

 

 

4. NUMERICAL RESULTS AND DISCUSSION

This section numerically evaluates the average learning completion delay of BlockFL.

 

 

Fig.3.a shows the effect of the block generation rate \lambda on the average learning completion delay of BlockFL.

The delay curve is convex in \lambda and decreases as the SNR (signal-to-noise ratio) increases.

Fig.3.b shows that, for the same number of devices, the accuracy of BlockFL and traditional FL is almost identical.

Fig.4.a shows that the learning completion delay of BlockFL is lower than that of traditional FL (N_{M} = 1).

  • For N_{M} = 1 and N_{M} = 10, Gaussian noise N(-0.1, 0.01) is added to each miner's local model updates with probability 0.05 (modeling miner malfunction).
  • When there is no failure, the main sources of delay are cross-verification and block propagation.
  • In BlockFL, a miner's failure only affects its associated devices, and those devices can obtain the model from other, normally operating miners, which eliminates the impact.
  • More miners yield lower latency (N_{M} = 10, when failures occur).
  • There is a number of devices that minimizes the latency: more devices provide more usable data, but they also increase the block size and the block exchange time, producing the convex delay curve shown in the figure.

Fig.4.b varies the threshold \theta_{e} \in \left[ 0, 1 \right] for a device to become a miner.

When there is no failure, the learning completion delay becomes larger (a low threshold yields many miners, so the cross-verification and block propagation delays are high).

When there is a failure, the delay is lower (fewer miners reduce the time spent per epoch).

 

Fig.4.c represents the probability that the chain of malicious miners is longer than the chain of honest miners.

This means that once a few blocks have been "locked in" by honest miners, the probability that malicious miners can tamper with them is almost zero.
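One way to see this, using the standard random-walk argument from the Bitcoin whitepaper rather than the paper's own analysis: if malicious miners hold a fraction q < 1/2 of the mining power, the probability that they ever overtake an honest chain that is z blocks ahead is (q/(1-q))^z, which vanishes quickly as z grows.

```python
# Probability that a malicious chain ever catches up from z blocks behind,
# using the standard random-walk bound (q / p)^z from the Bitcoin whitepaper
# (illustrative; not necessarily the exact model evaluated in the paper).

def catch_up_probability(q, z):
    p = 1.0 - q                          # fraction of mining power held by honest miners
    return 1.0 if q >= p else (q / p) ** z

for z in (1, 3, 6, 10):
    print(f"z = {z:2d}  P(catch up | q = 0.3) = {catch_up_probability(0.3, z):.6f}")
# The probability decays geometrically in z, so a handful of honest blocks already
# makes tampering with an older block extremely unlikely.
```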

5. Thoughts

Given the mobility, network delays, power failures, and intermittent availability of mobile devices, it is not realistic to assume that every device successfully uploads its local model within T_{wait}.
