Latest paper notes (24): VCD-FL: Verifiable, Collusion-Resistant, and Dynamic Federated Learning


Verifiable federated learning generally refers to the verifiability of aggregation results: the client can verify the correctness of the aggregation result returned by the server. Common solutions use homomorphic hashing, commitments, or polynomial verification, but they share a common weakness: they cannot prevent collusion between a client and the server during verification. If even one client colludes with the server, the server can forge the verification conditions. The biggest contribution of VCD-FL is a verifiable scheme that resists such collusion. The following are notes taken while reading; there may be misunderstandings, so reading the original paper is recommended.

1. Background introduction

1.1 Classic FL process

  • (1) The server selects a set of clients that download the initial model from the server
  • (2) The client uses local data for model training and uploads model parameters (gradients)
  • (3) The server aggregates the model parameters of the client and returns the aggregation result to the client
  • (4) The client updates the local model according to the aggregation results, and iterates repeatedly until the model converges

1.2 Problems with FL

  • (1) Model privacy problem: attackers can infer the original data from the model parameters.
    Example: the least-squares loss is $\min\sum_{i=1}^n l(w,x_i,y_i)$, where $l(w,x_i,y_i)=\frac{1}{2}(x_i^T w-y_i)^2$; the stochastic gradient is then
    $$g_i=\frac{\partial l(w,x_i,y_i)}{\partial w}=(x_i^T w-y_i)\,x_i$$
    i.e. the gradient is a scaled (transformed) copy of the data and carries all of its information (a tiny numeric check follows this list).
  • (2) Aggregation result verification problem: a malicious server may deliberately return crafted aggregation results, either to save computation or to launch privacy attacks on the clients.
  • (3) Collusion-resistant verification problem: a client may conspire with the server to forge the verification parameters.
    Example: in VerSA (TDSC21) [1], each client receives $(z,\bar z)$, computes $z'=a\circ z+|U_3|\cdot b$, and checks whether $z'=\bar z$, where $a,b$ are known to all clients. As long as one client colludes with the server, the server can forge the verification condition.
  • (4) Client dropout problem: a client may go offline for subjective (early exit) or objective (network failure) reasons, preventing model aggregation and result verification from completing.
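A tiny NumPy check makes the leakage in (1) concrete; the dimension, weights, and label below are illustrative, not from the paper:

```python
# For the least-squares loss, g = (x^T w - y) x is a scalar multiple of x,
# so a single stochastic gradient exposes the direction of the private
# sample x (and, if w and y are known, its scale as well).
import numpy as np

rng = np.random.default_rng(1)
w, x, y = rng.standard_normal(5), rng.standard_normal(5), 0.7
g = (x @ w - y) * x                       # stochastic gradient of the loss
cos = g @ x / (np.linalg.norm(g) * np.linalg.norm(x))
assert np.isclose(abs(cos), 1.0)          # g is exactly parallel to x
```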

1.3 Related work

  • (1) SA (CCS17) [2] proposes double masking and secret sharing to protect gradient privacy and tolerate user dropouts, but cannot verify the aggregation results.
  • (2) VerifyNet (TIFS20) [3] is the first to propose verifiable federated learning, using homomorphic hash functions and pseudo-random techniques for verifiability, plus secret sharing and key agreement to protect gradient privacy. However, its verification overhead is high, and it cannot resist collusion between the server and clients during verification.
  • (3) VFL (TII22) [4] is the first to use Lagrange interpolation for verifiability and uses blinding for collusion-resistant privacy protection. However, it assumes an honest-but-curious server, does not consider malicious behavior, and its verification computation and communication overhead are relatively high.
  • (4) VeriFL (TIFS21) [5] optimizes the secure aggregation protocol of SA (CCS17) [2] with a gradient hash commitment and an amortized verification mechanism. However, it neither resists collusion in verification nor detects it.
  • (5) VFChain (TNSE22) [6] proposes electing a committee via blockchain to jointly perform model aggregation and provide verifiability. However, blockchain efficiency and scalability are poor, making the scheme impractical.

1.4 Contributions of this paper

  • (1) Collusion-resistant verification: a lightweight commitment scheme based on an irreversible gradient transformation protects privacy, and an efficient verification mechanism based on optimized Lagrange interpolation ensures that forged aggregation results fail verification with overwhelming probability.
  • (2) Malicious behavior identification: detection rules are established to determine whether the server colluded with clients to pass verification or lazily returned incorrect aggregation results.
  • (3) Dynamic verification support: the verification mechanism is combined with Shamir secret sharing to tolerate client dropouts.
  • (4) Low computation and communication overhead: a new way of generating interpolation points for Lagrange interpolation reduces computation, and gradient compression further reduces communication.

1.5 Lagrange interpolation

Lagrange interpolation constructs the polynomial that passes exactly through given data points. Given $n$ data points $\{(x_i,y_i)\}_{i=1}^n$ with pairwise distinct $x_i$, there is a unique polynomial of degree at most $n-1$:
$$L(x)=\sum_{i=1}^n y_i L_i(x)$$
where the basis polynomials $L_i(x)$ are defined as
$$L_i(x)=\prod_{j=1,\,j\neq i}^n \frac{x-x_j}{x_i-x_j},\quad i\in\{1,2,\dots,n\}$$
The basis polynomials satisfy
$$L_i(x_j)=\begin{cases}1,& i=j\\0,& i\neq j\end{cases}$$
so the Lagrange polynomial satisfies $L(x_i)=y_i$.
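As a quick sanity check of these formulas, here is a minimal NumPy sketch of Lagrange interpolation; the sample points are arbitrary:

```python
# Build L(x) = sum_i y_i * L_i(x) from n points with distinct x-coordinates
# and confirm it passes exactly through every given point.
import numpy as np

def lagrange_basis(x, xs, i):
    """Evaluate the i-th basis polynomial L_i at x."""
    return np.prod([(x - xs[j]) / (xs[i] - xs[j])
                    for j in range(len(xs)) if j != i])

def lagrange_eval(x, xs, ys):
    """Evaluate the interpolating polynomial L at x."""
    return sum(ys[i] * lagrange_basis(x, xs, i) for i in range(len(xs)))

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 2.0, 5.0])
assert all(np.isclose(lagrange_eval(x, xs, ys), y) for x, y in zip(xs, ys))
print(lagrange_eval(1.5, xs, ys))   # interpolated value between the points
```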

2. System model

2.1 Design goals

  • (1) Robust result verification: the scheme should ensure robust correctness verification, supporting collusion-resistant privacy protection, resisting collusion in verification, and remaining correct in dynamic FL where clients drop out.
  • (2) Malicious behavior classification: the scheme should identify the root cause of an incorrect aggregation result (malicious or lazy), which helps take targeted punishment measures.
  • (3) Efficient model operations: the scheme should let clients perform model operations (local training and aggregation result verification) efficiently and should reduce communication overhead.
  • (4) Lightweight privacy protection: the scheme should protect client privacy against inference attacks and collusion attacks while avoiding computation-intensive operations, to remain practical.

2.2 Threat Model

  • TA: responsible for system initialization and for distributing PRG parameters to the clients; fully trusted.
  • Clients: honest-but-curious; some corrupt clients may conspire with a malicious AS to pass verification.
  • AS: responsible for collecting and aggregating ciphertexts and for sending the aggregation result to the clients for verification in each iteration. It is considered malicious and may launch inference attacks (peeking at privacy) and forgery attacks (breaking usability). AS behavior falls into two categories:
    • Weak attack model: a lazy AS reduces the number of iterations or only collects part of the gradients to save overhead, and may launch inference attacks.
    • Strong attack model: the AS covertly modifies the aggregation result and colludes with clients to forge it, deceiving the other clients.

3. The specific structure of VCD-FL

To realize the collusion-resistant verification and collusion detection that existing schemes cannot achieve, VCD-FL consists of four parts: initialization, local model training, ciphertext aggregation, and aggregation result verification.

3.1 Initialization (Algorithm 1)

VCD-FL improves the VFL scheme by grouping gradients. Each $d$-dimensional gradient $g_i$ is divided into $\lceil d/M\rceil$ groups of $M$ gradient elements each ($M$ is an integer that determines the degree of the Lagrange interpolation polynomial). If the last group has fewer than $M$ elements, it is zero-padded. TA generates the following parameters (a setup sketch follows the list):

  • (1) For any two clients $P_i$ and $P_j$, generate a pairwise seed $s_{i,j}$ used for gradient masking.
  • (2) Generate an additional random seed $\rho_i$ and distribute a share of $\rho_i$ to every client (Shamir secret sharing) to handle client dropouts.
  • (3) Use $PRG(\rho_i)$ to generate a sequence set $A_i$, normalized to improve interpolation accuracy:
    $$A_i\leftarrow\frac{PRG(\rho_i)}{\max\{|PRG(\rho_i)|\}},\quad i\in\{1,2,\dots,N\}$$
  • (4) Generate a random integer sequence $Z=\{a_i\mid i=1,2,\dots,\lceil d/M\rceil(M+1)\}$ as the (public) set of interpolation points.
  • (5) Generate an $M\times M$ singular square matrix $U$ for commitment generation (public).
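Below is a minimal sketch of what the TA-side setup might look like. It is not the paper's Algorithm 1: NumPy's Generator stands in for the PRG, and the seed sizes, the range of $Z$, and the way $U$ is made singular are illustrative assumptions.

```python
import numpy as np

N, d, M = 5, 12, 4                        # clients, gradient dim, group size
num_groups = -(-d // M)                   # ceil(d / M)
rng = np.random.default_rng(0)

# (1) pairwise seeds s_{i,j} for gradient masking
s = {(i, j): int(rng.integers(2**31))
     for i in range(N) for j in range(i + 1, N)}

# (2)-(3) per-client seed rho_i (its Shamir shares are omitted here);
# A_i = PRG(rho_i), normalized by its largest absolute value
rho = [int(r) for r in rng.integers(2**31, size=N)]
def make_A(seed):
    a = np.random.default_rng(seed).standard_normal(num_groups)
    return a / np.max(np.abs(a))
A = [make_A(r) for r in rho]

# (4) public interpolation points Z: ceil(d/M)*(M+1) distinct integers
Z = rng.permutation(num_groups * (M + 1)) + 1

# (5) public singular M x M matrix U (a dependent row makes det(U) = 0)
U = rng.standard_normal((M, M))
U[-1] = U[0] + U[1]
assert abs(np.linalg.det(U)) < 1e-6
```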

3.2 Local model training

Client $P_i\in P$ downloads the global model to initialize its local model, trains with mini-batch gradient descent, and computes the gradient $g_i$:
$$g_i=\nabla_w l(w;D'_i)$$
Afterwards, $P_i$ performs gradient encryption, grouping, and commitment in sequence.

3.2.1 Gradient Encryption

  • (1) First, blind the gradient with the single-masking protocol ($s_{i,j}$ is distributed directly by TA):
    $$g'_i=g_i+\sum_{P_j\in P,\,i<j}PRG(s_{i,j})-\sum_{P_j\in P,\,i>j}PRG(s_{i,j})$$
  • (2) Process the blinded gradient with Lagrange interpolation. The blinded gradient $g'_i$ is grouped, and each client $P_i$ generates the Lagrange interpolation set of the $k$-th group, $k\in\{1,2,\dots,\lceil d/M\rceil\}$: the first $M$ points are $\{(a_{(k-1)(M+1)+j},\,g'_i((k-1)M+j))\mid j=1,2,\dots,M\}$, and the $(M+1)$-th point is $(a_{k(M+1)},A_i(k))$.
  • (3) The interpolation function $f_{i,[k]}$ of the $k$-th group's Lagrange set is computed as
    $$f_{i,[k]}(x)=\sum^{kM}_{j=(k-1)M+1}\left[L_{j,[k]}(x)\,g'_i(j)\right]+A_i(k)\prod^{k(M+1)-1}_{h=(k-1)(M+1)+1}\frac{x-a_h}{a_{k(M+1)}-a_h}$$
    where
    $$L_{j,[k]}(x)=\prod^{k(M+1)-1}_{h=(k-1)(M+1)+1,\,h\neq j+k-1}\frac{x-a_h}{a_{j+k-1}-a_h}$$
  • (4) $P_i$ uploads the coefficient vector $B_i$ to AS as the gradient ciphertext (see the sketch after this list):
    $$B_i=(B_{i,[1]},B_{i,[2]},\dots,B_{i,[\lceil d/M\rceil]})$$
    where $B_{i,[k]}$ denotes the $M+1$ coefficients extracted from $f_{i,[k]}(x)$, ordered by ascending degree of $x$: $B_{i,[k]}=(b_{0,\langle i,[k]\rangle},b_{1,\langle i,[k]\rangle},\dots,b_{M,\langle i,[k]\rangle})$
    Clearly, keeping the Lagrange interpolation set secret strengthens the secrecy of the gradient; even if it leaks, security holds as long as at least two clients do not collude with AS. (The number of Lagrange basis polynomials here is about $(M+1)\lceil d/M\rceil$, versus $(M+1)d$ in the VFL scheme.)
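The sketch below strings the masking and per-group interpolation together for one client, reusing `N`, `d`, `M`, `s`, `A`, `Z` from the setup sketch above. `np.polyfit` stands in for the paper's optimized coefficient computation (note it returns coefficients in descending degree, the reverse of the paper's ordering):

```python
import numpy as np

def prg(seed, size):
    return np.random.default_rng(seed).standard_normal(size)

def mask_gradient(i, g, s, N):
    """g'_i = g_i + sum_{i<j} PRG(s_ij) - sum_{i>j} PRG(s_ij)."""
    gp = g.astype(float).copy()
    for j in range(N):
        if j == i:
            continue
        m = prg(s[(min(i, j), max(i, j))], len(g))
        gp += m if i < j else -m          # masks cancel in the global sum
    return gp

def encrypt(i, g, s, A, Z, N, M):
    gp = mask_gradient(i, g, s, N)
    num_groups = -(-len(g) // M)
    gp = np.pad(gp, (0, num_groups * M - len(gp)))     # zero-pad last group
    B = []
    for k in range(num_groups):
        xs = Z[k * (M + 1):(k + 1) * (M + 1)].astype(float)
        ys = np.append(gp[k * M:(k + 1) * M], A[i][k])  # M points + A-point
        B.append(np.polyfit(xs, ys, M))   # exact degree-M fit to M+1 points
    return np.concatenate(B)
```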

3.2.2 Commitment generation (Algorithm 2)

To prevent spoofing by corrupt clients during aggregation result verification, a lightweight commitment scheme using irreversible gradient transformations is proposed.
  • (1) First, the gradient $g_i$ is divided into $\lceil d/M\rceil$ groups of $M$ elements each, written $g_i=(g_{i,[1]},g_{i,[2]},\dots,g_{i,[\lceil d/M\rceil]})$, where $g_{i,[k]}=(g_i((k-1)M+1),g_i((k-1)M+2),\dots,g_i(kM))^T$

  • (2) Client $P_i$ commits to $g_{i,[k]}$ as $C_{i,[k]}=U\cdot g_{i,[k]}$, where $U$ is the public $M\times M$ singular (non-invertible) matrix, which preserves gradient privacy, and $g_{i,[k]}$ is the $M$-dimensional column vector of the $k$-th gradient group. The matrix-vector multiplication is lightweight and can be parallelized for efficiency (a sketch follows this list).
  • (3) Afterwards, $P_i$ broadcasts $C_i=(C_{i,[k]})^{\lceil d/M\rceil}_{k=1}$ and receives the (public) commitments of the other clients before uploading $B_i$ to AS.
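A minimal sketch of the commitment step, reusing the singular matrix `U` from the setup sketch; each group is an $M$-vector, so the commitment is one matrix-vector product:

```python
import numpy as np

def commit(g, U, M):
    """C_{i,[k]} = U @ g_{i,[k]} for every group k of the plain gradient."""
    num_groups = -(-len(g) // M)
    g = np.pad(g.astype(float), (0, num_groups * M - len(g)))
    return [U @ g[k * M:(k + 1) * M] for k in range(num_groups)]
```

Because $U$ is singular, $U\cdot g_{i,[k]}$ has no unique preimage, so publishing the commitments does not reveal the gradient groups.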

3.2.3 Interpolation optimization (Algorithm 3)

To reduce the interpolation frequency without affecting model accuracy, deep gradient compression [7] is introduced to optimize the gradient.

  • (1) A momentum factor of 0.5 is used to accumulate $G_i$: $G_i=g_i+0.5\cdot G_i$
  • (2) From $G_i$, select the $p\%$ of elements with the largest absolute value as the optimized gradient, place them at the same positions in $g_i$, and set the remaining elements of $g_i$ to 0.
  • To avoid losing information, the unselected elements of $G_i$ are accumulated locally until their absolute values grow large enough.
  • The optimized (sparse) gradient $g_i$ greatly reduces the interpolation cost: interpolation points with $g'_i(j)=0$ do not affect $B_i$, so $P_i$ only needs to compute $L_{j,[k]}(x)$ for $g'_i(j)\neq 0$ (a compression sketch follows this list).
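A minimal sketch of the compression step under the stated assumptions (momentum 0.5, top-$p\%$ selection, local residual accumulation); the ratio `p` is illustrative:

```python
import numpy as np

def compress(g, G, p=0.01, momentum=0.5):
    """Return the sparse gradient to interpolate and the updated residual."""
    G = g + momentum * G                   # local accumulation with momentum
    k = max(1, int(np.ceil(p * len(G))))   # number of entries to keep
    idx = np.argsort(np.abs(G))[-k:]       # top-k entries by absolute value
    sparse = np.zeros_like(G)
    sparse[idx] = G[idx]                   # selected entries, same positions
    residual = G.copy()
    residual[idx] = 0.0                    # the rest keeps accumulating
    return sparse, residual
```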

3.3 Ciphertext Aggregation

  • (1) AS waits until it has received the ciphertext $B_i$ from every client $P_i$, then computes
    $$B=\sum^N_{i=1}B_i=\left(\sum^N_{i=1}B_{i,[1]},\dots,\sum^N_{i=1}B_{i,[\lceil d/M\rceil]}\right)$$
  • (2) AS distributes $B$ to every client $P_i$. Since each ciphertext $B_i$ is computed from $g'_i$ and $A_i$, the original gradient $g_i$ cannot be inferred from it. Even if AS colludes with $N-2$ clients, it is hard to obtain the $s_{i,j}$ of the remaining two clients, which prevents AS from recovering their gradients.

3.4 Aggregation result verification

Each client $P_i$ uses the received $B$ to recover the gradient aggregation result and verifies it against the earlier commitments. Details are as follows:

3.4.1 Gradient decryption

  • (1) $P_i$ first uses $B_{[k]}$ to reconstruct the aggregate interpolation function $f_{[k]}(x)$ of the $k$-th group:
    $$f_{[k]}(x)=\sum^{M+1}_{m=1}B_{[k]}(m)\,x^{M-m+1}$$
    where $B_{[k]}(m)$ denotes the $m$-th element of $B_{[k]}$ and $k\in\{1,2,\dots,\lceil d/M\rceil\}$.
  • (2) Taking the integer sequence $Z$ as input, $P_i$ evaluates $f_{[k]}(x)$ to reconstruct the gradient aggregation result $g$, removing the inserted $A_i$ values and the zero padding of the last group (a decryption sketch follows this list):
    $$g=\sum^N_{i=1}g_i=\sum^N_{i=1}g'_i,\qquad g_{[k]}=(f_{[k]}(a_{(k-1)M+k}),\dots,f_{[k]}(a_{k(M+1)-1}))^T$$
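A minimal decryption sketch matching the encryption sketch above: AS computes $B$ as the sum of all $B_i$, and each client evaluates the aggregate polynomial of each group at that group's first $M$ interpolation points:

```python
import numpy as np

def decrypt(B, Z, d, M):
    num_groups = -(-d // M)
    Bk = B.reshape(num_groups, M + 1)      # per-group coefficient rows
    g = []
    for k in range(num_groups):
        xs = Z[k * (M + 1):k * (M + 1) + M].astype(float)  # skip the A-point
        g.extend(np.polyval(Bk[k], xs))    # f_[k] at the M data points
    return np.array(g)[:d]                 # strip the zero padding

# End-to-end check with the earlier sketches: the pairwise masks cancel, so
#   B_total = sum(encrypt(i, grads[i], s, A, Z, N, M) for i in range(N))
#   np.allclose(decrypt(B_total, Z, d, M), sum(grads))  # -> True
```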

3.4.2 Result verification

To protect gradient privacy while verifying the correctness of the aggregation result $g$, the basic idea is to check whether
$$R\cdot g=\sum^N_{i=1}R\cdot g_i$$
holds, where $R$ is a $d$-dimensional vector. If $g\neq\sum^N_{i=1}g_i$, the equation clearly fails. (This alone, however, does not resist collusion in verification.)
On this basis, an effective verification mechanism is built on top of the earlier commitments.

  • (1) $P_i$ first broadcasts $A_i=\{A_i(k)\}^{\lceil d/M\rceil}_{k=1}$ and a random row vector $R_i=(r_{i,1},r_{i,2},\dots,r_{i,M\lceil d/M\rceil})$ to the other clients, and computes
    $$R=\sum^N_{i=1}R_i=\left(\sum^N_{i=1}r_{i,1},\sum^N_{i=1}r_{i,2},\dots,\sum^N_{i=1}r_{i,M\lceil d/M\rceil}\right)$$
    As long as one semi-honest client broadcasts its $R_i$ and stays out of the collusion, $R$ is unpredictable.
  • (2) To prevent corrupt clients from helping a malicious AS pass verification, $P_i$ groups $R$ in the same way and computes a random grouped row vector $S$ as the verification coefficient vector:
    $$S=(S_{[1]},S_{[2]},\dots,S_{[\lceil d/M\rceil]})$$
    where the $k$-th group vector is computed as $S_{[k]}=R_{[k]}\cdot U$
  • (3) For efficient verification, $P_i$ computes the check value $v_{i,k}$ of each group in parallel. For the $k$-th group:
    $$v_{i,k}=R_{[k]}\cdot C_{i,[k]},\quad k\in\{1,2,\dots,\lceil d/M\rceil\}$$
  • (4) Afterwards, $P_i$ computes $V_i=\sum^{\lceil d/M\rceil}_{k=1}v_{i,k}$ and verifies using $A_i=\{A_i(k)\}^{\lceil d/M\rceil}_{k=1}$. Specifically, $P_i$ checks the following two equations:
    $$f_{[k]}(a_{k(M+1)})\stackrel{?}{=}\sum^N_{i=1}A_i(k),\quad k\in\{1,2,\dots,\lceil d/M\rceil\}\qquad(1)$$
    $$\sum^d_{m=1}S(m)\,g(m)\stackrel{?}{=}\sum^N_{i=1}V_i\qquad(2)$$
    where $f_{[k]}(a_{k(M+1)})$ and $g(m)=f_{[\lceil m/M\rceil]}(a_{m+\lceil m/M\rceil-1})$ are computed from the returned ciphertext $B$. The malicious behavior detection rules are:
  • 1) Rule 1: if both (1) and (2) hold, AS is considered trustworthy.
  • 2) Rule 2: if neither (1) nor (2) holds, AS is considered a weak attacker.
  • 3) Rule 3: if (1) holds but (2) does not, AS is considered a strong attacker (a verification sketch follows these rules).
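A minimal sketch of the two checks and the three rules, consistent with the toy sketches above (`g` must be the recovered aggregate, zero-padded to a multiple of $M$; `C` is the list of all clients' commitment lists). Tolerances are needed because this sketch works over floats rather than a finite field:

```python
import numpy as np

def verify(B, Z, g, A, C, R, U, M, tol=1e-6):
    num_groups = len(g) // M
    Bk = B.reshape(num_groups, M + 1)
    # Equation (1): f_[k] at the extra point a_{k(M+1)} vs. sum_i A_i(k)
    eq1 = all(np.isclose(np.polyval(Bk[k], float(Z[(k + 1) * (M + 1) - 1])),
                         sum(Ai[k] for Ai in A), atol=tol)
              for k in range(num_groups))
    # Equation (2): S-weighted aggregate vs. the commitment checksums V_i
    S = np.concatenate([R[k * M:(k + 1) * M] @ U for k in range(num_groups)])
    V = sum(R[k * M:(k + 1) * M] @ Ci[k]
            for Ci in C for k in range(num_groups))
    eq2 = np.isclose(S @ g, V, atol=tol)
    if eq1 and eq2:
        return "trustworthy (rule 1)"
    if not eq1 and not eq2:
        return "weak attacker (rule 2)"
    if eq1:
        return "strong attacker (rule 3)"
    return "eq. (1) fails but eq. (2) holds: not covered by the rules"
```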

3.4.3 Support for dynamic verification

Due to network failures, some clients may drop out during training. Existing schemes solve this with double masking (provided no client reveals both the online and offline secret shares of the same client), but a client might still collude with the AS.
The following shows how this paper's verification process handles client dropouts:

  • (1) $f_{[k]}(a_{k(M+1)})$ and $g(m)=f_{[\lceil m/M\rceil]}(a_{m+\lceil m/M\rceil-1})$ are computed from the ciphertext $B$ returned by AS and are unaffected by dropped clients.
  • (2) The gradient commitments $C_i$ were published before aggregation, and $R$ can be computed from the random vectors $R_i$ of the online clients, so the computation of $S$ and $V_i$ is unaffected.
  • (3) $\rho_i$ is shared among the clients with Shamir $(T,N)$ secret sharing, so even if $P_i$ drops out, the remaining clients (as long as there are at least $T$ of them) can keep the verification going without affecting the computation of $A_i$ (a secret-sharing sketch follows this list).
    • As long as some client does not participate in the collusion, a forged aggregation result can be detected.
    • Since $S$ is derived from the unpredictable $R$, it is random; even $N-1$ clients colluding with the AS can hardly forge an aggregation result that passes verification.
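A minimal sketch of Shamir $(T,N)$ secret sharing over a prime field, as it might be used to share $\rho_i$; the prime and the secret are illustrative:

```python
import random

P = 2**31 - 1          # a Mersenne prime, large enough for 31-bit seeds

def share(secret, T, N):
    """Split secret into N shares; any T of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(T - 1)]
    return [(x, sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P)
            for x in range(1, N + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

shares = share(123456789, T=3, N=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares work
```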

4. Experimental evaluation

4.1 Experimental environment

The scheme is implemented in Python 3.8 with PyTorch 1.8.1 and NumPy. The classification task runs on the MNIST dataset, with a multi-layer perceptron (MLP) and a convolutional neural network (CNN) used in turn as the training model of VCD-FL. Accuracy, loss, encryption overhead, decryption overhead, and communication overhead are compared.

  • (1) Accuracy: with both MLP and CNN as the training model, VCD-FL's accuracy has an edge over the VFL scheme.
    • MLP: after 300 iterations, VCD-FL reaches about 90.92% accuracy versus about 88.9% for VFL.
    • CNN: after 500 iterations, VCD-FL reaches about 94.83% accuracy versus about 93.38% for VFL.
  • (2) Loss: likewise, with MLP and CNN as training models, VCD-FL's loss is smaller than the VFL scheme's.
  • (3) Encryption overhead: (a) When $M'=4$ and $d=100{,}000$, VCD-FL encryption takes about 20.44 s versus about 96.91 s for VFL, roughly a fifth. The overhead decreases further as the compression ratio $p\%$ of the gradient compression algorithm increases. (b) With $d=10{,}000$ fixed, as $M'$ grows, VFL's encryption time increases almost exponentially while VCD-FL's grows only linearly.
  • (4) Decryption overhead: (a) As $d$ increases, decryption time grows linearly. When $M'=4$ and $d=100{,}000$, VCD-FL decryption takes about 1.63 s versus about 8.7 s for VFL. (b) Likewise, with $d$ fixed, as $M'$ grows, VFL's decryption time increases linearly while VCD-FL's decreases. The reason is that decryption overhead depends on computing the aggregation result, i.e. on the number of interpolation points: about $(M'+1)d$ for VFL versus about $d+\lceil d/M'\rceil$ for VCD-FL.
  • (5) Communication overhead: the size of the information uploaded from a client to the AS depends mainly on $d$ and $M'$. Overall, VCD-FL's communication overhead is smaller than VFL's.

References

[1] C. Hahn, H. Kim, M. Kim, et al., "VerSA: Verifiable secure aggregation for cross-device federated learning," IEEE Trans. Dependable Secure Comput., 2021.
[2] K. Bonawitz et al., “Practical secure aggregation for privacy-preserving machine learning,” in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., Oct. 2017, pp. 1175–1191.
[3] G. Xu, H. Li, S. Liu, K. Yang, and X. Lin, “VerifyNet: Secure and verifiable federated learning,” IEEE Trans. Inf. Forensics Security, vol. 15, pp. 911–926, 2020.
[4] A. Fu, X. Zhang, N. Xiong, Y. Gao, H. Wang, and J. Zhang, “VFL: A verifiable federated learning with privacy-preserving for big data in industrial IoT,” IEEE Trans. Ind. Informat., vol. 18, no. 5, pp. 3316–3326, May 2022.
[5] X. Guo et al., “VeriFL: Communication-efficient and fast verifiable aggregation for federated learning,” IEEE Trans. Inf. Forensics Security, vol. 16, pp. 1736–1751, 2021.
[6] Z. Peng et al., "VFChain: Enabling verifiable and auditable federated learning via blockchain systems," IEEE Trans. Netw. Sci. Eng., vol. 9, no. 1, pp. 173–186, Jan. 2022.
[7] Y. Lin et al., “Deep gradient compression: Reducing the communication bandwidth for distributed training,” in Proc. 6th Int. Conf. Learn. Represent., Vancouver, BC, Canada, 2018, pp. 1–14.


