A Survey: Computing Models of Artificial Intelligence Privacy Protection Based on Cryptographic Techniques

TIAN Hai-bo, LIANG Xiu-qi
(School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China)

Abstract: The application scenarios of artificial intelligence privacy protection are diverse. In different scenarios, the trustworthiness and number of the entities that carry out the privacy-preserving computation differ, and these two factors have an important impact on the technical choices and practical applicability of privacy-preserving computation methods. Starting from the trustworthiness and number of entities, this paper classifies cryptography-based privacy-preserving computation methods for artificial intelligence into four computing models: the multi-center model, the dual-center model, the single-center model, and the real model. Except for the real model, trusted entities exist in all the other computing models. For each computing model, this paper presents the typical computations involved in current cryptography-based artificial intelligence privacy protection methods and the typical algorithms they adopt, and points out that improving the efficiency and security of these algorithms is a research direction applicable to every computing model.

Key words: artificial intelligence; privacy protection; computation models; algorithms; protocols; cryptographic techniques

CLC number: TP309.2    Document code: A    Article number: 0372-2112(2023)08-2260-17    DOI: 10.12263/DZXB.20210702

Foundation Item(s): Guangdong Major Project of Basic and Applied Basic Research (No.2019B030302008); Key-Area Research and Development Program of Guangdong Province (No.2020B010166005); Huawei Technologies Co., Ltd. (No.TC20210407007, No.YBN2019105017)

Received date: 2021-06-04; Revised date: 2022-07-25; Editor in charge: Wang Tianhui

1 Introduction

Artificial intelligence privacy protection addresses real needs. Artificial intelligence computing has many application scenarios. For example, a bank can build a model from its users' data and then use that model to evaluate a user's credit. A bank and a telecommunications company can build a better model from the data of both parties and thereby assess user credit more accurately.
Multiple hospitals can build models of diseases from the data of their respective patients and thereby deliver better diagnosis and treatment services. In different application scenarios, the needs for model and data privacy protection differ.




































The need for model privacy protection often comes from its commercial value: a model with high prediction accuracy is usually obtained only after a large investment of capital and manpower, and it can be provided to consumers or partners as a service [1].
The need for data privacy protection, in contrast, comes from various laws and regulations. In the Civil Code of the People's Republic of China and the Cybersecurity Law of the People's Republic of China, China has clarified what constitutes personal privacy data and the legal provisions that network operators must follow when using such data. This forces data providers to attach various usage policies to their data [2] in order to meet the requirements of laws and regulations.

The computation of artificial intelligence privacy protection mainly refers to completing the training of a model, or completing prediction on an existing model, while guaranteeing the privacy of the model and the data. The number and trustworthiness of the entities that carry out the privacy-preserving computation vary. For example, a bank and a telecommunications company are two participants; to perform privacy-preserving computation, one method is to introduce a privacy computing service provider and a key distribution service provider. Assuming that the key distribution and the privacy computing service providers do not collude, privacy-preserving computation with two participants and two service providers can be realized based on homomorphic encryption and related technologies. As another example, multiple hospitals form multiple participants; to perform privacy-preserving computation, one method is to introduce a federated averaging service provider. Assuming that this service provider is semi-honest, horizontal federated learning with multiple participants and a single service provider that protects the privacy of user data can be realized.

Computations completed by multiple participants form a multi-party computation scenario, and secure multi-party computation [3] from the field of cryptography is a natural technical choice. Secure multi-party computation enables n mutually distrusting participants [P1, P2, ..., Pn] to complete the computation task of some function F, (y1, y2, ..., yn) = F(x1, x2, ..., xn), where xi is the private data of participant Pi, 1 ≤ i ≤ n, F is any function computable in the Turing machine model, and (y1, y2, ..., yn) is the output of the function [4]. After the computation task is completed, participant Pi obtains the output yi without leaking its private data xi. Through secure multi-party computation, multiple participants can therefore complete the computation of a function F that performs training or prediction while protecting the privacy of data and models.

Secure multi-party computation defines the ideal model to clarify what security means [4]. In the ideal model there is a trusted third party (TTP): each participant securely transmits its private data to the TTP, the TTP runs the function F to obtain the output (y1, y2, ..., yn), and then securely sends the output yi to participant Pi, 1 ≤ i ≤ n. Such a TTP is difficult to find in reality, so the main task of secure multi-party computation is to provide methods for simulating the function of the TTP. Secure multi-party computation started from Yao's millionaires problem and, after 40 years of development, has formed a complete technical route based on garbled circuits, oblivious transfer, the GMW (Goldreich, Micali, Wigderson) protocol, the BGW (Ben-Or, Goldwasser, Wigderson) protocol, cut-and-choose protocols, zero-knowledge proofs, and so on [5]. With these technologies, the participants themselves can directly simulate the function of the TTP under semi-honest or malicious adversaries and obtain the corresponding outputs.

In artificial intelligence privacy protection, however, there is still a large gap between the theory of secure multi-party computation and practical needs. The scale of the data that needs to be protected often exceeds ten thousand records, and the number of model parameters that need to be protected often exceeds one million. When secure multi-party computation protocols that run well on small-scale data are placed in a practical artificial intelligence privacy protection environment, their communication or computation cost becomes too large, so the performance of the protocols often fails to meet actual needs.

We have noticed that current artificial intelligence privacy-preserving computation in fact adopts a compromise: service providers are introduced, and certain trust assumptions are made about them, in order to obtain solutions that meet practical needs. These solutions differ in the number and trustworthiness of their service providers, but their tasks are similar. Borrowing the concepts of secure multi-party computation, the service providers can be regarded as simulating the function of the TTP to provide privacy-preserving computation for the participants. For example, a privacy-preserving computation with two participants and two service providers can be regarded as two mutually distrusting participants completing a function F, namely the computation process of some artificial intelligence algorithm, through a TTP simulated by the two service providers.

Based on the above facts, we believe that current artificial intelligence privacy protection schemes can be divided into several computing models according to the number and trustworthiness of their service providers; the typical computations performed under the different computing models, and the typical cryptography-based algorithms used to complete them, can then be sorted out. This allows readers to quickly understand, according to their actual needs, the current progress of a given computing model and the main cryptographic technologies it uses, in preparation for further research or application. Specifically, we classify current artificial intelligence privacy-preserving computation into four categories: the multi-center model, the dual-center model, the single-center model, and the real model.
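To make the ideal model concrete, the following is a minimal sketch, in Python, of a TTP that evaluates F on all private inputs and hands each participant only its own output. The function F, the inputs, and the helper name are illustrative assumptions made for this example and are not part of any cited scheme.

```python
# A minimal sketch of the ideal model: a trusted third party (TTP) receives
# every participant's private input, evaluates F, and returns to participant
# Pi only its own output yi. Everything here is an illustrative placeholder.
from typing import Callable, List, Tuple

def ideal_model(F: Callable[..., Tuple], private_inputs: List) -> List:
    """Simulate the TTP: compute (y1, ..., yn) = F(x1, ..., xn)."""
    outputs = F(*private_inputs)          # the TTP sees all xi in the clear
    return [outputs[i] for i in range(len(private_inputs))]

# Example: two participants learn only whether their joint sum exceeds a bound.
F = lambda x1, x2: ((x1 + x2 > 100), (x1 + x2 > 100))
print(ideal_model(F, [60, 55]))           # -> [True, True]
```

The whole point of secure multi-party computation is to obtain the same input/output behavior without any single party playing the role of this TTP.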








































































In the real model there are no service providers and no trust assumptions about them. Informal descriptions of the other computing models are as follows.
(1) Multi-center model: there are multiple service providers [S1, S2, ..., Sm], m > 2, which cooperate to complete the computation of the function F.
(2) Dual-center model: there are two service providers S1 and S2, which cooperate to complete the computation of the function F.
(3) Single-center model: there is a single service provider S, which completes the computation of the function F.
The trust assumptions made about these service providers include non-collusion, full trust, semi-honesty, and so on; for example, that the key distribution center is trusted and will not collude with other service providers, or that some of the multiple service providers are trusted and will not leak user data. In specific application scenarios some of these assumptions can be satisfied. For example, when several competing companies jointly provide a service to users, the users' own degree of trust in the companies can realize the assumption that some of the multiple service providers are trusted. Other assumptions are not easy to satisfy, such as a fully trusted key distribution center. The computing model to which a privacy protection scheme belongs therefore reflects whether the scheme can actually be deployed in a given scenario and whether it can complete the task of privacy protection.

2 Problems and Challenges

There are already many high-quality surveys of privacy-preserving computation for artificial intelligence. Most of these surveys classify the computing tasks according to the technologies, strategies, or application scenarios adopted by the privacy protection schemes, and give a detailed and comprehensive description of the current progress in each category.

In its section "Model Privacy Risk and Protection", reference [6] divides privacy protection methods into two categories, those based on differential privacy and those based on cryptography, where the cryptographic techniques mainly involve homomorphic encryption and secure multi-party computation. Reference [7] introduces current privacy protection schemes in detail according to differential privacy, homomorphic encryption, and secure multi-party computation; in its section "Classification of Machine Learning Privacy Protection Schemes", it divides privacy protection schemes into three categories according to how models are trained: centralized, distributed, and federated learning. In its section "Privacy Defense Schemes in Machine Learning", reference [8] summarizes privacy protection defense schemes according to five strategies, based on perturbation, approximation, generalization, adversarial techniques, and localization; the technologies involved include differential privacy, homomorphic encryption, L2 regularization, adversarial networks, security protocols, and so on. Reference [9] divides privacy-preserving machine learning schemes into two categories, those using cryptographic methods and those using data perturbation methods. For the schemes using cryptographic methods, it further sorts out the existing typical schemes according to four technologies: homomorphic encryption, garbled circuits, secret sharing, and secure processors. For the schemes using data perturbation methods, it further sorts out the existing typical schemes according to three technologies: differential privacy, local perturbation, and dimensionality reduction, where differential privacy mainly involves adding noise, local perturbation is mainly randomized response, and dimensionality reduction mainly transforms the data source to reduce its dimension.

For deep learning computing tasks, reference [10] divides the technologies involved in privacy-preserving computation into three categories, homomorphic encryption, secure multi-party computation, and differential privacy, and describes the privacy-preserving deep learning schemes in each category. For recommendation system computing tasks, reference [11] divides the technologies involved into four categories: homomorphic encryption, secure multi-party computation, secret sharing, and zero-knowledge proofs, where zero-knowledge proofs are mainly used in location-based recommendation systems to prove user identities anonymously. For genomic testing computing tasks, reference [12] divides the technologies involved into several categories, including homomorphic encryption, garbled circuits, oblivious transfer, private information retrieval, and encrypted finite automata, and summarizes the specific functions computed in this type of task, including edit distance, disease susceptibility, identity testing, ancestry testing, paternity testing, personalized medicine, and genetic matching.

Current surveys thus mainly classify and summarize artificial intelligence privacy protection schemes and artificial intelligence computing tasks from the perspectives of technology, strategy, and application. However, the computing model has a direct impact on actual production activities: whether the assumptions of a computing model match the application scenario of an actual artificial intelligence privacy protection scheme determines whether a scheme under that computing model can actually be deployed. This article therefore strives to organize the different privacy protection schemes and their computing tasks from the perspective of computing models, to present the main computing tasks completed by privacy protection schemes under each of the current computing models and the main algorithms they adopt, and to give principle and performance analyses of some of the algorithms that use cryptographic tools, in order to demonstrate the current state of the technology. After sorting out the privacy protection schemes under the various computing models, this article points out that improving the efficiency and security of privacy protection algorithms is a research direction that benefits every computing model.

3 Analysis of Research Status

This section sorts out the computing tasks involved in current artificial intelligence privacy protection schemes, and the typical cryptography-based algorithms they use, according to the four computing models. Table 1 summarizes the symbols used in this section and their meanings.

3.1 Multi-center model

The main idea of the multi-center model is to use more than two service providers to simulate the function of the secure multi-party computation TTP, thereby achieving the goal of privacy-preserving computation. Current privacy protection schemes use at most three service providers, so the definition of the three-center model is given below; the general definition can be extended similarly.












































































Definition 1 (Three-center model)  The participants of the three-center model are denoted P1, P2, ..., Pn and the service providers are denoted S0, S1, S2. For 1 ≤ i ≤ n and 0 ≤ j ≤ 2, participant Pi uploads a secret share xij to service provider Sj. On top of the participants' secret shares, the three service providers cooperate to compute some function (y1, y2, ..., yn) = F(x1, x2, ..., xn) and return the output yi to Pi.

In the three-center model, the participants share their own private data among the service providers, and the three service providers complete the privacy-preserving computation on the basis of the secret shares. Obviously, the three service providers cannot all be controlled by the attacker. This means that in actual applications the participants need to trust that t of the service providers are credible and will not leak private data, where t is determined by the threshold of the secret sharing scheme chosen by the participants.

Table 1  Symbols used in this article and their meanings

    Symbol              Meaning
    [P1, P2, ..., Pn]   A list of n participants
    [S1, S2, ..., Sm]   A list of m service providers
    xi                  Private data of participant Pi
    xij                 Secret share of xi, given to Sj or Pj
    sxij                Additive secret share of xi, given to Sj or Pj
    spij                Additive secret share of the product xi × x'i, given to Sj or Pj
    x(t)ij              Secret share of xi with threshold t, given to Sj or Pj
    r                   Random number
    r(l)j               Share of the l-th bit of r, given to service provider Sj
    k                   Bit length
    z >> k              Integer z shifted right by k bits
    x>j                 Secret share held by Sj of the bit [xi > x'i]
    x-j                 Secret share held by Sj of the difference x'i - xi
    x±j                 Secret share held by Sj of the sign of x
    k_y^w               Garbled-circuit wire key corresponding to bit w on wire y
    LSB(k_y^w)          The 0-th (least significant) bit of the wire key k_y^w
    {xi}                Ciphertext of the data xi
    π                   Random permutation function
    G, g, q             G = <g> is a cyclic group with generator g and order q
    skiu                Shared key of participants Pi and Pu
    OT                  Oblivious transfer
    PRF                 Pseudo-random function

In a general sense, the computation of the service providers in the multi-center model relies on a secret sharing scheme. For example, reference [13] gives the ABY3 framework, in which the participants share their private data using a secret sharing scheme for an arbitrary access structure [14]. Suppose the private data xi mod 2^k of participant Pi is a k-bit integer. Algorithm 1 gives the sharing process; its output xij is uploaded to Sj, 0 ≤ j ≤ 2. From the sharing process of Algorithm 1 it is obvious that any two service providers can recover the secret, so this is a (2, 3) threshold access structure. Reference [15] gives the Sharemind framework, in which a participant splits its secret into three additive shares and gives them to the three service providers respectively, forming a (3, 3) threshold access structure.

On the shares held by the service providers, addition and multiplication can be carried out. Operations such as computing a share of the sum of two secret values, or a share of the product of a public constant and a secret value, can be completed locally without communication. The multiplication of two secret values depends on the secret sharing scheme. When the secret sharing scheme is the arbitrary-access-structure scheme [14], suppose the service providers need to compute shares of the product xi × x'i. Since xi = sxi0 + sxi1 + sxi2 and x'i = sx'i0 + sx'i1 + sx'i2, equation (1) can be written, in which the terms on the j-th line can be computed by service provider Sj alone without communication. The three service providers therefore each obtain a (3, 3) share spij, 0 ≤ j ≤ 2, of the product, satisfying xi × x'i = spi0 + spi1 + spi2. After these product shares are perturbed by (3, 3) shares of 0, service provider Sj sends its result to Sj', j' = j + 1 mod 3, which completes the computation of the (2, 3) threshold shares of the product xi × x'i.

  xi × x'i = (sxi0 + sxi1 + sxi2)(sx'i0 + sx'i1 + sx'i2)
           = sxi0·sx'i0 + sxi0·sx'i1 + sxi1·sx'i0
           + sxi1·sx'i1 + sxi1·sx'i2 + sxi2·sx'i1
           + sxi2·sx'i2 + sxi2·sx'i0 + sxi0·sx'i2                      (1)
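As an illustration of Algorithm 1 and equation (1), the following minimal Python sketch shares k-bit integers in the replicated (2, 3) form described above and lets each simulated provider compute its (3, 3) product share locally. The zero-share perturbation and the resharing step back to (2, 3) shares are omitted, and the modulus, names, and parameters are assumptions made for the example.

```python
# A minimal sketch of replicated (2,3)-threshold additive sharing over Z_{2^k}
# (Algorithm 1) and of the local (3,3) product shares of equation (1).
import random

K = 64
MOD = 1 << K

def share(x):
    """Algorithm 1: split x into sx0, sx1, sx2 and give Sj the pair (sxj, sx(j+1))."""
    s0, s1 = random.randrange(MOD), random.randrange(MOD)
    s2 = (x - s0 - s1) % MOD
    s = [s0, s1, s2]
    return [(s[j], s[(j + 1) % 3]) for j in range(3)]     # share held by Sj

def local_product_share(share_x_j, share_y_j):
    """Line j of equation (1): Sj computes its (3,3) share of x*y locally."""
    (a, a_next), (b, b_next) = share_x_j, share_y_j
    return (a * b + a * b_next + a_next * b) % MOD

def reconstruct(shares):
    """The three distinct components sum to the secret (any two providers
    together already hold all three components)."""
    return (shares[0][0] + shares[1][0] + shares[2][0]) % MOD

# Usage sketch
x, y = 12345, 678
sx, sy = share(x), share(y)
sp = [local_product_share(sx[j], sy[j]) for j in range(3)]  # sp0+sp1+sp2 = x*y
assert sum(sp) % MOD == (x * y) % MOD
assert reconstruct(sx) == x
```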
When a k-bit integer is interpreted as a fixed-point number whose fractional part contains d bits, the result of a multiplication must additionally be divided by 2^d to preserve the fixed-point representation. Assume two correlated random numbers r = sr0 + sr1 + sr2 and r/2^d = srd0 + srd1 + srd2 are prepared, and that service provider Sj holds the secret shares (srj, srdj, srj', srdj'), j' = j + 1 mod 3, of these two numbers, together with a (3, 3) share szj of 0. After the unscaled product share spij of xi × x'i is obtained as above, it is scaled according to Algorithm 2 so that the result again satisfies the fixed-point requirement. When the inner product of two vectors of k-bit fixed-point numbers is computed, the product share of each component can also be computed locally according to equation (1) and the shares summed locally; the resulting (3, 3) share of the inner product is then divided by 2^d by the same process to obtain the final share of the inner product.

When the secret sharing scheme is additive secret sharing, reference [15] first converts the additive shares into shares of the arbitrary-access-structure scheme and then computes according to equation (1). Reference [16] adopts Beaver's precomputation method. Assume an integer c = ab, and that service provider Sj holds shares aj, bj, cj of a, b, c as well as the share sxij of the integer xi and the share sx'ij of the integer x'i. To compute xi × x'i, Sj only needs to compute sxij - aj and sx'ij - bj, after which the service providers reconstruct the values xi - a and x'i - b. From equation (2) it can then be seen that Sj can locally compute a share of xi × x'i as cj + aj(x'i - b) + bj(xi - a) + (xi - a)(x'i - b).

  xi × x'i = (xi + a - a)(x'i + b - b)
           = ab + a(x'i - b) + b(xi - a) + (xi - a)(x'i - b)            (2)

As can be seen from the above, the multiplication of secret shares requires either secret shares of 0 or multiplicative precomputed shares. For the secret sharing scheme with an arbitrary access structure, assume that the service providers S0, S1, S2 hold the shares xi0, xi1, xi2 respectively. Reference [17] gives a generation algorithm for pseudo-random secret shares, shown as Algorithm 3, in which x'ij is a share of a new pseudo-random value x'i = sx'i0 + sx'i1 + sx'i2 held by service provider Sj. The principle of Algorithm 3 is that, in secret sharing for an arbitrary access structure, the total number of share components held by all parties is a multiple of the size of the maximal unqualified sets, so that the XOR of all the derived components finally yields 0; for example, sx'i0 ⊕ sx'i1, sx'i1 ⊕ sx'i2 and sx'i2 ⊕ sx'i0 XOR to 0. The multiplicative precomputed shares can be generated with Algorithm 4 of reference [15]; reference [13] also adopts this method.

It is particularly important to note that the methods shown in equations (1) and (2) are suitable for the multi-center model in general and are not limited to 3 service providers. For example, if an arbitrary-access-structure secret sharing scheme with a (3, 5) threshold is adopted, the computation method of equation (1) still holds. In fact, when the number of service providers is large, there is a more efficient way to compute multiplications, discussed below.
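Returning to equation (2), the following minimal sketch illustrates Beaver-triple multiplication on additive shares. A stand-in dealer produces the triple (a, b, c = ab); the actual precomputation of reference [16] is not reproduced, and the modulus, the number of providers, and the helper names are assumptions made for this example.

```python
# A minimal sketch of Beaver-triple multiplication (equation (2)) on additive
# shares over Z_{2^k} among m providers; the triple comes from a toy dealer.
import random

K, MOD = 64, 1 << 64
M = 3                                   # number of service providers

def additive_share(x, m=M):
    s = [random.randrange(MOD) for _ in range(m - 1)]
    return s + [(x - sum(s)) % MOD]

def open_value(shares):
    return sum(shares) % MOD

def beaver_multiply(x_sh, y_sh, a_sh, b_sh, c_sh):
    # Each provider publishes its share of x - a and y - b; both are opened.
    e = open_value([(xs - as_) % MOD for xs, as_ in zip(x_sh, a_sh)])   # x - a
    f = open_value([(ys - bs) % MOD for ys, bs in zip(y_sh, b_sh)])     # y - b
    # Equation (2): x*y = c + a*f + b*e + e*f; the public term e*f is added once.
    z = [(cs + as_ * f + bs * e) % MOD for cs, as_, bs in zip(c_sh, a_sh, b_sh)]
    z[0] = (z[0] + e * f) % MOD
    return z

# Usage sketch
x, y = 31415, 2718
a, b = random.randrange(MOD), random.randrange(MOD)
x_sh, y_sh = additive_share(x), additive_share(y)
a_sh, b_sh, c_sh = additive_share(a), additive_share(b), additive_share(a * b % MOD)
assert open_value(beaver_multiply(x_sh, y_sh, a_sh, b_sh, c_sh)) == (x * y) % MOD
```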










































    Algorithm 3  Pseudo-random secret share generation
    Input: service provider Sj holds xij = (sxij, sxij'), j' = j + 1 mod 3, and the execution round number z
    Output: x'ij = (sx'ij, sx'ij')
  1. Through a pseudo-random function PRF, the service provider Sj computes sx'ij = PRF_sxij(z) and sx'ij' = PRF_sxij'(z), j' = j + 1 mod 3.
  2. Output x'ij = (sx'ij, sx'ij').
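The idea behind Algorithm 3 can be illustrated with the short sketch below: each provider derives fresh pseudo-random share components from the components it already holds, so the new shares stay mutually consistent without any communication. HMAC-SHA256 is used here only as a stand-in PRF and the parameters are illustrative; the actual instantiation used in references [13] and [17] may differ.

```python
# A minimal sketch of Algorithm 3: non-interactive re-derivation of replicated
# (2,3) shares of a fresh pseudo-random value, keyed on the components held.
import hmac, hashlib

K, MOD = 64, 1 << 64

def prf(key: int, z: int) -> int:
    mac = hmac.new(key.to_bytes(K // 8, "big"), z.to_bytes(8, "big"), hashlib.sha256)
    return int.from_bytes(mac.digest(), "big") % MOD

def algorithm3(share_j, z):
    """Sj holds (sx_j, sx_j'); it outputs the j-th replicated share of a fresh
    pseudo-random value x' = sx'_0 + sx'_1 + sx'_2 without communicating."""
    sx_j, sx_jnext = share_j
    return (prf(sx_j, z), prf(sx_jnext, z))

# Consistency check: because each component is held by two providers, the new
# shares agree pairwise, exactly as replicated (2,3) shares must.
s = [11, 22, 33]                                   # the three components of some xi
holdings = [(s[j], s[(j + 1) % 3]) for j in range(3)]
new = [algorithm3(holdings[j], z=7) for j in range(3)]
assert new[0][1] == new[1][0] and new[1][1] == new[2][0] and new[2][1] == new[0][0]
```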
    Algorithm 1  Participants' private data sharing algorithm
    Input: participant Pi's k-bit private data xi mod 2^k
    Output: the secret shares xij, 0 ≤ j ≤ 2, given to the service providers Sj
  1. Pi randomly selects sxi0, sxi1 mod 2^k and computes sxi2 = xi - sxi0 - sxi1 mod 2^k.
  2. Pi generates the secret shares xi0 = (sxi0, sxi1), xi1 = (sxi1, sxi2), xi2 = (sxi2, sxi0).
  3. Output xij, 0 ≤ j ≤ 2.
    Algorithm 2  Scaling algorithm for the product in fixed-point multiplication on secret shares
    Input: service provider Sj's secret shares (spij, szj, srj, srdj, srj', srdj'), j' = j + 1 mod 3, 0 ≤ j ≤ 2
    Output: secret shares of the fixed-point product xi × x'i
  1. Service provider Sj publishes spij + szj - srj.
  2. All service providers together recover tmp = xi × x'i - r.
  3. Service provider Sj computes tmp/2^d + srdj and tmp/2^d + srdj'.
  4. Output (tmp/2^d + srdj, tmp/2^d + srdj').
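The arithmetic behind Algorithm 2 can be checked with the following sketch. To stay short, it uses plain (3, 3) additive shares rather than the replicated shares above, adds the public value tmp/2^d to a single share component, and sidesteps the modular wrap-around that the full protocol has to handle by keeping the masking value r below the product. All of these simplifications and parameter choices are assumptions of the example, not part of the cited schemes.

```python
# A minimal sketch of the truncation step of Algorithm 2 on (3,3) additive
# shares over Z_{2^k}: reveal tmp = x*x' - r, then add tmp >> d to shares of r >> d.
import random

K, D, MOD = 32, 8, 1 << 32              # k-bit values, d-bit fractional part

def additive_share(v):
    s0, s1 = random.randrange(MOD), random.randrange(MOD)
    return [s0, s1, (v - s0 - s1) % MOD]

def truncate_product(sp, sz, sr, srd):
    # Steps 1-2: every Sj publishes sp_j + sz_j - sr_j; together they recover tmp.
    tmp = sum((spj + szj - srj) % MOD for spj, szj, srj in zip(sp, sz, sr)) % MOD
    # Step 3: the public value tmp >> D is added to a single share component
    # (adding it to every additive component would count it three times).
    out = list(srd)
    out[0] = (out[0] + (tmp >> D)) % MOD
    return out

# Usage sketch: two fixed-point numbers with D fractional bits.
x, y = int(3.5 * 2**D), int(1.25 * 2**D)
r = random.randrange(x * y)             # demo simplification: avoid modular wrap
sp = additive_share(x * y % MOD)        # unscaled product shares (e.g. from eq. (1))
sz = additive_share(0)                  # (3,3) shares of 0
sr, srd = additive_share(r), additive_share(r >> D)
result = sum(truncate_product(sp, sz, sr, srd)) % MOD
print(result / 2**D)                    # close to 3.5 * 1.25 = 4.375 (rounding error <= 2^-D)
```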
When the secret sharing scheme is Shamir secret sharing, reference [18] gives a pseudo-random secret share generation algorithm with two thresholds, with which the multiplication can be computed more efficiently. Assume that the smaller threshold is t, and that service provider Sj holds a t-threshold share x(t)ij of xi, a t-threshold share x'(t)ij of x'i, and a t-threshold share r(t)j and a 2t-threshold share r(2t)j of the same random number r. The computation of the product share is shown in Algorithm 5. Algorithm 5 cannot be used in the three-center model, because there the threshold would be too small and a secret share would be the secret value itself. For a more general multi-center model, however, when the fraction of tolerated corruptions is small enough, this method needs only one secret recovery operation and consumes one pair of random shares to complete a multiplication.

Completing more complex operations on the basis of secret shares requires a relatively large amount of communication. Take the comparison operation as an example. Assume that service provider Sj holds the share xij of a k-bit value xi under some secret sharing scheme and the share x'ij of a value x'i of the same length, and the service providers want to obtain shares of the bit [xi > x'i]. Sj first computes the share x-j of the difference x'i - xi, then obtains the share x±j of the sign of the difference according to Algorithm 6, and finally obtains the share of [xi > x'i] as 1 - x±j. In order to compute x±j, the service providers need to precompute shares of k binary random numbers, shares of a random number α, and shares of a random number β ∈ {+1, -1}; assume that service provider Sj participates in this precomputation and obtains r(1)j, r(2)j, ..., r(k)j, αj and βj. Without counting the cost of the precomputation, Algorithm 6 requires 2 secret recoveries, one chain of consecutive multiplications and k multiplications. Because one multiplication requires at least one round of communication, and taking concurrent operations into account, the number of communication rounds of this comparison protocol is at the O(log k) level and its communication volume is at the O(k) level. Reference [20] obtains a comparison protocol with a constant number of communication rounds by using Legendre symbols, which is more efficient. If the number of service providers in the multi-center model can be limited to three and an arbitrary-access-structure secret sharing scheme is agreed upon, garbled circuit technology, which needs fewer rounds of communication, can also be used, as in reference [13].
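The two-threshold multiplication mentioned above, with a t-sharing and a 2t-sharing of the same random value r and one public reconstruction per product, can be sketched as follows over Shamir shares. This is a sketch of the general double-sharing technique (often associated with Damgård-Nielsen-style protocols), not the paper's Algorithm 5 verbatim; the field, the number of providers, and the helper functions are illustrative assumptions.

```python
# A minimal sketch of multiplication on Shamir shares using a (t, 2t) double
# sharing of one random value r: local degree doubling plus one reconstruction.
import random

P = 2**61 - 1          # prime field modulus (assumption)
N, T = 7, 2            # 7 service providers, degree-T polynomials, N >= 2T+1

def shamir_share(secret, degree, n=N, p=P):
    """Return shares f(1..n) of a random degree-`degree` polynomial with f(0)=secret."""
    coeffs = [secret] + [random.randrange(p) for _ in range(degree)]
    return [sum(c * pow(i, e, p) for e, c in enumerate(coeffs)) % p
            for i in range(1, n + 1)]

def reconstruct(points, p=P):
    """Lagrange-interpolate f(0) from a list of (index, share) pairs."""
    secret = 0
    for i, yi in points:
        num, den = 1, 1
        for j, _ in points:
            if j != i:
                num = num * (-j) % p
                den = den * (i - j) % p
        secret = (secret + yi * num * pow(den, -1, p)) % p
    return secret

def dn_multiply(x_t, y_t, r_t, r_2t, p=P):
    """One multiplication: local share products, one public reconstruction."""
    # Each provider j locally multiplies its shares: a degree-2T sharing of x*y.
    xy_2t = [(a * b) % p for a, b in zip(x_t, y_t)]
    # Providers publish xy_2t - r_2t and jointly reconstruct d = x*y - r.
    masked = [(v - m) % p for v, m in zip(xy_2t, r_2t)]
    d = reconstruct(list(enumerate(masked, start=1))[: 2 * T + 1], p)
    # Each provider adds d to its degree-T share of r: a degree-T sharing of x*y.
    return [(d + s) % p for s in r_t]

# Usage sketch
x, y, r = 123, 456, random.randrange(P)
prod_t = dn_multiply(shamir_share(x, T), shamir_share(y, T),
                     shamir_share(r, T), shamir_share(r, 2 * T))
assert reconstruct(list(enumerate(prod_t, 1))[: T + 1]) == (x * y) % P
```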




















































    and number of entities, this paper classifies the computation methods of artificial intelligence privacy protection, which are
    based on cryptographic techniques into four types of computation models: multiple centers model, double centers model,
    single center model and real model. Except for the real model, there are trusted entities in all other computation models.
    For each kind of computation model, this paper presents the typical computations and algorithms, which are involved in the
    current artificial intelligence privacy protection methods based on cryptography tools. And this paper also points out that
    improving the efficiency and security of algorithms is an applicable research direction for each model.
    Key words: artificial intelligence; privacy protection; computation models; algorithms; protocols; cryptographic
    techniques
    Foundation Item(s): Guangdong Major Project of Basic and Applied Basic Research (No.2019B030302008); Key-
    Area Research and Development Program of Guangdong Province (No.2020B010166005); Huawei Technologies Co., Ltd.
    (No.TC20210407007, No.YBN2019105017)
    1 引言
    人工智能隐私保护具有现实需求 . 人工智能计算
    的应用场景非常丰富 . 例如银行可以通过用户的数据
    建立模型,然后用该模型对用户的信用进行评估 . 银行
    和电信公司可以通过两方的数据建立更好的模型,从
    而对用户信用进行更为准确的评估 . 多家医院可以通
    过各自用户的数据建立关于疾病的模型,以更好地完
    成诊疗服务 . 在不同的应用场景中,对模型、数据隐私
    保护的需求有所不同 . 模型隐私保护的需求往往来自
    其商业价值,因为一个预测准确率高的模型往往是经
    过大量的资金和人力投入获得的,并且可以作为一种
    服务[1]
    提供给消费者或者合作伙伴 . 数据隐私保护的
    收稿日期:2021-06-04;修回日期:2022-07-25;责任编辑:王天慧
    第 8 期 田海博:综述:基于密码技术的人工智能隐私保护计算模型
    需求实际来自各种法律法规 . 我国在《中华人民共和国
    民法典》和《中华人民共和国网络安全法》中明确了个
    人隐私数据和网络运营者使用这些数据时应当遵循的
    法律条文 . 这迫使数据提供方对数据附有各种使用策
    略[2]
    ,以满足法律法规的要求 .
    人工智能隐私保护的计算主要指在保证模型、数
    据隐私的条件下,完成模型的训练或者在已有模型上
    完成预测 . 完成隐私保护计算的实体数量和可信程度
    是不同的 . 例如银行和电信公司是两个参与方,为了进
    行隐私保护的计算,一种方法是引入一个隐私计算服
    务提供方和一个密钥分配服务提供方 . 假设密钥分配
    和隐私计算两个服务提供方不勾结,可以实现基于同
    态加密等技术的双参与方、双服务提供方的隐私保护
    计算 . 又例如多家医院形成多个参与方,为了进行隐私
    保护的计算,一种方法是引入一个联邦平均算法服务
    提供方 . 假设该服务提供方是半诚实的,可以实现多参
    与方、单服务提供方且能保护用户数据隐私的横向联
    邦学习 .
    多个参与方完成的计算形成了多方计算的场景 .
    密码学领域中的安全多方计算[3]
    是一种自然的技术
    选 择 . 安 全 多 方 计 算 使 n 个 互 不 信 任 的 参 与 方
    [P1 P2 Pn ] 能 够 完 成 某 个 函 数 F 的 计 算 任 务
    ( y1 y2 yn ) = F(x1 x2 xn ),其中 xi 是参与方 Pi 的隐
    私数据,1 ≤ i ≤ n,F 代表任意可以在图灵机模型下计算
    的函数,(y1 y2 yn ) 是该函数的输出结果[4]
    . 在计算
    任务完成后,参与方 Pi 能获得输出 yi,同时不泄露各自
    的隐私数据 xi. 多个参与方通过安全多方计算可以在
    保护数据、模型隐私的条件下,完成函数 F 为训练或者
    预测的计算 .
    Secure multi-party computation defines an ideal model to clarify the connotation of security
    [4]
    . In the ideal model, there is a credible first Three-party TTP (Trusted
    Third Party), each participant securely transmits private data to the
    TTP, and the TTP runs function F to obtain the output (y1 y2 yn ), and then send the
    output yi safely to the participant Pi, 1 ≤ i ≤ n. It is difficult to find
    like this in reality TTP, so the main task of secure multi-party computation is to provide a method to simulate
    TTP function. Secure multi-party computation starts from Mr. Yao's millionaire
    problem and goes through After 40 years of development, obfuscated circuits,
    inadvertent transmission, GMW (Goldreich, Micali, Wigderson) protocols,
    BGW (Ben-Or, Goldwasser) have been formed. , Wigderson) protocol, selection-segmentation protocol, zero-knowledge proof, etc. [5] . These technologies Under the condition of semi-honest attacker or malicious attacker, each participant can directly simulate the function of TTP and obtain the corresponding output. In terms of artificial intelligence privacy protection, there is still a big gap between the theory of secure multi-party computing and actual needs. In artificial intelligence privacy protection, the data that needs to be protected The scale often exceeds 10,000, and the number of model parameters that need to be protected often exceeds one million. When we run a good secure multi-party computing protocol on small-scale data When placed in the actual environment of artificial intelligence privacy protection, the communication or calculation load of these protocols is too large, resulting in the performance of the protocols often not meeting the actual requirements. Requirements. We have noticed that the current artificial intelligence privacy protection calculation actually adopts a compromise idea, that is, the service provider is introduced and the service provider is Make certain confidence assumptions to arrive at solutions that meet actual needs. In these solutions the number and credibility of service providers are are different. However, its tasks are similar. Borrowing the concept of secure multi-party computation, it can be said that the service provider simulates the function of TTP to provide participants with A computation in which one party provides privacy protection. For example, a privacy protection computation between two parties and two service providers can be regarded as two parties that do not trust each other. Participants, through the TTP completion function F simulated by two service providers, is the calculation process of some artificial intelligence algorithm.< a i=32> Based on the above facts, we believe that the current artificial intelligence privacy protection scheme can be divided into several categories according to the number and credibility of service providers. Divide it into several computing models, and then sort out the typical calculations performed under different computing models and the typical algorithms based on cryptography tools when completing computing tasks, which can allow Based on actual needs, readers can quickly understand the current progress of certain types of computing models and the main cryptographic technologies used, and prepare for further research or applications. Specifically, we classify the calculation of current artificial intelligence privacy protection into four categories: multi-center, dual-center, single-center model and realistic model. Now< /span> The confidence assumptions of these service providers include non-collusion, trustworthiness, semi /span> In the section "Model Privacy Risk and Protection", the literature [6] divides privacy protection methods into Differential privacy and cryptography-based 2 categories, among which cryptography technology mainly involves homomorphic encryption and secure multi-party computation. Literature [7] According to differential privacy, homogeneous Dynamic encryption and secure multi-party computation technology introduces the current privacy protection schemes in detail. 
In the section "Classification of Machine Learning Privacy Protection Schemes", the article According to the model training method, privacy protection solutions are divided into three categories: centralized, distributed and federated learning. Literature [8] in "Machine Learning" "Privacy Defense Schemes" section summarizes the privacy protection defense schemes based on five strategies: perturbation, approximation, generalization, adversarial and local. The technologies involved include differential privacy, homomorphic encryption, L2 regularization, adversarial networks, security protocols, etc. Literature [9] puts privacy protection into machine learning solutions Divided into two categories: solutions using cryptography methods and solutions using data perturbation methods: solutions using cryptography methods Journal of Electronics 2023 2261 At present, there have been many high-quality reviews on privacy-preserving computing in artificial intelligence. Most of these reviews focus on different technologies and strategies adopted by privacy-preserving solutions. or computing tasks in application scenarios, and provides a detailed and comprehensive description of the current progress of each category. 2 Problems and Challenges whether it can actually be implemented and whether it can complete the task of privacy protection. assumptions are not easy to satisfy, such as a trusted key distribution center. Therefore, the computing model to which the privacy protection scheme belongs can reflect its In some scenarios realize how many of the multiple service providers are trustworthy. One assumption. Some. Depending on the user's trust in the company, we can can be satisfied. For example, multiple competing companies Jointly provide certain services to users will not leak user data, etc. In specific application scenarios, some assumptions service providers; how many of the multiple service providers can be trusted? Trustworthy, Honesty, etc. For example, the key distribution center is generally trustworthy and will not collude with other F. (3) Single Central model: There is a service provider S that completes the calculation of function (2) Dual-center model: There are two service providers S1 and S2, which work together to complete the calculation of the F function. m > 2, cooperating to complete the calculation of the F function. (1) Multi-center model: There are multiple service providers [S1 S2 Sm], informal descriptions of computational models are as follows. There are no service providers and their confidence assumptions in the real model. Other types of




































































    scheme, the existing typical schemes are further sorted out according to four technologies: homomorphic encryption, Garbled circuit, secret sharing
    and security processor;
    For solutions using data perturbation methods, the existing typical solutions were further sorted out according to three technologies: differential privacy, local perturbation, and dimensionality reduction. . Among them, differential privacy mainly involves adding noise, local disturbance is mainly feedback random response, and dimensionality reduction mainly involves transforming the data source to reduce the dimension of data .< /span> >. For recommendation system computing tasks, the literature [11] considers privacy-preserving schemes< /span> and zero-knowledge proof. Among them, zero-knowledge proof is mainly based on location< In the recommendation system of /span> a> a> a> This section sorts out the current artificial intelligence implications according to four types of computing models 3 Analysis of research status A research direction that is beneficial to both computing models. this article points out that improving the efficiency and security of privacy protection algorithms is the key to various types of After sorting out privacy protection solutions under various computing models, to demonstrate the current technological progress. and the main algorithms used, and conduct principle analysis and performance analysis of some algorithms using cryptography tools degree, sort out different privacy protection schemes and computing tasks, and provide the main computing tasks completed by privacy protection schemes under various current computing models solution under this type of computing model can be actually deployed. Therefore, this article strives to analyze it from the perspective of the computing model Whether the application scenarios of the solution match or not determines whether the impact on actual production activities. Assumptions under the computing model and actual artificial intelligence privacy protection . However, the computing model has a direct artificial intelligence computing tasks, from technology, strategy, application Classified and summarized from the perspective of The current review mainly focuses on artificial intelligence privacy protection solutions and human Wait. The specific calculation functions involved include edit distance, disease susceptibility, identity testing, ancestry testing, paternity testing, personal medical treatment and genetic matching and finite automata for encryption, and summarizes the computing tasks of this type is divided into several categories: homomorphic encryption, Garbled circuit, inadvertent transmission, private information extraction detection computing task, the literature [12] lists the technologies involved in privacy-preserving computing, it is used to anonymously prove the identity of the user. For the gene The technologies involved in calculations are divided into four categories: homomorphic encryption, secure multi-party computation, secret sharing private 3 categories, and the privacy-preserving deep learning scheme in each category is described into homomorphic encryption, secure multi-party computation and differential concealment For deep learning computing tasks, the literature [10] divides the technologies involved in privacy-preserving computing


































  10. 1 Multi-center model
    The main content of the multi-center model is to use more than two service providers to protect artificial intelligence privacy in various application scenarios. In different scenarios, complete The credibility and number of entities in privacy-preserving calculations
    are not the same. The credibility and number of these entities have an important impact on whether the privacy-preserving calculation method can be applied in practice. This article starts from the credibility of entities Based on the degree and
    quantity, the artificial intelligence privacy protection calculation method based on cryptographic technology is classified into four calculation models, namely multi-center model, dual-center model, and single-center model< a i=4> mental model and reality model. Except for the reality model, there are trusted entities in other computing models. For each computing model, this article gives the current cryptography-based tools He also pointed out that improving the efficiency and security of algorithms is a research direction applicable to every computing model.. In different application scenarios, the model and data privacy their respective user data to better Complete diagnosis and treatment services to conduct a more accurate assessment of user credit. Multiple hospitals can build models about diseases through and telecommunications companies can build better models through the data of both parties. , from builds a model, and then uses the model to evaluate the user's credit. Banks has many application scenarios. For example, banks can use The user's data Artificial intelligence privacy protection has real needs. Artificial intelligence computing 1 Introduction (No.TC20210407007, No.YBN2019105017) Area Research and Development Program of Guangdong Province (No.2020B010166005); Huawei Technologies Co., Ltd. Foundation Item(s): Guangdong Major Project of Basic and Applied Basic Research (No.2019B030302008); Key- Key words: artificial intelligence; privacy protection; computation models; algorithms; protocols; cryptographic techniques improving the efficiency and security of algorithms is an applicable research direction for each model. current artificial intelligence privacy protection methods based on cryptography tools. And this paper also points out that For each kind of computation model, this paper presents the typical computations and algorithms, which are involved in the single center model and real model. Except for the real model, there are trusted entities in all other computation models. based on cryptographic techniques into four types of computation models: multiple centers model, double centers model, and number of entities, this paper classifies the computation methods of artificial intelligence privacy protection, which are Abstract: The application scenarios of artificial intelligence privacy protection are diverse. In different scenarios, the trustness and number of entities fulfilling privacy protection computation are different. The trustness and number of these entities have an important impact on the technical choices of privacy protection computation. Starting from the trustness(School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, Guangdong 510275, China) TIAN Hai-bo, LIANG Xiu-qi Based on Cryptographic Techniques A Survey: Computing Models of Artificial Intelligence Privacy Protection Journal of Electronic Journal URL: http://www.ejournal.org.cn DOI:10.12263/ DZXB.20210702 CLC classification number: TP309.2 Document identification code: A Article number: 0372-2112(2023)08-2260-17 (No.2020B010166005); Huawei Technologies Co., Ltd. (No.TC20210407007, No.YBN2019105017) Fund project: Guangdong Province Basic and Applied Basic Research Major Project (No. 2019B030302008) ;Guangdong Provincial Key Areas R&D Plan Project Keywords: artificial intelligence; privacy protection; computing model; algorithm; protocol; cryptography technology


































    The needs for protection are different. The need for model privacy protection often comes from
    its commercial value, because a model with high prediction accuracy is often
    Obtained through a large amount of capital and manpower investment, and can be provided as a
    service [1]
    to consumers or partners. Data privacy protection of
    Received date: 2021-06-04; Revised date: 2022-07-25; Editor in charge: Wang Tianhui
    Issue 8 Tian Haibo: Summary :Artificial intelligence privacy protection computing model based on cryptography technology
    The demand actually comes from various laws and regulations. China's "Civil Code of the People's Republic of China
    " and "People's Republic of China" The Cybersecurity Law of the Republic clarifies personal privacy data and the legal provisions that network operators should follow when using this data. This forces data providers to The data is accompanied by various usage policies[2] to meet the requirements of laws and regulations. Artificial Intelligence Privacy Protection The calculation mainly refers to completing the training of the model or completing the prediction on the existing model under the conditions of ensuring the privacy of the model and data. Completing the privacy protection calculation The number of entities and degree of trustworthiness are different. For example, banks and telecommunications companies are two participants. In order to carry out privacy protection calculations, a The method is to introduce a privacy computing service provider and a key distribution service provider. Assume that the key distribution and privacy computing service providers do not Collusion can realize the privacy protection of dual participants and dual service providers based on homomorphic encryption and other technologies. Another example is that multiple hospitals form multiple Participants, in order to perform privacy-preserving calculations, one approach is to introduce a federated average algorithm service provider. It is assumed that the service provider is semi-honest , can realize horizontal connection of multiple parameters and single service provider and can protect user data privacy State learning. Computations completed by multiple participants form a multi-party computation scenario. Secure multi-party computation in the field of cryptography [3] is a natural technology Choice. Secure multi-party computation enables n parties that do not trust each other [P1 P2 Pn] to complete the calculation task of a certain function F ( y1 y2 yn ) = F(x1 x2 xn ), where xi is the private data of participant Pi , 1 ≤ i ≤ n, F represents any function that can be calculated under the Turing machine model, (y1 y2 yn) is the output result of the function [4] . After the calculation task is completed, the participant Pi can obtain the output yi without leaking their own private data xi. Multiple participants Through secure multi-party computation, parties can complete the calculation of function F for training or prediction under the condition of protecting the privacy of data and models. Secure multi-party computation defines an ideal model to clarify the connotation of security [4]. In the ideal model, there is a trusted third party TTP (Trusted Third Party), each participant securely transmits private data to the TTP, and the TTP runs function F to obtain the output (y1 y2 yn ), and then send the output yi securely to the participant Pi, 1 ≤ i ≤ n. It is difficult to find such a TTP in reality , so the main task of secure multi-party computation is to provide a method to simulate TTP function. Secure multi-party computation started from Mr. Yao's millionaire problem and after 40 years The development of obfuscated circuits, inadvertent transmission, GMW (Goldreich, Micali, Wigderson) protocol, BGW (Ben-Or, Goldwasser, Wigderson) ) protocol, selection-segmentation protocol, zero-knowledge proof, etc. [5] . These technologies Under the condition of semi-honest attacker or malicious attacker, each participant can directly simulate the function of TTP and obtain the corresponding output. 
In artificial intelligence In terms of privacy protection, there is still a large gap between the theory of secure multi-party computation and the actual needs. In artificial intelligence privacy protection, the scale of data that needs to be protected is often exceeds 10,000, and the number of model parameters that need to be protected often exceeds one million. When we put a good secure multi-party computing protocol that runs well on small-scale data into In the actual environment of artificial intelligence privacy protection, the communication or calculation load of these protocols is too large, resulting in the performance of the protocols often not meeting actual needs. We have noticed that the current artificial intelligence privacy protection calculation actually adopts a compromise idea, that is, the service provider is introduced and certain requirements are imposed on the service provider. Confidence assumptions are made to arrive at solutions that meet actual needs. In these solutions the number and degree of trustworthiness of service providers vary . However, its tasks are similar. Borrowing the concept of secure multi-party computation, it can be said that the service provider simulates the function of TTP and provides participants with Privacy-preserving computation. For example, a privacy-preserving computation with two participants and two service providers can be regarded as two participants that do not trust each other. square, the TTP completion function F simulated by two service providers is the calculation process of a certain artificial intelligence algorithm. Based on the above facts, we believe that the current artificial intelligence privacy protection scheme can be divided into based on the number and credibility of service providers. Several computing models, and then sort out typical calculations performed under different computing models and typical algorithms based on cryptography tools when completing computing tasks, allowing readers to Based on actual needs, quickly understand the current progress of certain types of computing models and the main cryptographic technologies used, and prepare for further research or applications. a> Specifically, we attribute the calculation of current artificial intelligence privacy protection






































































    The categories are polycentric, dual-center, single-center model and realistic model. Currently
    there is no service provider and its confidence assumptions in the real model. Other categories a>
    The informal description of the computing model is as follows.
(1) Multi-center model: there are multiple service providers [S1 S2 … Sm], m > 2, that cooperate to complete the calculation of a function F.
(2) Dual-center model: there are two service providers S1 and S2 that together complete the calculation of the function F.
(3) Single-center model: there is a single service provider S that completes the calculation of the function F.
The trust assumptions placed on these service providers include non-collusion, trustworthiness, semi-honesty, and so on. For example, a key distribution center is generally assumed to be trustworthy and not to collude with other service providers, and some of multiple service providers are assumed to be trustworthy and not to leak user data. In specific application scenarios, some of these assumptions can be satisfied; for example, when several competing companies jointly provide users with a certain service, the assumption that some of the service providers are trusted can be met on the basis of the users' trust in those companies. Other assumptions are not easily met, such as the existence of a trusted key distribution center. Therefore, the computing model of a privacy protection scheme reflects whether it can be implemented in a given scenario and whether it can complete the privacy protection task.

2 Problems and Challenges
At present, there are already many high-quality reviews of privacy-preserving computing in artificial intelligence. Most of them classify schemes by the technologies and strategies they adopt, or by the computing tasks of their application scenarios, and describe the progress of each category in detail. Literature [6], in its section on model privacy risk and protection, divides privacy protection methods into two categories, those based on differential privacy and those based on cryptography, where the cryptographic techniques mainly involve homomorphic encryption and secure multi-party computation. Literature [7], in its section on the classification of machine learning privacy protection schemes, divides schemes into centralized, distributed, and federated learning according to the model training method, and introduces current schemes in detail on the basis of differential privacy, homomorphic encryption, and secure multi-party computation. Literature [8], in its section on privacy defense schemes in machine learning, summarizes defense schemes according to five strategies, namely perturbation, approximation, generalization, adversarial, and local methods, involving technologies such as differential privacy, homomorphic encryption, regularization, adversarial networks, and security protocols. Literature [9] divides privacy-preserving machine learning schemes into two categories, those using cryptographic methods and those using data perturbation methods: typical cryptographic schemes are further analyzed on the basis of four technologies, namely homomorphic encryption, garbled circuits, secret sharing, and secure processors, while typical perturbation schemes are analyzed on the basis of three technologies, namely differential privacy, local perturbation, and dimensionality reduction, where differential privacy mainly adds noise, local perturbation mainly uses randomized response, and dimensionality reduction mainly transforms the data source to reduce its dimension. For deep learning computing tasks, literature [10] divides the technologies involved in privacy-preserving computing into homomorphic encryption, secure multi-party computation, and differential privacy, and describes the privacy-preserving deep learning schemes in each category. For recommendation system computing tasks, literature [11] divides the technologies involved into four categories, namely homomorphic encryption, secure multi-party computation, secret sharing, and zero-knowledge proofs, where zero-knowledge proofs are mainly used in location-based recommendation systems to prove a user's identity anonymously. For gene detection computing tasks, literature [12] divides the technologies involved into homomorphic encryption, garbled circuits, oblivious transfer, private information retrieval, and encrypted finite automata, and summarizes the specific computation functions involved, including edit distance, disease susceptibility, identity testing, ancestry testing, paternity testing, personal medical care, and genetic matching.
The current reviews thus mainly classify and summarize artificial intelligence privacy protection schemes and the computing tasks of artificial intelligence from the perspectives of technology, strategy, and application. However, the computing model has a direct impact on actual production activities: whether the assumptions of a computing model match the actual application scenario of artificial intelligence privacy protection determines whether a scheme under that model can be practically deployed. Therefore, this article attempts to organize the different privacy protection schemes and computing tasks from the perspective of computing models, and gives the main computing tasks completed by privacy protection schemes under each of the current computing models. After sorting out the privacy protection schemes under the various computing models and the main algorithms they adopt, some principle analysis and performance analysis of algorithms that use cryptographic tools is carried out to demonstrate the current state of the technology. This article also points out that improving the efficiency and security of privacy protection algorithms is a research direction beneficial to all of the computing models.

3 Analysis of research status
This section, based on the four types of computing models, sorts out the computing tasks involved in current artificial intelligence privacy protection schemes and the typical algorithms based on cryptographic tools. Table 1 summarizes the symbols used in this section and their meanings.
3.1 Multi-center model
The main idea of the multi-center model is to use more than two service providers to simulate the TTP of secure multi-party computation, so as to achieve privacy-preserving computation. Current privacy protection schemes use at most three service providers, so the definition of the three-center model is given below; the general definition can be extended similarly.
Definition 1 (Three-center model) The participants of the three-center model are denoted P1, P2, …, Pn, and the service providers are denoted S0, S1, S2. For 1 ≤ i ≤ n and 0 ≤ j ≤ 2, participant Pi uploads the secret share xij to service provider Sj. On the basis of the participants' secret shares, the three service providers cooperate to compute a function (y1, y2, …, yn) = F(x1, x2, …, xn) and return the output yi to Pi.
In the three-center model, participants use secret sharing technology to upload their own data to the three service providers in the form of secret shares, and the three service providers then complete the privacy-preserving computation on the basis of these shares. The three service providers obviously cannot all be controlled by an attacker. This means that in actual applications, participants need to trust that t service providers will not leak private data, where t is the threshold of the secret sharing technology used by the participants.

Table 1 Symbols used in this article and their meanings
Symbol            Meaning
[P1 P2 … Pn]      a list containing n participants
[S1 S2 … Sm]      a list containing m service providers
xi                private data of participant Pi
xij               secret share of xi, given to Sj or Pj
sxij              additive secret share of xi, given to Sj or Pj
spij              additive secret share of xi × x'i, given to Sj or Pj
PRF               pseudo-random function
OT                oblivious transfer
k                 bit length of the data
z >> k            the integer z shifted right by k bits
x(t)ij            secret share of xi with threshold t, given to Sj or Pj
r(l)j             share of the l-th bit of the random number r, given to service provider Sj
x>j               secret share held by Sj of the result of xi > x'i
x-j               secret share held by Sj of the difference x'i - xi
x±j               secret share held by Sj of the sign of the difference
k^w_y             wire key corresponding to bit y of wire w in a garbled circuit
LSB(k^w_y)        the 0th bit of the wire key k^w_y
{xi}              the ciphertext of data xi
π                 random permutation function
G = <g>, q        G is a cyclic group with generator g, q is the order of the group
skiu              shared key of participants Pi and Pu

Generally speaking, the computation of the service providers in the multi-center model relies on a secret sharing scheme. For example, literature [13] gives the ABY3 framework, in which participants use the secret sharing scheme of an arbitrary access structure [14] to share their private data. Assume that the private data xi mod 2^k of participant Pi is a k-bit integer. Algorithm 1 gives the sharing process, and the output xij of Algorithm 1 is uploaded to Sj, 0 ≤ j ≤ 2. From the sharing process of Algorithm 1 it is obvious that any two service providers can recover the secret, so this is a (2, 3)-threshold access structure. Literature [15] gives the Sharemind framework, in which participants divide their secret into three shares according to additive secret sharing and provide them to the three service providers respectively, forming a (3, 3)-threshold access structure.
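To make the replicated structure behind Algorithm 1 concrete, the following Python sketch (an illustrative reconstruction; the 32-bit modulus and all names are assumptions, not code from the cited frameworks) splits a value into three additive pieces modulo 2^k, hands each service provider a pair of pieces, and checks that any two providers together can recover the secret.

import secrets

K = 32
MOD = 1 << K          # shares live in Z_{2^k}, as in Algorithm 1

def share(x_i):
    # Algorithm 1: three additive pieces with sx0 + sx1 + sx2 = x_i (mod 2^k);
    # service provider S_j receives the pair with indices j and (j+1) mod 3.
    sx = [secrets.randbelow(MOD), secrets.randbelow(MOD)]
    sx.append((x_i - sx[0] - sx[1]) % MOD)
    return [{j: sx[j], (j + 1) % 3: sx[(j + 1) % 3]} for j in range(3)]

def recover(view_a, view_b):
    # Any two providers together hold all three indexed pieces: a (2, 3) structure.
    pieces = {**view_a, **view_b}
    assert len(pieces) == 3
    return sum(pieces.values()) % MOD

x = 123456
views = share(x)
assert all(recover(views[a], views[b]) == x for a, b in [(0, 1), (1, 2), (2, 0)])

A single provider sees only two of the three uniformly random pieces, which by themselves reveal nothing about x.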
Service providers can perform additions and multiplications on the participants' shares. Operations such as computing a share of the sum of two secret values, or a share of the product of a public constant and a secret value, can be completed locally without communication. The multiplication operation depends on the secret sharing scheme. When the secret sharing scheme is that of an arbitrary access structure [14] and a secret share of the product xi × x'i needs to be computed, formula (1) can be written down, in which the calculation of row j can be completed by the service provider Sj alone without communication. The three parties therefore each obtain a (3, 3) share spij of the product xi × x'i, 0 ≤ j ≤ 2, satisfying xi × x'i = spi0 + spi1 + spi2. After perturbing its share, the service provider Sj sends it to Sj', j' = j + 1 mod 3, which completes the computation of the (2, 3)-threshold shares of the product xi × x'i.

  xi × x'i = (sxi0 + sxi1 + sxi2)(sx'i0 + sx'i1 + sx'i2)
           = sxi0·sx'i1 + sxi1·sx'i0 + sxi0·sx'i0
           + sxi1·sx'i2 + sxi2·sx'i1 + sxi1·sx'i1
           + sxi2·sx'i0 + sxi0·sx'i2 + sxi2·sx'i2          (1)
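As a sanity check on formula (1), the short Python sketch below (illustrative only; names and the 32-bit modulus are assumptions) lets each Sj multiply only the pieces it already holds and verifies that the three locally computed rows form a (3, 3) sharing of the product.

import secrets

MOD = 1 << 32

def pieces(v):
    s = [secrets.randbelow(MOD), secrets.randbelow(MOD)]
    s.append((v - s[0] - s[1]) % MOD)
    return s            # S_j's replicated view is (s[j], s[(j+1) % 3])

def row(j, sx, sy):
    # Row j of formula (1): every factor has index j or (j+1) mod 3,
    # so S_j can compute it without any communication.
    jp = (j + 1) % 3
    return (sx[j] * sy[jp] + sx[jp] * sy[j] + sx[j] * sy[j]) % MOD

x, y = 1234, 5678
sx, sy = pieces(x), pieces(y)
sp = [row(j, sx, sy) for j in range(3)]      # sp_j, computed locally by S_j
assert sum(sp) % MOD == (x * y) % MOD        # a (3, 3) sharing of x * y

Re-randomizing sp_j and sending it to the next provider, as described in the text, then restores the replicated (2, 3) form.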
When a k-bit integer is regarded as a fixed-point number with a d-bit fractional part, the multiplication operation also needs a division by 2^d to maintain the fixed-point representation. Suppose two random numbers r = sr0 + sr1 + sr2 and r/2^d = srd0 + srd1 + srd2 are prepared, that the service provider Sj holds the secret shares (srj, srdj, srj', srdj'), j' = j + 1 mod 3, of these two random numbers, and that the (3, 3) share of 0 at Sj is szj. Then, after obtaining the unscaled secret share spij of xi × x'i from the direct multiplication, Sj still has to scale it according to Algorithm 2 to meet the fixed-point requirement. When computing the inner product of two vectors of k-bit fixed-point numbers, the (3, 3) share of each component product can likewise be computed locally according to formula (1); the local sums then give a (3, 3) share of the inner product, and a single division by 2^d following the above process yields the final share of the inner product.
When the secret sharing scheme is additive secret sharing, literature [15] adopts the method of first converting the additive shares into shares of an arbitrary access structure and then computing according to formula (1). Literature [16] instead adopts the Beaver precomputation method. Assume an integer c = ab, and that the service provider Sj already holds the shares aj, bj, cj of a, b, c. In addition, the service provider Sj holds the share sxij of the integer xi and the share sx'ij of the integer x'i. To compute xi × x'i, the service provider Sj only needs to compute sxij - aj and sx'ij - bj, after which the service providers recover the values xi - a and x'i - b. From equation (2), Sj can then locally compute its share of xi × x'i as cj + aj(x'i - b) + bj(xi - a) + (xi - a)(x'i - b) (the final public term being contributed only once across the providers).

  xi × x'i = (xi + a - a)(x'i + b - b)
           = ab + a(x'i - b) + b(xi - a) + (xi - a)(x'i - b)          (2)
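Equation (2) is the core of the Beaver method of literature [16]; the sketch below is a minimal single-process illustration (the dealer, the 32-bit modulus and all names are assumptions), showing that only the masked values x - a and x' - b are ever opened.

import secrets

MOD = 1 << 32
rnd = lambda: secrets.randbelow(MOD)

def additive_share(v):
    s = [rnd(), rnd()]
    return [s[0], s[1], (v - s[0] - s[1]) % MOD]

# Precomputed Beaver triple c = a * b, handed out as additive shares.
a, b = rnd(), rnd()
a_sh, b_sh, c_sh = additive_share(a), additive_share(b), additive_share((a * b) % MOD)

def beaver_mul(x_sh, y_sh):
    # Providers open only the masked differences e = x - a and f = y - b.
    e = sum(xj - aj for xj, aj in zip(x_sh, a_sh)) % MOD
    f = sum(yj - bj for yj, bj in zip(y_sh, b_sh)) % MOD
    # Equation (2): x*y = c + a*f + b*e + e*f; the public term e*f is added once.
    z = [(c_sh[j] + a_sh[j] * f + b_sh[j] * e) % MOD for j in range(3)]
    z[0] = (z[0] + e * f) % MOD
    return z

x, y = 31415, 27182
z_sh = beaver_mul(additive_share(x), additive_share(y))
assert sum(z_sh) % MOD == (x * y) % MOD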
As the above algorithms show, the multiplication operation on secret shares requires either a secret share of 0 or a multiplication precomputation share. For the secret sharing scheme of an arbitrary access structure, assuming that the service providers S0, S1, S2 own the shares xi0, xi1, xi2 respectively, literature [17] gives a pseudo-random secret share generation algorithm, shown as Algorithm 3. In this algorithm, x'ij is the secret share, held by the service provider Sj, of a new random number x'i = sx'i0 + sx'i1 + sx'i2. The principle behind Algorithm 3 is that, for secret sharing over an arbitrary access structure, the total number of secret shares held by all parties is a multiple of the size of the largest unqualified set of the access structure, so XORing all the secret shares eventually yields 0; for example, sx'i0 ⊕ sx'i1, sx'i1 ⊕ sx'i2 and sx'i2 ⊕ sx'i0 are three secret shares of 0. Literature [13] adopts this method. For general access-structure secret sharing schemes, the multiplication precomputation shares are produced with Algorithm 4 of literature [15].
Algorithm 3 Pseudo-random secret share generation
Input: service provider Sj holds xij = (sxij, sxij'), j' = j + 1 mod 3, and the execution round number z
Output: x'ij = (sx'ij, sx'ij')
(1) Through a pseudo-random function PRF, the service provider Sj computes sx'ij = PRFsxij(z) and sx'ij' = PRFsxij'(z), j' = j + 1 mod 3.
(2) Output x'ij = (sx'ij, sx'ij').
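The following Python sketch illustrates the idea of Algorithm 3 under stated assumptions: HMAC-SHA256 stands in for the PRF (the cited works do not prescribe this choice), and the replicated seeds and round number are made up for the example. Because each seed is held by two providers, every provider obtains its pair of the new pseudo-random sharing locally and consistently, without communication.

import hashlib
import hmac

MOD = 1 << 32

def prf(seed, z):
    # HMAC-SHA256 used as an illustrative PRF keyed by the held share/seed.
    tag = hmac.new(seed.to_bytes(8, "big"), str(z).encode(), hashlib.sha256).digest()
    return int.from_bytes(tag[:4], "big") % MOD

seeds = [11, 22, 33]        # replicated: S_j holds (seeds[j], seeds[(j+1) % 3])
z = 7                       # execution round number

views = []
for j in range(3):
    jp = (j + 1) % 3
    views.append((prf(seeds[j], z), prf(seeds[jp], z)))   # S_j's local output

# Replication stays consistent across providers without any messages.
assert views[0][1] == views[1][0]
assert views[1][1] == views[2][0]
assert views[2][1] == views[0][0]

# The three distinct PRF outputs implicitly define a fresh shared random value.
x_prime = sum(prf(s, z) for s in seeds) % MOD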
Algorithm 1 Participant shares private data
Input: the k-bit private data xi mod 2^k of participant Pi
Output: the secret shares xij, 0 ≤ j ≤ 2, given to the service providers Sj
(1) Pi randomly selects sxi0, sxi1 mod 2^k and computes sxi2 = xi - sxi0 - sxi1 mod 2^k.
(2) Pi forms the secret shares xi0 = (sxi0, sxi1), xi1 = (sxi1, sxi2), xi2 = (sxi2, sxi0).
(3) Output xij, 0 ≤ j ≤ 2.
Algorithm 2 Scaling of the product in fixed-point multiplication of secret shares
Input: the secret shares (spij, szj, srj, srdj, srj', srdj') of service provider Sj, j' = j + 1 mod 3, 0 ≤ j ≤ 2
Output: a secret share of the fixed-point product xi × x'i
(1) Service provider Sj publishes spij + szj - srj.
(2) All service providers jointly recover tmp = xi × x'i - r.
(3) Service provider Sj computes tmp/2^d + srdj and tmp/2^d + srdj'.
(4) Output (tmp/2^d + srdj, tmp/2^d + srdj').
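The steps of Algorithm 2 can be traced with the small Python sketch below (illustrative assumptions: a 32-bit ring, 8 fractional bits, a dealer that hands out additive shares of r and r >> d, and r sampled below the product so that the opened value does not wrap; the zero-share rerandomization szj is omitted for brevity).

import secrets

K, D = 32, 8
MOD = 1 << K

def additive_share(v):
    s = [secrets.randbelow(MOD), secrets.randbelow(MOD)]
    return [s[0], s[1], (v - s[0] - s[1]) % MOD]

x = int(3.5 * (1 << D))            # fixed-point encodings with d = 8 fraction bits
y = int(2.25 * (1 << D))
p = x * y                          # unscaled product, carrying 2d fraction bits

r = secrets.randbelow(p)           # simplification: r < p, so p - r never wraps
sr, srd, sp = additive_share(r), additive_share(r >> D), additive_share(p)

# Steps (1)-(2): publish sp_j - sr_j and jointly recover tmp = p - r.
tmp = sum(sp[j] - sr[j] for j in range(3)) % MOD
# Step (3): each provider keeps its share of r >> d; the public tmp >> d is added once.
out = [srd[0], srd[1], srd[2]]
out[0] = (out[0] + (tmp >> D)) % MOD

assert abs(sum(out) % MOD - ((x * y) >> D)) <= 1   # scaled product, up to rounding

The one-unit error bound reflects the probabilistic rounding that this style of truncation introduces.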
It is particularly worth noting that the algorithms shown in formulas (1) and (2) apply to multi-center models in general and are not limited to three service providers; for example, for a general access-structure secret sharing scheme with a (3, 5) threshold, the calculation of formula (1) still holds. In fact, when the number of service providers is large there are more efficient ways to compute multiplications. When the secret sharing method is Shamir secret sharing, literature [18] gives a method that computes multiplications using two thresholds. Assume the smaller threshold is t, and that the service provider Sj holds the t-threshold share x(t)ij of xi, the t-threshold share x'(t)ij of x'i, and the t-threshold share r(t)j and 2t-threshold share r(2t)j of the same random number r. The calculation of the share of the product is shown in Algorithm 5. Algorithm 5 cannot be used in the three-center model, because there the threshold is so small that a secret share would be the secret value itself; however, for more general multi-center models with a smaller tolerated corruption ratio, this method needs only one secret recovery operation and consumes one pair of random shares per multiplication.
Completing more complex operations on secret shares requires a relatively large amount of communication. Taking the comparison operation as an example, assume the service provider Sj holds the share xij of a k-bit value xi under some secret sharing scheme and the share x'ij of a value x'i of the same length, and the service providers want shares of the result of xi > x'i. They first compute the share x-j of the difference x'i - xi, then obtain the share x±j of the sign of the difference according to Algorithm 6, and finally obtain the share of xi > x'i as 1 - x±j. To compute x±j, the service providers need to precompute shares of k binary random numbers, a share of a random number α, and a share of a random number β ∈ {+1, -1}; assume Sj participates in the precomputation and obtains r(1)j, r(2)j, …, r(k)j, αj and βj. Without counting the cost of the precomputation, Algorithm 6 requires two secret recoveries, one consecutive multiplication and k multiplications. Because a multiplication needs at least one round of communication, even with concurrent operations the above comparison protocol takes on the order of log k communication rounds with O(k) communication volume. Literature [20] obtains a comparison protocol with a constant number of communication rounds by using the Legendre symbol, and this protocol is more efficient.
If the number of service providers in the multi-center model needs to be limited, garbled circuit technology, which requires fewer rounds of communication, can be used. Literature [13] limits the number of service providers to three and agrees on a secret sharing scheme for an arbitrary access structure; under the semi-honest model, the TTP of secure multi-party computation can then be simulated on the basis of the secret shares.
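To illustrate the two-threshold multiplication attributed to literature [18] (Algorithm 5 in the text), the Python sketch below runs a (t, 2t) example with five providers over an assumed prime field; the prime, the party count and all names are illustrative assumptions. One public reconstruction of x·x' - r, together with the precomputed pair of sharings of the same r at degrees t and 2t, is enough to return to a t-threshold sharing of the product.

import random

P = 2**61 - 1              # prime field, chosen only for the sketch

def shamir_share(secret, degree, n):
    coeffs = [secret] + [random.randrange(P) for _ in range(degree)]
    return [sum(c * pow(pt, e, P) for e, c in enumerate(coeffs)) % P
            for pt in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at 0 over the evaluation points 1..n.
    n = len(shares)
    total = 0
    for i in range(1, n + 1):
        num, den = 1, 1
        for j in range(1, n + 1):
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        total += shares[i - 1] * num * pow(den, P - 2, P)
    return total % P

n, t = 5, 2
x, x2 = 123, 456
x_t, x2_t = shamir_share(x, t, n), shamir_share(x2, t, n)

r = random.randrange(P)
r_t = shamir_share(r, t, n)        # precomputed t-threshold sharing of r
r_2t = shamir_share(r, 2 * t, n)   # precomputed 2t-threshold sharing of the same r

# Local products have degree 2t; mask them with r_2t and open the result once.
masked = [(x_t[j] * x2_t[j] - r_2t[j]) % P for j in range(n)]
d = reconstruct(masked)            # the single secret recovery: d = x*x2 - r
z_t = [(d + r_t[j]) % P for j in range(n)]   # back to a t-threshold sharing

assert reconstruct(z_t) == (x * x2) % P

The sketch consumes exactly one pair (r_t, r_2t) and one reconstruction per multiplication, matching the cost described in the text.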