Security risk analysis and countermeasures for the circulation and use of data elements

Liu Yezheng1,2, Zong Lanfang1, Jin Dou1, Yuan Kun1,2

1 School of Management, Hefei University of Technology, Hefei, Anhui 230009, China

2 National Engineering Laboratory of Big Data Circulation and Transaction Technology, Shanghai 201203

Abstract: This paper systematically analyzes the security risks that arise in the circulation and use of data elements. Building on a summary of domestic and foreign data transaction systems and norms, theories, and technologies, it constructs a full-link (pre-event, in-event, and post-event) security risk response strategy for the circulation and use of data elements, and proposes a plan for building a safe and credible circulation and use system in which management and technology are mutually coordinated. The aim is to provide a reference for safe and controllable data transactions and to promote the stable and sustainable development of the data transaction market.

Key words: circulation of data elements; use of data elements; security risk analysis; security risk management; security risk technology

Paper citation format:

Liu Yezheng, Zong Lanfang, Jin Dou, et al. Security Risk Analysis and Countermeasures for the Circulation and Use of Data Elements [J]. Big Data, 2023, 9(2): 79-98.

LIU Y Z, ZONG L F, JIN D, et al. Security risk analysis and countermeasures in the circulation and use of data factors[J]. Big Data Research, 2023, 9(2): 79-98.

0 Preface

As a new factor of production in the digital economy, data is the foundation of digitization, networking, and intelligence. The circulation and use of data elements helps to promote data fusion and resource integration, activate the potential of data, and make the digital economy stronger, better, and bigger. However, the environment in which data elements circulate and are used is complex and involves multiple parties and links, and data products are easy to copy, non-exclusive, and difficult to trace. As a result, data circulation and use face security risks and privacy leakage challenges. These not only threaten national data security but also undermine the protection of the digital rights and interests of enterprises and individuals, and seriously hinder the market-oriented allocation, circulation, and use of data elements. The state has issued a series of policy documents to coordinate and promote the safe, credible, intensive, and efficient circulation and use of data elements. On December 2, 2022, the "Opinions of the Central Committee of the Communist Party of China and the State Council on Building a Data Basic System to Better Play the Role of Data Elements" clearly pointed out that data security is the bottom line and red line of data element circulation and transactions and the primary condition for carrying them out, and that it is necessary to coordinate development and security, implement the overall national security concept, strengthen the construction of a data security guarantee system, and build security into the entire process of data supply, circulation, and use. At the same time, national laws and regulations such as the "Data Security Law of the People's Republic of China" (hereinafter referred to as the "Data Security Law") and the "Personal Information Protection Law of the People's Republic of China" (hereinafter referred to as the "Personal Information Protection Law") have been promulgated one after another, making relevant provisions on data security and privacy protection during the circulation and use of data elements. To ensure the safe, reliable, intensive, and efficient circulation and use of data, identifying the security risk management requirements across the whole process of data element circulation and use, and constructing effective security risk response strategies, have become issues of great concern in academia and industry.

At present, research on security risk management for the circulation and use of data elements is mainly carried out independently along two lines, management system construction and security technology support: on the one hand, regulations, systems, and standards are formulated to clarify the security risks and security requirements of data transactions; on the other hand, technical means are used to solve the problem of managing and controlling data transaction security risks. In terms of building a security risk management system, the state has successively promulgated laws and regulations such as the "Data Security Law" and the "Personal Information Protection Law", and local governments at all levels have also formulated relevant systems, such as the "Shanghai Municipal Data Trading Place Management Implementation Measures (Draft for Solicitation of Opinions)" and the "Tianjin Interim Measures for the Administration of Data Transactions". In addition, the national standard "Information Security Technology-Data Transaction Service Security Requirements" regulates the security risk management requirements in data transactions from three perspectives: the security of data transaction participants, the security of transaction objects, and the security of the data transaction process. Scholars focus on mechanism issues such as data classification and grading, data desensitization, data storage, qualification review of transaction subjects, and security audit of the transaction process. In terms of security risk response technology, focusing on the security requirements of data circulation transactions such as access control, tamper-proofing, and traceability, scholars are currently mainly exploring the application of technologies such as blockchain, federated learning, digital watermarking, and data encryption in the data circulation and transaction market. Following the technical idea of decentralization and multi-party supervision, Fan Hang et al. combined secure multi-party computation with blockchain smart contracts and proposed a "computing contract" for safe and controllable data circulation, realizing "controllable and measurable uses". Thapa C et al. proposed that technologies such as homomorphic encryption and zero-knowledge proofs can be used on the blockchain to encrypt private data and thereby protect it.

Overall, research on the security risks of data element circulation covers the design of management mechanisms for the whole process of data circulation, the extension of data transaction security risk management from transaction objects to transaction subjects and transaction processes, and the security technologies and algorithm models that support the circulation and use of data elements. However, practical experience with data circulation and use shows that domestic data transaction service organizations currently face problems such as diverse and uncertain application scenarios, single and inflexible technical architectures, and mismatches between technologies and scenarios. As a result, the connotation of the security risk requirements for the safe and credible circulation of data elements is unclear, and so are the response strategies. In addition, the construction of management systems is separated from research on security technology; in particular, the boundary between the roles of technology and management in the compliant and credible circulation and transaction of data elements is not yet clear, and technology and management have not formed a joint force.

The overall structural framework of this paper is shown in Figure 1. The paper systematically analyzes the security risks in the circulation and use of data elements and proposes a full-link security risk management strategy for data circulation and use in which management and technology are coordinated, to provide a reference for ensuring a fair, efficient, safe, and orderly data element market.

Figure 1 The overall structural framework of this paper

1 Security risk analysis of the circulation and use of data elements

Compared with ordinary commodities, data elements are characterized by high fixed costs, low marginal costs, unclear property rights, diverse sources, and changeable structures. Their circulation and use involve a wider scope, more diverse subjects, and a more complicated process. Data elements are therefore harder to circulate and trade than ordinary commodities, and releasing their value entails more security risks. Drawing on relevant research at home and abroad, this section systematically analyzes the security risks of the circulation and use of data elements from three perspectives, namely the business life cycle, the data life cycle, and the circulation and use environment, laying a requirements foundation for building a safe and credible system for the circulation and use of data elements.

1.1 Security risk analysis from the perspective of business life cycle

The business life cycle refers to the whole process of the circulation and use of data elements. Following the literature, this paper divides the business life cycle of data elements into four stages: transaction application, transaction matching, transaction implementation, and transaction end.

The security risks at the transaction application stage can be summarized as transaction subject qualification risks, data access security risks, and product quality risks. The circulation and use of data elements involves multiple entities such as suppliers, demanders, and transaction service agencies, and the qualifications of these entities bear directly on the legality and compliance of data sources and of circulation and use. Xiao Jianhua et al. argue that different transaction entities should be subject to different qualification review requirements. For legal entities, the trading platform needs to review legal person information, business licenses, tax information, and so on; for individuals, the trading platform needs to review identity information, transaction purpose, scope of data use, and so on, to ensure that participants in data transactions are not prohibited or restricted by laws and regulations. Data is the subject matter of circulation and use. If non-compliant data flows into the market, it may seriously affect personal privacy, commercial security, and national security. Data access security review therefore needs to focus on whether data products contain data whose trading is prohibited, unauthorized personal data, commercially confidential data, and the like. Beyond meeting the security requirements for access, data elements entering circulation and use must also be checked for data quality risks. If forged or erroneous data is listed because of lax review, analyses based on that data may be invalid and cause huge losses to the demand side.
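The differentiated qualification review described above can be illustrated with a minimal sketch. It assumes a hypothetical platform that records which materials an applicant has submitted; the names `Applicant`, `REQUIRED_ITEMS`, and `review_qualification` are illustrative only and do not correspond to any real trading platform's interface. The item lists simply paraphrase the review items named in the text.

```python
from dataclasses import dataclass, field

# Hypothetical checklists paraphrasing the review items named in the text.
REQUIRED_ITEMS = {
    "legal_entity": {"legal_person_info", "business_license", "tax_info"},
    "individual": {"identity_info", "transaction_purpose", "data_use_scope"},
}


@dataclass
class Applicant:
    name: str
    kind: str  # "legal_entity" or "individual"
    submitted: set = field(default_factory=set)
    restricted_by_law: bool = False  # assumed to come from an external check


def review_qualification(applicant: Applicant) -> tuple[bool, list[str]]:
    """Return (approved, reasons); reasons is empty when approved."""
    if applicant.kind not in REQUIRED_ITEMS:
        return False, [f"unknown entity type: {applicant.kind}"]
    missing = REQUIRED_ITEMS[applicant.kind] - applicant.submitted
    reasons = [f"missing: {item}" for item in sorted(missing)]
    if applicant.restricted_by_law:
        reasons.append("prohibited or restricted by laws and regulations")
    return (not reasons), reasons
```

A legal entity that has submitted only its legal person information and business license would be rejected with the single reason that tax information is missing. A real review would of course also verify the authenticity of each document, which this sketch does not attempt.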

In the transaction matching stage, the main risks are supply-demand matching risks, transaction fairness risks, and transaction transparency risks. In terms of supply-demand matching, the data market is flooded with data. Faced with abundant data supplies of different scales and different emphases, it is very difficult to find the data that best meets a given need, and whether a match can be achieved in time and with adequate quality is the greatest matching risk. In terms of transaction fairness, most data circulation is carried out through a data trading platform that acts as both organizer and referee of the transaction; if the platform colludes with the buyer or the seller, the fairness of the transaction is difficult to guarantee. In addition, because the marginal cost of data products is close to zero, sellers have greater latitude to price discriminate. In terms of transaction transparency, the supply side often faces the challenge of deciding how to sell data and which data is more valuable, while the demand side cannot transparently inspect the data or verify the authenticity of the original data, and transparency guarantees are lacking for matters such as data discovery and storage.

The security risks in the transaction implementation stage are mainly reflected in the allocation of rights, pricing, and transaction clearing and settlement. A data transaction trades not only the data itself but also the various rights attached to it. Whether the exclusive rights claimed by each participant after the delivery of data products can be guaranteed bears on the smooth progress of data element circulation and transactions. As a special product, data differs greatly from traditional commodities in cost, unit of consumption, aggregation, mode of consumption, reuse, and resale, so pricing principles and methods must take different considerations into account. Versioning has become a common mechanism for designing and pricing data elements, and the prices of different versions can be tied to their value for different customer segments. This places a series of new requirements on the pricing of data elements, including fairness, freedom from arbitrage, truthfulness, privacy protection, and computational efficiency. At the same time, the pricing of data elements faces manipulation risks similar to those in traditional markets, namely the malicious suppression or inflation of prices. At clearing and settlement, both the supplier and the demander may face the risk of default: the demander risks receiving, after payment, data whose authenticity, timeliness, and completeness do not match the supplier's claims, and the supplier risks not receiving the agreed payment because the demander refuses to accept delivery, repudiates the transaction, or behaves similarly.
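The "no arbitrage" requirement on versioned pricing can be made concrete with a simplified sketch. It assumes, purely for illustration, that versions of a data product differ only in the number of records and that two smaller versions can be combined into a larger one; under that assumption a price list admits arbitrage whenever two versions bought separately cost less than the single version of the combined size. The function names and the price figures are hypothetical.

```python
from itertools import combinations_with_replacement


def find_arbitrage(prices: dict[int, float]) -> list[tuple[int, int, int]]:
    """Return (a, b, a + b) triples where buying sizes a and b separately
    is cheaper than buying the listed version of size a + b."""
    violations = []
    for a, b in combinations_with_replacement(sorted(prices), 2):
        total = a + b
        if total in prices and prices[a] + prices[b] < prices[total]:
            violations.append((a, b, total))
    return violations


def is_monotone(prices: dict[int, float]) -> bool:
    """A larger version should never be cheaper than a smaller one."""
    sizes = sorted(prices)
    return all(prices[s] <= prices[t] for s, t in zip(sizes, sizes[1:]))


# Hypothetical price lists: number of records -> price.
risky = {100: 10.0, 200: 25.0, 300: 30.0}
safe = {100: 10.0, 200: 18.0, 300: 24.0}
```

With the `risky` list, two 100-record versions cost 20 while the 200-record version costs 25, so `find_arbitrage(risky)` reports the triple `(100, 100, 200)`; the `safe` list yields no violations. Pricing schemes studied in the literature use richer notions of arbitrage than this record-count check.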

At the end of the transaction, the main security risks are illegal use, resale, and re-identification. At this stage, security risks come mainly from the demand side. After the data is delivered, a dishonest demander may use the data beyond the agreed scope, infringing the legitimate rights and interests of the supplier and even threatening the security of multiple parties, or may recirculate and resell the purchased data products. Although data involving user identity information is cleaned, encrypted, and anonymized before the transaction, the continuous growth of public information and the continuous development of Internet information technology mean that anonymized data may be re-identified.
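One common way to gauge the re-identification risk mentioned above is a k-anonymity check, which is a technique brought in here for illustration rather than one named in the text. The sketch below counts how many records share each combination of quasi-identifiers; a record whose combination is unique could be linked to outside public information. The toy records and the choice of quasi-identifier columns are invented for the example.

```python
from collections import Counter


def _key(record: dict, quasi_identifiers: list[str]) -> tuple:
    return tuple(record[q] for q in quasi_identifiers)


def k_anonymity(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(_key(r, quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def unique_records(records: list[dict], quasi_identifiers: list[str]) -> list[dict]:
    """Records whose quasi-identifier combination appears exactly once."""
    groups = Counter(_key(r, quasi_identifiers) for r in records)
    return [r for r in records if groups[_key(r, quasi_identifiers)] == 1]


# Invented toy data: names already removed, quasi-identifiers retained.
QUASI_IDENTIFIERS = ["zip", "birth_year", "gender"]
RECORDS = [
    {"zip": "230009", "birth_year": 1985, "gender": "F", "diagnosis": "A"},
    {"zip": "230009", "birth_year": 1985, "gender": "F", "diagnosis": "B"},
    {"zip": "201203", "birth_year": 1990, "gender": "M", "diagnosis": "C"},
]
```

Here `k_anonymity(RECORDS, QUASI_IDENTIFIERS)` is 1 because the third record is the only one with its combination of postcode, birth year, and gender, so anyone who knows those three attributes from another source can recover that person's diagnosis.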

1.2 Security risk analysis from the perspective of data life cycle

The data life cycle refers to the whole process of data from generation or acquisition to destruction. According to the relevant operation process for the circulation and use of data elements, the data life cycle is divided into four stages: collection and storage, delivery and transmission, processing and use, and backup and destruction.

The security risks of collection and storage mainly include collection security risks, infringement risks, and storage security risks. The quality standard of data collection affects data quality along the entire link, and the authenticity, integrity, and reliability of the original data bear directly on subsequent data mining and analysis. If the collected raw data does not reflect the objective situation, the predictions of models built on it will be biased, which impairs the usability of data products. Data collection must also strictly abide by legal principles such as users' informed consent and minimum necessity. In practice, however, many smart device manufacturers and app companies over-collect users' personal information in order to achieve precise marketing and obtain more accurate user portraits, and even "monitor" users' smart devices, making users transparent in cyberspace and seriously infringing on individuals' right to know and right to privacy. Data is generally stored in the cloud or in a distributed file system. Encrypting data directly in the cloud brings heavy computing overhead and increases key management risk, and attacks on one or more nodes of a distributed store may directly affect computation results.

The security risks of delivery and transmission mainly come from network hardware risks and external attack risks. During long-distance network transmission, data faces the risk of packet loss caused by network instability and the risk of delayed delivery caused by insufficient network bandwidth; network hardware risks become more prominent for large-scale data transmission. Data is rapidly aggregated and forwarded over multiple paths and is vulnerable to virus implantation and attacks. The aggregation and transmission of large-scale data lowers the cost of external attacks and raises the payoff of a single attack, which attracts hackers. The sharing and generation of keys between users and servers is an important risk point in data transmission, and social engineering has become an important means of external attack and data theft.

The security risks of processing and use are most prominent in privacy disclosure, security attacks, and data abuse. Turning raw data into tradable desensitized data and model-based data products requires desensitization, analysis, and testing with the help of big data technology. However, big data technology faces two types of privacy leakage risk during learning and training: the risk that unauthorized users obtain the data directly, and the risk that attackers infer sensitive information in the data set by indirect means. During processing and use, data is vulnerable to attacks from multiple parties, such as forging or modifying data, attacking model parameters, and maliciously attacking servers. Because the purpose and amount of data use are difficult to monitor and measure, data may, driven by interests, be used excessively, and an industrial chain of illegal data transactions may even emerge, causing serious harm to personal privacy and national security.

The security risks of backup and destruction include backup audit security risks and destruction security risks. After a data circulation transaction is completed, the relevant transaction logs need to be generated and backed up. However, unauthorized changes or deletions, backups made to other machines, and similar problems may occur during backup, so that the logs cannot serve as a reliable basis for query, analysis, audit, and dispute arbitration concerning the transaction process. Data destruction security refers to the prevention and control measures taken, through data deletion, destruction, and sanitization mechanisms, to keep data from being recovered when it is cleared from the systems and equipment involved in regulated business and services. Untimely or incomplete destruction gives insiders and hackers an opening and may lead to data leakage, re-identification of personal information, secondary resale of data, and so on. This is especially so when data is stored in the cloud: the cloud service provider may fail to destroy the data as the user's deletion instruction requires and instead maliciously retain it, exposing the data to the risk of leakage.
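The risk of unauthorized changes or deletions in backed-up transaction logs can be illustrated with a hash chain, a standard tamper-evidence technique that is offered here as an assumption and is not a method described in the text. Each log record stores the hash of the previous record, so that altering or removing an earlier record breaks every later link. The record layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first record


def _digest(prev_hash: str, entry: dict) -> str:
    # Serialize deterministically so the same entry always hashes the same.
    payload = json.dumps(entry, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, entry: dict) -> None:
    """Append a transaction entry linked to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev": prev_hash, "hash": _digest(prev_hash, entry)})


def verify_log(log: list) -> int:
    """Return the index of the first inconsistent record, or -1 if none."""
    prev_hash = GENESIS
    for i, record in enumerate(log):
        expected = _digest(prev_hash, record["entry"])
        if record["prev"] != prev_hash or record["hash"] != expected:
            return i
        prev_hash = record["hash"]
    return -1
```

If a backed-up entry is edited or a record in the middle is deleted, `verify_log` returns the position where the chain first breaks. A chain of this kind does not by itself reveal records trimmed from the end of the log; detecting that requires keeping the latest hash with an independent party, which is the role blockchain-based designs typically assign to multiple supervising nodes.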

1.3 Security risk analysis from the perspective of the circulation and use environment

The circulation and use environment refers to the environment involved across the entire business life cycle of data elements in circulation and use. It can be divided into three parts: the circulation trading platform, the software environment, and the hardware environment. During circulation and use, the entire process from transaction application to transaction completion takes place on the circulation trading platform, and specific operations such as detection, desensitization, and mining rely on the platform's environment. At the same time, computations such as the collection and sorting of data elements and modeling analysis are carried out by algorithms in the software environment, and running these algorithms requires computing resources provided by the hardware infrastructure.

The security risks of circulation trading platforms are mainly manifested in four capabilities: access control, environmental adaptability, operation, and content exchange control. Access control capability means that benign users should be able to access the system while harmful users are denied, reflecting the scalability and security of the platform. Environmental adaptability refers to the flexibility and reliability the platform should have in the face of internal and external changes: on the one hand the platform should be able to operate in different environments, and on the other hand its internal structure should remain relatively stable. Operating capability refers to how well the platform realizes the circulation and use of data elements, and covers usefulness and ease of use: usefulness reflects the platform's transaction processing capability, while ease of use means occupying minimal system resources when implementing business functions so as to ensure system performance, for example fast access and simple operation. Content exchange control capability refers to the connectivity and privacy of the platform, which must support normal content exchange while protecting privacy.

The security risks of the software environment are reflected in system software risks and application software risks. The circulation and use of data elements requires the support of various system and application software. Such software contains various vulnerabilities and may even hide malicious code that is very difficult to detect, which brings huge potential risks to circulation and use. Algorithms are a special class of application in the circulation of data elements. With the adoption of various deep learning and collaborative learning models, the computation logic and interaction logic of algorithms are becoming increasingly complex and diverse, so that the interpretability of algorithm results is unsatisfactory and the security of the algorithms themselves is difficult to control. In addition, many algorithms are designed under certain security assumptions, for example that the participants follow the specified rules and protocol and do not collude. This adds a security assumption risk: when an algorithm's security assumptions are not satisfied, its results may be unpredictable.

Hardware environment security risks are the security risks of the critical information infrastructure required for data storage and computation, and are mainly divided into computer physical security risks and computer network security risks. Computer physical security risks include abnormal damage, theft, and illegal use of computers; computer network security risks include attacks on computer network equipment, computer network systems, and databases. In addition, hardware-related security risks include whether the manufacturers that supply and build the hardware environment are trustworthy, whether there have been incidents of devices reading information without permission or of product quality failures, whether devices are faulty, whether transmission is delayed, and whether hardware Trojans are present. If hardware devices are vulnerable to attack and fail frequently, the healthy development of industries related to data elements will be seriously affected.

2 Analysis of countermeasures against security risks in the circulation and use of data elements at home and abroad

The digital economy is gradually entering a period of high-quality development, and the data element market is paying more and more attention to data security. Issues such as the compliance review of data elements before they enter the market, the control of the purpose and amount of use during circulation and use, and dispute resolution after circulation and use place higher requirements on the security governance and protection of data elements. Therefore, from the perspective of "pre-examination→in-process monitoring→post-audit", this paper summarizes existing domestic and foreign security risk response strategies for data elements in terms of systems and policies and of theories and technologies. At present, relevant work at home and abroad concentrates on two aspects, the construction of systems and norms and the support of theory and technology: on the one hand, policies, systems, and standards are formulated to clarify the security risk management requirements for data circulation and use; on the other hand, theoretical and technical means are used to solve the problems of managing and controlling security risks in data circulation and use.

2.1 Analysis of coping strategies from the perspective of system and norm construction

In recent years, China's data element market has developed rapidly and its scale has expanded quickly. The "China Data Elements Market Development Report (2021-2022)" shows that China's data element market reached 81.5 billion yuan in 2021 and that the compound growth rate of the market size is expected to exceed 25% during the "14th Five-Year Plan" period. To prevent security risk events in the data element market, the state has issued a series of policy documents and rules and regulations to coordinate data element security risk management. The "Fourteenth Five-Year Plan for National Economic and Social Development and Outline of Long-term Goals for 2035", issued in March 2021, clearly states that it is necessary to cultivate standardized data trading platforms and market players and to develop market operation systems such as data asset evaluation, registration and settlement, transaction matchmaking, and dispute arbitration. In November 2021, the "14th Five-Year" Big Data Industry Development Plan issued by the Ministry of Industry and Information Technology not only addressed the construction of the data element market again but also focused on accelerating the cultivation of the data element market, giving full play to the characteristics of big data, consolidating the foundation of industrial development, building a stable and efficient industrial chain, creating a prosperous and orderly industrial ecosystem, and building a solid line of defense for data security. On December 2, 2022, the "Opinions of the Central Committee of the Communist Party of China and the State Council on Building a Data Basic System to Better Play the Role of Data Elements" emphasized improving the full-process compliance and regulatory rule system for data and, starting from full-process governance and innovative regulatory mechanisms, proposed a data element security governance system with a bottom line.

Pre-examination is the premise of security risk management and control for the circulation and use of data elements. It mainly means that, before a transaction, the market or its managers review the participants and data products in the data trading market in accordance with relevant laws and regulations, so that data is listed only after review and purchasers are qualified. At the national level, the "Data Security Law" clearly stipulates that data transaction service agencies should review the identities of both parties to a transaction, the content of the transaction data, and data security risks, and keep review and transaction records. At the local level, Tianjin has promulgated the "Tianjin Interim Measures for the Administration of Data Transactions", Chapters 2 and 3 of which set out clear requirements for data transaction entities and transaction data respectively. At the industry level, measures are taken to ensure that data sources are compliant and credible and that data quality is safe and controllable. For example, the data transaction rule system released by Guiyang Big Data Exchange includes the "Guidelines for Data Transaction Compliance Review", the "Guidelines for Data Transaction Security Assessment", and the "Guidelines for Data Vendor Access and Operation Management", which ensure the credibility and controllability of transaction subjects and transaction objects during the circulation and use of data elements. However, the legal system for data classification and grading management, data authorization, and related matters needs further improvement. For example, Yuan Kang et al. argue that although the "Data Security Law" clearly states that the state will implement classified and graded protection of data, it makes only general provisions and lacks a detailed classification and grading system and related implementation rules; inconsistent procedural standards across regions and departments can easily lead to conflicts between data access and supervision. Wang Jiandong et al. point out that although the "Data Security Law" and the "Personal Information Protection Law" have resolved the issues of national data sovereignty and personality rights at the legislative level, data property rights have not yet been clearly defined in law. Unique features of data elements such as replicability and uncertainty make it difficult to establish a data property rights system and create operational difficulties for reviewing the sources of data entering transactions.

In-process monitoring is the basis of security risk management and control for the circulation and use of data elements. Its purpose is to control the purpose and amount of data use, restrain the behavior of transaction subjects, and supervise the compliance of transaction orders. The "Opinions of the Central Committee of the Communist Party of China and the State Council on Building a Data Basic System to Better Play the Role of Data Elements" proposes establishing a compliant and efficient data element circulation and transaction system, improving the full-process compliance and regulatory rule system for data, and building a standardized data transaction market. Local governments have successively issued policies to promote the safe and reliable circulation of data elements. Beijing issued the "Beijing Digital Economy Promotion Regulations", which require improving data classification and grading, security risk assessment, and security assurance measures, establishing data governance and compliance operation systems, conducting security assessments of anonymization and de-identification technologies based on application scenarios, and carrying out standards certification in data security. Shanghai issued the "Shanghai Data Regulations" to support the orderly development of data transaction service organizations, requiring them to establish a standardized, transparent, safe, controllable, and traceable data transaction service environment, formulate transaction service processes and internal management systems, and take effective measures to protect data security. The "Guidelines for Data Transaction Compliance Review" released by Guiyang Big Data Exchange also require compliance review of the content and delivery methods of transaction contracts, and the exchange has additionally issued the "Guidelines for Data Product Cost Evaluation 1.0", the "Guidelines for Data Product Transaction Price Evaluation 1.0", and the "Guidelines for Valuation of Data Assets 1.0" to provide a basis for value assessment and pricing in data transactions. However, there are still obvious deficiencies in the pricing mechanism and in data transaction legislation. At present, the pricing mechanisms of different data trading platforms are not transparent. For example, on one platform the "provincial business platform data service" is priced at 3.5156 million yuan per use, while the "computing resource service (cloud computing service)" is priced at 0.01 yuan per use. It is therefore necessary to improve and unify pricing rules for data circulation, standardize the units and modes of data consumption, and prevent arbitrary pricing. In terms of legislation, provisions on the circulation and use of data elements are scattered across the "Civil Code of the People's Republic of China", the "Personal Information Protection Law", the "Data Security Law", the "Network Security Law of the People's Republic of China" (hereinafter referred to as the "Network Security Law"), the "Anti-Monopoly Law of the People's Republic of China", and the "Anti-Unfair Competition Law of the People's Republic of China"; there is no dedicated law on the circulation and transaction of data elements. In contrast, the United States introduced the "Data Brokers Accountability and Transparency Act" in 2014 and the "Data Brokers Act of 2019" in 2019, requiring data brokers to make clear the source and type of their data, the manner in which data is used, stored, and distributed, the extent to which consumers may access and modify data, and the means by which consumers can opt out of data sales or sharing.

Post-event audit is the key to managing and controlling security risks in the circulation and use of data elements, and its purpose is to resolve post-transaction disputes. Regarding the credit system of the data element market, the "Opinions of the Central Committee of the Communist Party of China and the State Council on Building a Data Basic System to Better Play the Role of Data Elements" proposes that a transaction arbitration mechanism be established and that the credit of data transaction subjects be managed and evaluated, so as to form a trading ecosystem of integrity, mutual trust, and credibility in the data element market. At the enterprise level, the Beijing International Big Data Exchange issued the "Guidelines for Beijing Data Transaction Services", which implement the principle that data protection obligations carry over through transactions, safeguard the scope of use and the prohibited uses specified in the transaction, establish a protection system for data element property rights, and set up dispute resolution mechanisms between buyers and sellers. The "Guidelines for Compliance Review of Data Transactions" released by Guiyang Big Data Exchange also covers compliance review of post-transaction application scenarios and of newly derived data products.

However, the data leakage notification system and the allocation of data supervision authority still need continuous improvement. Although the "Network Security Law" sets out requirements for data leakage notification, it does not clearly specify institutional elements such as the circumstances in which users must be notified, the time limit and method of notification, remedial and punitive measures for data leakage, and the scope of subjects to which the system applies, and it therefore lacks operability. In China, data supervision is coordinated overall by the cyberspace administration authorities, while each industry department supervises its own sector. In practice, however, the division of powers and responsibilities among data supervision departments and dispute arbitration institutions is unclear, and it is not uncommon for them to shift responsibility onto one another. The systems for data supervision and dispute arbitration should be improved and the relevant powers and responsibilities clarified, so as to form a dual security guarantee of industry self-discipline and government supervision.

2.2 Analysis of coping strategies from the perspective of theory and technology

2.2.1 Prior review

For participant qualification review, identity authentication and access control technologies are typically used to secure the qualifications of transaction subjects and to ensure that the identity information provided by data suppliers and demanders is authentic and reliable. Traditional identity authentication mainly comprises token-based, biometric-based, and key-based authentication, which carry risks such as password leakage and forged biometrics. In recent years, blockchain technology has been applied to identity authentication. Because blockchain is decentralized and tamper-resistant, it can provide technical support for securing subject qualifications. For example, Dixit A et al. use blockchain and decentralized identifiers (DIDs) for subject verification in an Internet of Things data market: each subject holds a unique DID, and verifying the DID on the client side confirms the identity of the transaction subjects on the platform. For access control, Du Ziran et al. proposed the TID-MOP security framework, which applies security management and control to data transaction applications at the technical support level and, through centralized operation and maintenance monitoring and access permission management, focuses on evaluating the compliance qualifications of transaction entities.

In reviewing the legality, compliance, and authenticity of data elements, de-identification, sensitive data detection, and integrity technologies provide technical guarantees for the safe admission of data products. De-identification reduces the association between the information in a data set and the information subject by processing the original data. The main techniques include statistical techniques, suppression, anonymization, pseudonymization, generalization, and randomization. Each has different characteristics, so data suppliers can choose a suitable technique according to the characteristics and confidentiality level of the data being traded, ensuring that data products can enter the data element market. To address sensitive information contained in data products, He Wenzhu et al. proposed an algorithm for automatically identifying and grading sensitive attributes in structured data sets. It uses information entropy to define attribute sensitivity and, by identifying the sensitive attributes of any structured data set and quantifying their sensitivity, classifies and grades those attributes. Liu Jin established a sensitive data identification system by analyzing the life cycle of sensitive data circulation and applying data feature techniques, so as to strengthen control over sensitive data and reduce the risk of it entering the market.

Data integrity technology both guarantees the quality of the data being traded and protects that data from malicious tampering. Cryptography and data replication are the two traditional approaches. Cryptographic techniques use message authentication codes and hash trees to generate signature information for the data and prevent forgery, whereas data replication strategies ensure integrity at the cost of additional storage space. In practice, the two are generally combined to ensure data quality and security. Nasonov D et al. proposed a blockchain-based integrity verification technology for the circulation and transaction of enterprise data elements.
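As a rough illustration of the two cryptographic primitives named above, the following Python sketch computes a message authentication code for a single record and a hash-tree (Merkle) root over a small data set. The key, the record contents, and the function names `mac_tag` and `merkle_root` are hypothetical choices for this example and do not come from any of the cited schemes.

```python
import hashlib
import hmac


def mac_tag(key: bytes, record: bytes) -> str:
    """Message authentication code (HMAC-SHA256) for one record."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def merkle_root(records: list[bytes]) -> str:
    """Root of a simple hash tree; the last node is duplicated on odd levels."""
    if not records:
        raise ValueError("at least one record is required")
    level = [hashlib.sha256(r).digest() for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


key = b"hypothetical-shared-key"      # assumed to be agreed out of band
records = [b"row-1", b"row-2", b"row-3"]
tag = mac_tag(key, records[0])        # published alongside the record
root = merkle_root(records)           # published alongside the data set

# The data demander recomputes both values and compares them.
tag_ok = hmac.compare_digest(tag, mac_tag(key, b"row-1"))
tampered = [b"row-1", b"row-X", b"row-3"]
root_ok = merkle_root(tampered) == root
print(tag_ok, root_ok)
```

Here the tag check succeeds for the untouched record, while the root check fails for the modified data set, which is the property the supplier and demander rely on to detect tampering.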

2.2.2 Monitoring during the event

Blockchain technology and the privacy computing technology stack are powerful means of ensuring computing environment security, algorithm security, and data privacy during data circulation and use. They are also feasible technologies for monitoring the credibility of transaction matching.

For example, for monitoring the credibility of transaction matching, Tan WT et al. proposed a blockchain-based distributed transaction mechanism that incorporates credit management: a user may participate in distributed transactions only if their credit score is not below a threshold. Gupta P et al. proposed TrailChain, a new blockchain framework that uses watermarks to generate trusted transaction trails and enables the tracing and tracking of data ownership across multiple decentralized markets.

For securing the computing environment, a trusted execution environment (TEE) can isolate sensitive computation from other processes (including the operating system, BIOS, and hypervisor). It relies on hardware such as dedicated chip features, working with upper-layer software, to protect data while still sharing computing power with the system's normal operating environment. Representative products include Intel SGX and ARM TrustZone. Based on trusted execution environments and blockchain technology, Dai WQ et al. built a new data transaction ecosystem in which neither the data broker nor the demand side can access the supplier's original data and each obtains only the analysis results it needs. The secure execution environment protects the data processing, the source data, and the analysis results.

Rich research results have also been obtained in algorithm security and privacy protection. For example, Thapa C et al. proposed that blockchains can use technologies such as homomorphic encryption and zero-knowledge proofs to encrypt private data and thereby protect it. To protect the privacy of credit data in supply chain finance credit systems, Zheng KN et al. proposed a blockchain-based access control and management model for shared transaction information, which uses the consensus mechanism to achieve access control and traceability management of the shared data chain. Zhang JL et al. proposed FedMEC, a federated learning framework based on mobile edge computing that integrates model partitioning and differential privacy to prevent privacy leakage from local model parameters. Zheng Tingyi et al. proposed a technology ecosystem framework consisting of three parts (a supervision system, core technologies, and model innovation) to ensure the security of platform data and algorithms.
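To make the differential privacy idea mentioned above concrete, the following Python sketch applies the generic Laplace mechanism to a clipped vector of local model parameters. It is not the FedMEC algorithm; the function names, the clipping bound, and the per-coordinate privacy budget `epsilon` are assumptions made for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(params: list[float], clip: float, epsilon: float,
              rng: random.Random) -> list[float]:
    """Clip each parameter to [-clip, clip], then add Laplace noise.

    After clipping, one coordinate can change by at most 2 * clip, so the
    noise scale is (2 * clip) / epsilon. Here epsilon is a per-coordinate
    budget, which is an assumption of this sketch.
    """
    scale = 2.0 * clip / epsilon
    clipped = [max(-clip, min(clip, p)) for p in params]
    return [p + laplace_noise(scale, rng) for p in clipped]


rng = random.Random(0)  # fixed seed only to make the demo repeatable
local_params = [0.8, -2.5, 0.1]
print(privatize(local_params, clip=1.0, epsilon=1.0, rng=rng))
```

A smaller `epsilon` yields larger noise and stronger protection of the local parameters, at the cost of model accuracy; choosing that trade-off is a policy decision as much as a technical one.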

2.2.3 Post Audit

Post-event audit mainly includes transaction credit audit and transaction security audit. The transaction credit audit determines whether infringements or violations have occurred, assigns responsibility, and establishes an effective credit evaluation mechanism. For example, Tan WT et al. exploited the traceability and non-repudiation of blockchain and proposed that participants pay deposits into a smart contract, which serve as a penalty for defaulters and as compensation for the parties harmed by a default. The smart contract settles the transaction according to how the contract was performed and automatically updates each participant's credit score based on their performance in that transaction. Tang HY et al. used a side contract mechanism to establish a blockchain-based arbitration mechanism for transaction disputes, which can both resolve contract disputes between the two parties and verify and trace the integrity and value of the transaction data. Dellarocas C proposed a reputation mechanism design that encourages the supply side to minimize opportunism and prevents transactions in data products that are of no value to the demand side.
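The deposit and credit-score logic described above can be sketched as a plain Python simulation. This is not smart contract code and not the mechanism of Tan WT et al.; the threshold, the deposit amount, the score adjustments, and all names are hypothetical, and balance sufficiency checks are omitted for brevity.

```python
from dataclasses import dataclass

CREDIT_THRESHOLD = 50   # assumed minimum score for admission
DEPOSIT = 10.0          # assumed deposit locked by each party


@dataclass
class Participant:
    name: str
    balance: float
    credit: int = 60    # hypothetical starting score


def settle(supplier: Participant, demander: Participant, price: float,
           supplier_ok: bool, demander_ok: bool) -> bool:
    """Simulate one escrowed trade; return False if admission is refused."""
    if min(supplier.credit, demander.credit) < CREDIT_THRESHOLD:
        return False
    for p in (supplier, demander):          # both parties lock a deposit
        p.balance -= DEPOSIT
    if supplier_ok and demander_ok:         # normal settlement
        demander.balance -= price
        supplier.balance += price
        for p in (supplier, demander):
            p.balance += DEPOSIT
            p.credit += 1
    elif supplier_ok != demander_ok:        # exactly one party defaulted
        honest, defaulter = ((supplier, demander) if supplier_ok
                             else (demander, supplier))
        honest.balance += 2 * DEPOSIT       # own deposit plus the forfeited one
        honest.credit += 1
        defaulter.credit -= 5
    else:                                   # both defaulted: refund, lower scores
        for p in (supplier, demander):
            p.balance += DEPOSIT
            p.credit -= 5
    return True


s = Participant("supplier", 100.0)
d = Participant("demander", 100.0)
settle(s, d, price=20.0, supplier_ok=True, demander_ok=False)
print(s, d)
```

In this run the demander defaults, so the supplier receives both deposits and gains credit while the demander loses credit; repeated defaults would eventually push the demander below the admission threshold.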

Blockchain technology not only secures the record of each transaction but also facilitates transaction security audits. For example, Fan KF et al. designed a blockchain-based cloud data auditing scheme and proposed a decentralized auditing framework that eliminates the dependence on third-party auditors, ensures the stability, security, and traceability of data auditing, and better assists users in verifying the integrity of cloud data.

This article briefly summarizes the security risks and main countermeasures of domestic and foreign data element circulation transactions, as shown in Table 1.

Table 1 Security risks and main countermeasures in domestic and foreign data element circulation transactions

3 Security risk response strategies for the circulation and use of data elements that are coordinated with management and technology

The continuous advancement and improvement of the basic data system has raised the requirements for the safe, credible, and compliant circulation and use of data elements. Responding to the security risks of data circulation and use, and building a compliant and trustworthy system covering the whole process, is three parts technology and seven parts management. Management is founded on systems and regulations, safeguarded by processes, feedback, and supervision, and centered on people. Technology is a means of implementing systems and regulations efficiently and a supporting system or tool for effective management, and its effective use depends on the completeness of management rules and regulations. On this basis, this section first designs security risk response strategies for the three stages of data element circulation and use (before, during, and after the event). It then proposes a plan for building a safe and trusted system for the circulation and use of data elements in which management and technology are coordinated. The plan clarifies the integrated security guarantee mechanism for the whole process of supply, circulation, and application, as well as the security responsibilities of each participant, in order to promote the safe and orderly circulation and use of data elements.

3.1 Full-link security risk response strategy for the circulation and use of data elements: before, during, and after the event

Figure 2 shows the full-link security risk response strategy framework proposed in this paper for the circulation and use of data elements before, during, and after the event. Viewed across the whole process of data element circulation and use, the pre-event review system, the in-process monitoring system, and the post-event audit system together regulate the safe and orderly circulation and use of data.

Figure 2 Full-link (before, during, and after the event) security risk response strategy framework for the circulation and use of data elements

3.1.1 Pre-examination system based on human-machine collaboration

The purpose of the pre-examination is to ensure, at the transaction application stage, that the participating subjects, the data, and the contracts are all credible. The review of transaction subjects examines the security risks and compliance of the qualifications of those who circulate and use data. It establishes the account registration process for transaction subjects and a verification scheme that combines machine review and manual review of the authenticity of registration information, so that information about the trading platform and about the institutions or individuals participating in circulation transactions can be traced, making the transaction subjects credible. The review of transaction data and algorithms checks the security risks of the collected and stored data elements, including data integrity, authenticity, tradability, the legitimacy of data acquisition channels, and whether personal information has been de-identified, so as to ensure that the traded data is secure and legally compliant. The review of transaction contracts examines the usage scenarios, data quality, data value, pricing requirements, and data update capabilities of data elements. It requires formulating catalogs of data prohibited from trading for different application scenarios, establishing transaction standards for data products, and building a standardized listing process and compliance review process for transaction contracts, so as to make the transaction contracts credible.

3.1.2 In-process monitoring system based on intelligent monitoring management

The purpose of in-process monitoring is to ensure the safety and reliability of data element circulation transactions during the negotiation and implementation phases. It covers the monitoring and management of transaction subjects, contract negotiation, algorithm behavior, and order fulfillment.

Transaction subject monitoring focuses on identity management. A mechanism for verifying subject identities and contracts, based on intelligent identification technology, ensures the traceability of contract information such as the signatures of both parties, the hash value of the contract content, and private key management information, so that data users remain controllable. Contract negotiation monitoring follows the principles of fair trade and of maximizing the efficiency of supply and demand matching. Automatic matching technology and privacy-preserving smart contract technology ensure that the contracts concluded by the two parties meet market expectations and comply with relevant national policies and regulations. Algorithm behavior monitoring builds an evaluation system for models and algorithms and designs an algorithm behavior monitoring plan. This ensures that data import, data preprocessing, model training, and result release are standardized and credible, that the use process can be traced, and that resource consumption can be measured, so that the purpose and volume of data use are consistent with the contract and the security risks of data processing and use are controllable. Order fulfillment monitoring establishes a filing system for data transmission interfaces and dynamically monitors the performance of transaction subjects, including sensing and monitoring data flows, verifying data integrity and consistency, and auditing capital flows. This ensures that orders are fulfilled in full and that the information generated during contract performance, such as order information, supply and demand party information, trading platform information, and delivery and settlement information, can be traced.
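One simple way to picture the requirement that the purpose and volume of data use stay consistent with the contract is a metering component that checks every request against the agreed terms and logs the decision. The Python sketch below is only an illustration; the class names, the purpose strings, and the call quota are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UsageContract:
    allowed_purposes: frozenset   # purposes agreed in the contract
    max_calls: int                # agreed usage volume


@dataclass
class UsageMonitor:
    contract: UsageContract
    calls: int = 0
    log: list = field(default_factory=list)   # trace of every decision

    def request(self, purpose: str) -> bool:
        """Allow a data access only if purpose and volume match the contract."""
        allowed = (purpose in self.contract.allowed_purposes
                   and self.calls < self.contract.max_calls)
        if allowed:
            self.calls += 1
        self.log.append((purpose, allowed))
        return allowed


contract = UsageContract(frozenset({"credit-scoring"}), max_calls=2)
monitor = UsageMonitor(contract)
for purpose in ["credit-scoring", "marketing", "credit-scoring", "credit-scoring"]:
    monitor.request(purpose)
print(monitor.log)
```

The resulting log records both the permitted and the refused requests, which is the kind of trace a later audit could consume.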

3.1.3 Post-event audit system based on blockchain storage

The post-event audit aims to guard against the security risks that data may face after a transaction is completed, focusing on three aspects: preventing data abuse, preventing data infringement, and preventing dishonest behavior by subjects. To prevent data abuse, a transaction audit mechanism is designed based on information stored on the chain; an intelligent system for computing transaction audit and verification indicators is built from the contract and transaction information stored on the chain after the transaction; and a scheme is designed for monitoring and identifying on-chain resource abuse. A data destruction review mechanism is formulated to eliminate the risk of data products being resold, and to ensure that audit information such as transaction volume, abnormal transaction users, abnormal contract deployment, and the data destruction process can be traced. To prevent data infringement, a process for proving infringement in data transactions is formulated, and an inspection system combining on-chain and off-chain searches for infringement clues is built, ensuring that the sources of infringement information can be traced. To prevent dishonest behavior by subjects, a credit management mechanism is established for storing information on the chain after a data transaction ends; a credit evaluation index system for data market entities is built; and an on-chain evidence storage scheme is designed for the credit evaluation of market entities' transaction behavior, ensuring that the credit rating information of data market entities such as data suppliers, data demanders, and trading platforms can be traced.
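The traceability that on-chain storage offers to the post-event audit rests on each record committing to the hash of its predecessor. The following Python sketch shows that idea with a minimal hash-chained audit log. It omits consensus, signatures, and distribution, so it is not a blockchain, and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder hash that precedes the first record


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash a record together with the hash of the previous record."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> None:
    """Append an audit record linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": _digest(prev_hash, payload)})


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or removed record breaks the chain."""
    prev_hash = GENESIS
    for block in chain:
        if (block["prev"] != prev_hash
                or block["hash"] != _digest(prev_hash, block["payload"])):
            return False
        prev_hash = block["hash"]
    return True


audit_log: list = []
append_record(audit_log, {"event": "contract_signed", "order": "A-001"})
append_record(audit_log, {"event": "data_delivered", "order": "A-001"})
append_record(audit_log, {"event": "data_destroyed", "order": "A-001"})
intact = verify_chain(audit_log)

audit_log[1]["payload"]["event"] = "data_resold"   # simulated tampering
still_intact = verify_chain(audit_log)
print(intact, still_intact)
```

Because an auditor can recompute every link, a record altered after the fact is detected even when the stored hashes themselves are left untouched.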

3.2 A safe and trusted system for the circulation and use of data elements in which management and technology are coordinated

Supporting the safe and orderly circulation and use of data elements requires a compliant and trustworthy system covering the whole process. Building it is a complex systems engineering effort, and the implementation path depends on the mutual reinforcement and combined effect of management systems and technical support. Figure 3 shows the compliant and trustworthy system, and its implementation path, proposed in this paper for the circulation and use of data elements in which management and technology are coordinated. In Figure 3, ① denotes participant registration in the transaction application stage, ② the transaction matching stage, ③ the transaction implementation stage, and ④ the transaction closing stage, each with its corresponding management mechanisms and technical support.

3.2.1 A trustworthy and compliant system for the circulation and use of data elements throughout the entire process in which the management system and technical support are coordinated

The whole-process compliant and trustworthy system for the circulation and use of data elements, in which the management system and technical support are coordinated, comprises a compliant and trustworthy institutional system, a compliant and trustworthy technology system, and a scheme for coordinating the management system with its supporting technologies. The institutional system includes a pre-event review system, an in-process monitoring system, and a post-event audit system. The technology system includes data transaction system technology, blockchain system technology, federated learning across privacy computing platforms, and trusted execution environment technology. The markers ① to ④ in Figure 3 show how the management system and technical support are coordinated at the different stages of data element circulation and use.

Fig. 3 The compliance and trustworthy system and realization path of the circulation and use of data elements in which management and technology are coordinated

In the pre-event review stage of data circulation and use, review systems are formulated for transaction subjects, transaction data, and transaction contracts to address security risks related to participants and data collection. Technically, a "machine review + manual verification" approach ensures that the review process is compliant and credible. Standard information on qualifications, data quality, and transaction items, such as corporate legal person information, business licenses, data scale and magnitude, and the list of data prohibited from trading, is reviewed automatically by machine learning algorithms and spot-checked manually. Highly subjective data attributes, such as transaction purpose and data source, are verified manually.

In the in-process monitoring stage, security guarantee systems are formulated for the platform system and for the software and hardware, data, cloud, network, terminal, and other links involved in circulation and use, and monitoring and management systems are built for transaction subjects, algorithm behavior, and order fulfillment. Technically, a security system based on intelligent algorithms is designed. It includes participant identity authentication based on intelligent identification technology, to ensure that participants are credible; data permission management based on identification technology, to keep access to delivered data controllable; automatic sensing technology for detecting abnormal data use, to monitor whether data is processed and used in compliance; and blockchain-based evidence storage for process information, to ensure that the whole process of data circulation and use can be traced.

In the post-event audit stage, audit systems are formulated for data abuse, data infringement, and dishonest behavior by subjects, to ensure that the whole process of data circulation and use is compliant, that disputes can be adjudicated, and that rights and interests are protected. Technically, a re-audit system based on information stored on the blockchain conducts security audits of the whole process of data circulation and use; a data tracking system based on data identification and association technologies inspects and collects evidence of infringements such as secondary circulation and resale of data; and the credit evaluation system for transaction subjects is integrated with blockchain traceability technology to build a comprehensive data credit evaluation service that promotes the fair and credible development of the data circulation market.
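The "machine review + manual verification" split can be pictured as a triage step that checks standard fields automatically, sends subjective fields to human reviewers, and samples some machine results for manual spot checks. The Python sketch below uses fixed rules in place of the machine learning models mentioned in the text, and all field names, categories, and the sampling rate are hypothetical.

```python
import random

SUBJECTIVE_FIELDS = ["transaction_purpose", "data_source"]  # always manual


def review(application: dict, prohibited: set, spot_check_rate: float,
           rng: random.Random) -> dict:
    """Triage one listing application into machine and manual review."""
    # Machine review of standard, objectively checkable information.
    machine_pass = (
        bool(application.get("business_license"))
        and application.get("data_category") not in prohibited
        and application.get("record_count", 0) > 0
    )
    manual_queue = []
    if machine_pass:
        manual_queue.extend(SUBJECTIVE_FIELDS)
        # Random manual spot check of the machine decision.
        if rng.random() < spot_check_rate:
            manual_queue.append("spot_check_machine_result")
    return {"machine_pass": machine_pass, "manual_queue": manual_queue}


rng = random.Random(42)  # fixed seed only to make the demo repeatable
application = {"business_license": "L-123", "data_category": "logistics",
               "record_count": 10_000}
print(review(application, prohibited={"biometric"}, spot_check_rate=0.1, rng=rng))
```

Applications that fail the machine checks are rejected before any human effort is spent, while every application that passes still receives manual verification of its subjective attributes.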

3.2.2 Construction plan for the whole process compliance and trustworthy system of data element circulation and use

The whole-process compliant and credible institutional system for the circulation and use of data elements has three levels: a macro-level system that guides the integrated implementation of data element circulation and use across the country, a meso-level system that guides local governments in implementing it in their regions, and a micro-level system under which trading institutions carry it out. For the national and local macro-level and meso-level systems, a "top-down" approach is adopted to build a compliant and credible basic institutional system for the whole process of data element circulation and transactions. For the meso-level and micro-level systems of local governments and trading institutions, a "bottom-up" approach is adopted to build a compliant and credible operating system for the whole process. To guarantee implementation, training policies for the whole-process compliant and credible system should be formulated, and policies for safeguarding and supervising implementation should be put in place, so that the system is implemented effectively.

The whole-process compliant and trusted technology system for the circulation of data elements includes both the national integrated infrastructure that supports the circulation and transaction of data elements and the infrastructure with which individual data trading institutions support credible, controllable, and measurable circulation transactions. For the national integrated infrastructure, building on national infrastructure strategies such as "East Data, West Computing", the construction requirements for safe and credible infrastructure and circulation environments, such as national integrated data centers, computing power centers, algorithm centers, and security centers, should be clarified, and corresponding construction plans should be put forward to provide a safe and credible circulation environment, common public services, and green and efficient computing power for the circulation and use of data elements. For the infrastructure of data trading institutions, a privacy-preserving collaborative computing platform with functions such as management and control should be built, together with a cross-chain collaborative trading platform designed for mutual trust between transaction subjects, interconnection of data registration, and interoperability of dishonesty lists, providing safe and credible technical guarantees for the circulation and use of data elements. To guarantee the construction and interconnection of safe and trustworthy technology, it is recommended that the state launch key engineering projects and special action plans for relevant technical research and basic theoretical exploration. Guided by these projects and plans, a mechanism for coordinated investment and construction by the state, local governments, and trading institutions, along with a mechanism for interconnecting the various infrastructures, should be established to create a safe, credible, intensive, and efficient national integrated environment for the circulation and use of data elements.

4 Summary and Outlook

In recent years, the cultivation of the data element market and the security risk management of the circulation and use of data elements have received widespread attention. Starting from the whole process of circulation and use, this paper systematically analyzes the security risks from three perspectives: the business life cycle, the data life cycle, and the circulation and use environment. On the basis of reviewing and summarizing domestic and foreign systems and norms, theories, and technologies related to the circulation and use of data elements, it presents a full-link security risk response strategy covering the stages before, during, and after the event, and proposes a safe and trusted system, coordinated between management and technology, together with its implementation path. This provides a reference for realizing credible and controllable data circulation transactions in which "data sources can be confirmed, the scope of use can be defined, the circulation process can be traced, and security risks can be prevented", and promotes the safe, orderly, stable, and sustainable development of the data market.

As the development of the digital economy enters a new era, the research on security risk management strategies for the circulation and use of data elements in the future can continue to be carried out from the following two aspects.

● Security risk management strategies for the circulation and use mechanisms of data elements in different application scenarios. Data element products can be divided into raw data, desensitized data, modeled data, AI data services, and so on, which originate from or serve different application scenarios such as government, business, finance, medical care, the environment, and individuals. The rights involved fall into three types: data resource holding rights, data processing and use rights, and data product operation rights. The circulation and use of data elements refers to the orderly flow of these data rights among different transaction subjects. Different data products, application scenarios, and exchanges of rights place different requirements on security risk management. For example, transferring data resource holding rights involves transferring dominance or control over the data, so the whole transaction process has the strictest security risk requirements and must be backed by sound legal protection and technical support. In the future, more attention should be paid to security risk management strategies for the circulation and use mechanisms of data elements in different application scenarios. This will help refine the applicability, operability, and scalability of the security risk response strategies and further improve the security governance system for the circulation and use of data elements.

● A security risk management strategy for a secure, credible, intensive, and efficient unified national market for data elements. Current security risk analysis and management for the circulation and use of data elements mainly targets data trading platforms and exchanges, and has not yet been raised to the level of the unified national data element market. As the basic data system continues to advance and improve, increasing attention is being paid to building a national integrated framework for data element circulation transactions, together with its supporting technology system, in order to fully release the potential of data elements. Within this, key breakthroughs are needed in research on the construction requirements of national integrated, safe, and trustworthy infrastructure, on hyperchain solutions based on private chains, consortium chains, and public chains, and on safe and trustworthy technical solutions that span privacy computing platforms.

About the Author

Liu Yezheng (1965-), male, Ph.D., professor of School of Management, Hefei University of Technology. His main research directions are e-commerce and cyberspace management, big data development and application.

Zong Lanfang (1998-), female, a master student at the School of Management, Hefei University of Technology. Her main research directions are personalized recommendation systems, unfair pricing algorithms, etc.

Jin Dou (1997-), female, is a master student at the School of Management, Hefei University of Technology. Her main research directions are e-commerce, car-cargo matching, etc.

Yuan Kun (1991-), male, Ph.D., lecturer at School of Management, Hefei University of Technology. His main research directions are business intelligence and big data analysis, social network analysis and personalized recommendation system, etc.
