Xinghe Case | China Telecom x Impulse Online: Application Practice of Privacy Computing Based on the Intelligent Computing Center

▏Summary

China Telecom is one of China's three major telecom operators. In response to the new data center model called for by the national "East Data, West Computing" project, China Telecom introduced a privacy computing platform that supports data rights confirmation and tracking internally and data sharing and trading externally. The platform revitalizes the data and computing power resources that China Telecom has distributed across the country. Through data openness, computing power output, and China Telecom's experience with AI algorithm models, it empowers China Telecom's provincial branches and external government and enterprise customers.

▏Key findings

• Privacy computing technology can establish security and trust among all data collaborators and upgrade the intelligent computing center into a trusted computing center, forming a new type of trusted information infrastructure that integrates secure storage, trusted computing, high performance, and large scale;

• When choosing a privacy computing technology route, China Telecom chose Trusted Execution Environment (TEE) technology, which combines software and hardware and is compatible with both CPUs and GPUs, to meet the high-performance requirements of large-scale data training and inference. At the same time, the foundation of trust is built on a domestic technology route, in line with the trend toward domestic substitution;

• Provincial branches only need to deploy a heterogeneous accelerated privacy computing all-in-one machine, pre-installed with the privacy computing core architecture and the data sharing and trading platform. The machine connects automatically to China Telecom's blockchain infrastructure and its data circulation and computing power scheduling network, achieving rapid deployment and seamless expansion.

Sharing Expert: Zhou Yueqian, General Manager of Impulse Online Products
Author: Sand Dune Community Analyst Team

01
Case Enterprise

China Telecom Group Co., Ltd. (hereinafter referred to as "China Telecom") was established in 1995. It is a large state-owned backbone telecommunications enterprise and has been listed among the world's top 500 companies for many years. By the end of 2021, it had 107 million fixed-line telephone users, 372 million mobile phone users, and 170 million broadband users. The group's total assets were 907.8 billion yuan, and it had more than 400,000 employees.

02
Project Background

In March 2022, the plan report reviewed at the Fifth Session of the Thirteenth National People's Congress proposed implementing the "East Data, West Computing" project: guiding the computing-intensive demand of the east to the west in an orderly manner, letting data elements flow across regions, opening up the main arteries for data, and weaving a national computing power network. As one of the three major operators, and as a builder and operator of important network and computing power infrastructure, China Telecom naturally takes on the tasks that "East Data, West Computing" entails.

For China Telecom itself, "East Data, West Computing" translates into two major needs, one internal and one external:

Internal: Unified scheduling and management of computing power and data across provincial branches through a single platform. Computing power and data are unevenly distributed among China Telecom's provincial branches. Western provinces and regions such as Inner Mongolia, Qinghai, and Ningxia have lower energy and land costs and more computing power, while eastern provinces and regions have busier operations and more business data. In response, China Telecom launched its digital computing cloud network strategy, which aims to build a national network of computing power and data and to schedule both as a whole.

External: The platform must serve the government and enterprise customer scenarios of the provincial branches while also strengthening data security and compliance protection when those branches cooperate with external government and enterprise partners. China Telecom has such partners in every province. While using China Telecom's computing power and data, the partners have their own data security demands, and China Telecom has its own compliance considerations when providing data and computing power externally. With the introduction and implementation of the "Data Security Law" and the "Personal Information Protection Law", China Telecom's requirements for its provincial branches across the country are becoming increasingly stringent.

Based on these two needs, China Telecom hopes to introduce a privacy computing platform that establishes security and trust among all data collaborators and upgrades the intelligent computing center into a trusted computing center, forming a new type of trusted information infrastructure that integrates secure storage, trusted computing, high performance, and large scale.

The privacy computing platform helps provincial branches keep data, algorithms, and modeling separated from one another in AI scenarios. There are two types of scenarios:

First, the modeling process. The intelligent computing center holds data provided by China Telecom and by government and enterprise customers, a large pool of heterogeneous chip resources, and AI algorithms. The model trainer uses these resources for model training but does not want its output label data, user identity data, and similar information to be retained on the platform. China Telecom, for its part, does not want the algorithms and sample data hosted on the platform to be taken away by the model trainer. In the end, only the modeling results leave the intelligent computing center.

Second, the prediction process. After government and enterprise customers complete model training in the intelligent computing center, they want to host the model on the center's platform and expose it as an API service for their own business or users. Calls to that API carry real business data, which is ultimately sent to the center's acceleration chips for computation before a label is returned. Throughout this process, the data to be predicted must never be written to disk, so that the sample data, the model, and the computing power provider remain separated during prediction.
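The prediction flow described above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: `HostedModel`, `predict_in_memory`, the toy linear model, and its weights are all hypothetical, and a real TEE enforces the isolation in hardware rather than by coding convention. The sketch only shows the shape of the contract: request data lives in memory for the duration of the call, and the caller receives nothing but the label.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class HostedModel:
    """Hypothetical stand-in for a model hosted in the intelligent computing center."""
    weights: Sequence[float]
    bias: float
    threshold: float = 0.0


def predict_in_memory(model: HostedModel, features: Sequence[float]) -> int:
    """Score one request entirely in memory and return only the label.

    The function neither logs nor writes the features anywhere; they go out
    of scope when the call returns.
    """
    if len(features) != len(model.weights):
        raise ValueError("feature count does not match the model")
    score = sum(w * x for w, x in zip(model.weights, features)) + model.bias
    return 1 if score > model.threshold else 0


# Hypothetical model and request, for illustration only.
model = HostedModel(weights=[0.5, -1.0, 2.0], bias=-0.5)
label = predict_in_memory(model, [2.0, 0.5, 0.25])
```

The caller never sees the weights, and the platform operator never receives a stored copy of the request, which is the separation the scenario calls for.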

03
Solution

In the process of upgrading the intelligent computing center, China Telecom cooperated with Impulse Online and Zhongke Haiguang to jointly promote the application of privacy computing and blockchain technology based on domestic chips.

Impulse Online is a technology innovation enterprise focused on data circulation and privacy computing solutions. It is certified as a high-tech enterprise and holds a leading position in privacy computing all-in-one machines, trusted execution environments, and blockchain-enhanced privacy computing. It is also the first enterprise in the privacy computing industry to embrace the Xinchuang ecosystem and complete full adaptation.

The cooperation between China Telecom, Impulse Online and Zhongke Haiguang can be traced back to 2020, and has gone through four stages of development:

Phase 1: Core module development. Based on Zhongke Haiguang's self-developed CPU chip and its self-developed TEE technology, CSV, Impulse Online independently developed a data interconnection and privacy computing platform running on domestic chips, and in June 2021 jointly launched an integrated software-hardware privacy computing all-in-one machine with Zhongke Haiguang.

Phase 2: Digital Chain Network product. Based on the domestic privacy computing all-in-one machine and blockchain infrastructure, the Telecommunications Research Institute, Impulse Online, and Haiguang Information jointly developed the Digital Chain Network product, which supports data rights confirmation, pricing, trading, and privacy computing. It has been piloted and applied in China Telecom's provincial branches.

Phase 3: Heterogeneous acceleration. In real production use, privacy computing technology hits bottlenecks in application scenarios and in performance at scale, and CPU-only TEE technology cannot meet the needs of heterogeneous scenarios. Working with Haiguang Information's heterogeneous acceleration chip, the DCU, Impulse Online and Haiguang Information jointly developed the TEE driver and application technology for that chip. This connects CSV, which previously applied only to the CPU, directly to the Haiguang DCU, extending the secure and trusted computing environment that the TEE originally provided for the CPU and memory to the GPU. GPU resources can then accelerate privacy computing inside the TEE. The result was a domestically produced heterogeneous accelerated privacy computing all-in-one machine.

Phase 4: Trusted computing application. The Digital Chain Network product was deployed at Beijing Telecom. Combined with the advanced computing power and accumulated algorithms of the Beijing Telecom Intelligent Computing Center, it launched an open AI application platform based on privacy computing, helping Beijing Telecom operate its data assets and AI capabilities openly and empowering Beijing Telecom's government and enterprise customers to carry out intelligent transformation.

Detailed standards for the Trusted Execution Environment (TEE) were first formulated by CPU chip manufacturers, so they do not cover the GPU resources of the intelligent computing center.

In March 2022, Nvidia took the lead by launching the H100, the first GPU chip to support privacy computing. It provides encrypted PCIe and NVLink channels so that communication between the CPU and GPU is fully encrypted and the information is shielded while ciphertext travels between them, securing data in transit. A built-in custom root of trust ensures that each GPU chip is independent and cannot be tampered with undetected: once a chip is tampered with, its root of trust is no longer usable, which secures the hardware itself. In addition, the chip supports measurement-based trusted boot and GPU remote attestation. This means that the data provider can measure the algorithm running in the CPU and can remotely verify whether the H100 chip is compliant, whether it has been tampered with, and whether it has a built-in root of trust. Because the design builds on the CUDA ecosystem, user-developed deep learning and machine learning algorithms can run in the GPU's trusted execution environment without any changes.
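The measurement and remote attestation idea can be illustrated with a small sketch. It is a simplification under stated assumptions, not Nvidia's protocol: real attestation reports are signed with an asymmetric key rooted in the chip, whereas here an HMAC with a shared demo key stands in for that signature. All names (`measure`, `make_report`, `verify_report`, `DEVICE_KEY`) and the sample byte strings are hypothetical.

```python
import hashlib
import hmac

# Assumption: a shared demo key replaces the chip's hardware-rooted signing key.
DEVICE_KEY = b"demo-device-key"


def measure(firmware: bytes, algorithm_code: bytes) -> str:
    """Hash what was loaded, in load order, into a single measurement."""
    digest = hashlib.sha256()
    digest.update(hashlib.sha256(firmware).digest())
    digest.update(hashlib.sha256(algorithm_code).digest())
    return digest.hexdigest()


def make_report(measurement: str, nonce: bytes) -> bytes:
    """Device side: bind the measurement to the verifier's fresh nonce."""
    return hmac.new(DEVICE_KEY, measurement.encode() + nonce, hashlib.sha256).digest()


def verify_report(report: bytes, expected_measurement: str, nonce: bytes) -> bool:
    """Data provider side: accept only the expected measurement and nonce."""
    expected = hmac.new(
        DEVICE_KEY, expected_measurement.encode() + nonce, hashlib.sha256
    ).digest()
    return hmac.compare_digest(report, expected)


firmware = b"gpu-firmware-v1"
algorithm = b"def train(): ..."
nonce = b"fresh-random-nonce"

expected = measure(firmware, algorithm)
good_report = make_report(measure(firmware, algorithm), nonce)
tampered_report = make_report(measure(firmware, b"def train(): leak()"), nonce)

accepted = verify_report(good_report, expected, nonce)      # untouched code
rejected = verify_report(tampered_report, expected, nonce)  # modified code
```

The nonce keeps an old report from being replayed, and any change to the firmware or the algorithm changes the measurement, so the data provider can refuse to release data to an environment it does not recognize.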

Following the TEE-based heterogeneous acceleration solutions introduced by international manufacturers, Impulse Online and Zhongke Haiguang launched the first domestic GPU chip solution supporting privacy computing in June 2022. Through direct TEE communication between the Haiguang DCU chip and the Haiguang CPU chip, the CPU and GPU jointly establish a complete trusted execution environment. That environment uses the TEE in the CPU as its core to receive external algorithms, data, and models, and uses the GPU's computing resources over an encrypted channel for inference and training. The final result is output externally through the trusted execution environment in the CPU.

This solution has the following advantages:

First, throughout machine learning training and inference, the data never needs to be written to disk, which protects it from privacy leakage;

Second, because the Haiguang GPU is compatible with CUDA, AI applications and privacy computing algorithms built on deep learning frameworks such as TensorFlow and PyTorch do not need to be modified;

Third, combined with heterogeneous hardware accelerator cards, it supports heterogeneous AI acceleration;

Fourth, full domestic substitution. At the CPU level, the Haiguang CPU replaces the Intel CPU, and at the GPU level, the Haiguang GPU replaces the Nvidia GPU, achieving end-to-end domestic production of software and hardware.
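The data flow of the combined CPU and GPU trusted execution environment described above can be sketched in a few lines. This is a toy model under loud assumptions, not the Haiguang or Impulse Online design: `keystream_xor` is a deliberately simple SHA-256 counter-mode cipher used only so the example runs on the standard library (a real channel would use an authenticated cipher), the "GPU" is an ordinary function, and every name is hypothetical.

```python
import hashlib
import secrets
from typing import Tuple


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only, not secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(d ^ s for d, s in zip(data, stream))


def gpu_compute(channel_key: bytes, nonce: bytes, encrypted_batch: bytes) -> Tuple[bytes, bytes]:
    """Stand-in for the accelerator: decrypt, compute, encrypt the result."""
    batch = keystream_xor(channel_key, nonce, encrypted_batch)
    result = sum(batch).to_bytes(8, "big")  # placeholder for training or inference
    out_nonce = secrets.token_bytes(12)
    return out_nonce, keystream_xor(channel_key, out_nonce, result)


def cpu_tee_run(plain_batch: bytes) -> int:
    """Stand-in for the CPU TEE: the only place inputs and outputs are in the clear."""
    channel_key = secrets.token_bytes(32)  # assumed to be agreed when the channel is set up
    nonce = secrets.token_bytes(12)
    encrypted_batch = keystream_xor(channel_key, nonce, plain_batch)
    out_nonce, encrypted_result = gpu_compute(channel_key, nonce, encrypted_batch)
    result = keystream_xor(channel_key, out_nonce, encrypted_result)
    return int.from_bytes(result, "big")


total = cpu_tee_run(bytes([1, 2, 3, 4]))
```

Only ciphertext crosses the boundary between the two functions, which mirrors the claim that the CPU TEE is the single entry and exit point while the GPU supplies the compute.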

When choosing a privacy computing technology route, China Telecom considered software-based secure multi-party computation and federated learning as well as the hardware-based trusted execution environment. It ultimately chose the trusted execution environment, for the following reasons:

First, the trusted execution environment is friendlier to machine learning and easier to extend later. With either secure multi-party computation or federated learning, the algorithm's development language, code, and framework have to be reworked. For example, secure multi-party computation requires rebuilding the algorithm from the operators that it provides, and federated learning requires rewriting the original machine learning algorithm on a federated learning framework. Because the trusted execution environment is itself a black box, it does not need to intervene in the algorithm: existing machine learning and deep learning algorithms can run in it directly, without modification.

Second, the trusted execution environment can support large-scale data at the 100-million level and above. Beijing Telecom hopes to serve external government and enterprise customers through the intelligent computing center, with each node supporting data volumes in the tens or even hundreds of millions. The performance loss of the trusted execution environment can be held to 5% to 10%, so little computing performance is sacrificed. Secure multi-party computation and federated learning incur comparatively large performance losses.

Third, security trust can be anchored in the chip manufacturer. With secure multi-party computation and federated learning, trust rests at the software level or the encryption algorithm level, whereas with the trusted execution environment it rests with the chip manufacturer. On the one hand, bringing in domestic chip manufacturers shares the risk; on the other, building the foundation of trust on a domestic technology route fits the trend toward domestic substitution.
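The first factor above, that secure multi-party computation (MPC) requires algorithms to be rebuilt from its own operators, can be illustrated with additive secret sharing, one common MPC building block. The technique is chosen here for illustration only; the source does not say which protocols China Telecom evaluated. The modulus, function names, and sample values are assumptions.

```python
import secrets

PRIME = 2**61 - 1  # assumed modulus; it must exceed the largest sum to be computed


def share(value: int, parties: int) -> list:
    """Split value into additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def mpc_sum(all_shares: list) -> int:
    """Each party adds the shares it holds; only the partial sums are combined."""
    partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]
    return sum(partial_sums) % PRIME


private_values = [120, 45, 300]     # one private input per data owner
plain_result = sum(private_values)  # what unmodified code computes on raw data
shared_inputs = [share(v, parties=3) for v in private_values]
mpc_result = mpc_sum(shared_inputs)  # same answer, computed on shares only
```

Even a plain `sum` has to be restructured around shares, and a full training algorithm would need every operation rewritten this way. Inside a trusted execution environment the original `sum(private_values)` runs unchanged, which is the trade-off the first factor describes.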

At present, the privacy computing platform's application scenarios at China Telecom, both live and in rollout, are as follows:

First, financial risk control. While protecting private user data, the platform helps financial institutions train high-accuracy user financial risk scoring models that predict possible defaults and fraud, effectively reducing the bad debt rate and providing sound risk warnings.

Second, medical and pharmaceutical research. The platform promotes data cooperation between medical institutions and hospitals and advances drug development and marketing through "real-world data research". It strictly follows medical data protection norms, fully de-identifying private user data and applying privacy computing throughout the process.

Third, the dual carbon economy. Enterprise electricity consumption, energy consumption, and production and operation data are collected through smart water meters, smart electricity meters, and other Internet of Things devices and modeled on the privacy computing platform. This promotes the storage and trading of carbon footprint data in the "dual carbon economy", protects the privacy of energy consumption, production, and business data, and helps regulators and financial institutions manage and support the green economy through data circulation.

Fourth, epidemic prevention and control. Big-data-supported joint epidemic prevention and control is carried out on the basis of personal information protection. Privacy-preserving data sharing and circulation enables joint investigation and precise positioning across agencies while fully protecting residents' personal privacy.

Fifth, public security and public opinion. Integrated analysis of telecommunications business data, Internet behavior data, and social media public opinion enables effective prediction and prevention of public security incidents, with two-way privacy protection for all parties involved in public security intelligence analysis.

Sixth, inter-provincial telecommunication services. The platform interconnects the data of China Telecom's provincial branches, protects each branch's data ownership while activating its data assets, and supports the convenient development and efficient interoperation of cross-provincial telecommunications services.

As the future intelligent computing center and China Telecom's Digital Chain Network platform are rolled out across the provincial branches, each branch only needs to deploy a heterogeneous accelerated privacy computing all-in-one machine, pre-installed with the privacy computing core architecture and the data sharing and trading platform. The machine connects automatically to China Telecom's blockchain infrastructure and its data circulation and computing power scheduling network, achieving rapid deployment and seamless expansion.

04
Value and Effect

Through the Digital Chain Network platform, China Telecom achieves internal data rights confirmation and tracking and external data sharing and trading, and has built an open AI privacy computing platform in the intelligent computing center that provides privacy computing API management as SaaS.

At present, the Digital Chain Network has been piloted at China Telecom and several provincial branches. It has carried tens of thousands of internal and external data transactions and AI modeling tasks, with the total volume of data in operation exceeding tens of billions. It also supports Beijing Telecom's open data output and intelligent computing AI services, covering more than 40 AI training scenarios that involve data from tens of millions of users and dozens of AI algorithms.

Origin blog.csdn.net/impulseonline/article/details/130691003