Wending Chuang Intelligent IoT: Practice of a Cloud Native Containerized Platform

Author: sekfung, R&D engineer at Shenzhen Wending Chuang Data Technology Co., Ltd., responsible for the development, stability, and containerized cloud migration of the company's IoT terminal platform. Skilled in building distributed systems with Go and Java, with an ongoing interest in cutting-edge technologies such as distributed systems and cloud native. KubeSphere Contributor and member of the KubeSphere Community User Committee, Shenzhen Station.

Company Profile

Founded in 2006, Shenzhen Wending Chuang Data Technology Co., Ltd. is a world-leading provider of online identity authentication solutions, focusing on network identity authentication and data security. The company provides solutions for areas such as government and corporate offices, and is a partner of many state-owned commercial banks, national joint-stock banks, city commercial banks, rural commercial banks, provincial and municipal tax authorities, governments, major CA institutions, and multinational companies. It has served nearly 100 million users in total and continues to meet the differentiated needs of its customers.

The company has continued to innovate over the years. It has applied for a large number of invention patents, utility model patents, and design patents, has registered a number of computer software copyrights, and is a national high-tech enterprise. It holds commercial cryptography product model certificates, cryptography testing certificates, UnionPay certification, ISO9001:2015 quality management system certification, and ISO14001 environmental management system certification. Its products have passed CE/FCC and RoHS certification.

As one of the core members of the FIDO (Fast IDentity Online) Alliance, the company is committed to achieving a globally unified online authentication standard, and to using this technology to give people in different regions equal access to a secure online world.

Background

"Wending Chuang Intelligent IoT" is an IoT solution launched by Shenzhen Wending Chuang Data Technology Co., Ltd. The solution includes a unified IoT service platform and products such as the "cloud printer", the "payment cloud speaker", and the "receipt cloud code scanning box", all designed to protect users' data security.

As a hardware provider of B2B solutions, the company has long followed a "hardware first, software second" development model, so server-side development, deployment, and architecture design did not receive enough attention in the early days. Traditional projects were deployed on a single machine (virtual machine), and manual packaging and uploading was not only time-consuming and labor-intensive but also error-prone, which led to service outages.

Before embracing K8s, we also tried docker-compose. Compared with manual packaging and deployment, docker-compose did bring us some convenience:

  1. All-in-one: it provides one-click software deployment without a cumbersome deployment process;
  2. It isolates differences between host systems;
  3. It reduces the manual steps operations staff perform for each version iteration, and with them the chance of operational errors.

The launch of new products and solutions for the IoT industry brought new challenges to the stability and reliability of our services. The existing development model could not keep up with the pace of business iteration, so we urgently needed to break out of the existing framework and establish a new software iteration process.

Platform Selection

When we decided to embrace cloud native, we surveyed the container management platforms on the market and found that Rancher has many users abroad, while KubeSphere leads in China. We used several criteria to select a container management platform:

  1. Ecosystem: a complete ecosystem is very important for an open source project, and good supporting tools improve both user experience and maintainability.
  2. Community activity: are issues in the official repository and questions in the Q&A community answered promptly, and is code committed actively?
  3. Backing by a company or foundation: is the project supported by a commercial company or an open source foundation? A personal project may stop being maintained later, which puts its users at risk.
  4. Technology stack: does the technology stack match the team's, so that the team is able to troubleshoot and maintain it?
  5. User experience: is there a UI, and is it attractive and smooth to use?
  6. Localization: has the product been localized to suit the habits of Chinese users?

During our evaluation, we found that KubeSphere fully met our requirements. KubeKey, the KubeSphere team's open source tool, helps us build a KubeSphere cluster quickly and spares us a cumbersome deployment process. The OpenELB project gives us a load balancing solution for on-premises clusters.

For problems encountered during use, a solution can usually be found in the Chinese Q&A community. KubeSphere's console simplifies deploying services on Kubernetes, so team members with no K8s experience can get started quickly. Colleagues who have used it speak well of it.

Current Architecture

We currently use a microservice design. The main development languages are Golang and Java, and services communicate with each other over gRPC.

The production environment uses two Tencent Cloud CLBs to receive traffic from business clients and from IoT terminals respectively. All business services are deployed in a Tencent Cloud TKE cluster, and KubeSphere is used to manage day-to-day application releases. For cluster infrastructure we follow the principle "buy what you can buy, and build only what you cannot" (not because we have money to spare, but because operations and maintenance put a lot of pressure on a small company). We do not use the TKE console to manage application releases mainly because its user experience is not very friendly. Another important reason is that its application marketplace has poor support for third-party Helm repositories, so it cannot make full use of the Helm ecosystem.

Practice

Hardware Resources

Test environment: 10 ESXi virtual machines running a self-built Kubernetes cluster.

Production environment: 7 Tencent Cloud CVM nodes, with Kubernetes provided by a Tencent Cloud managed TKE cluster.

Storage Solution

Test environment: 3 ESXi virtual machines serve as OSD nodes for Ceph distributed storage.

Production environment: for cost and stability reasons, Tencent Cloud CBS is used as the K8s storage solution.

Minimal Installation

Because the production and test environments already had some external services, such as Prometheus and logging, we deployed KubeSphere with a minimal installation to make the most of existing resources.

It is worth mentioning that monitoring is not a pluggable component: even with a minimal installation, KubeSphere still installs it by default. In the production environment, it conflicts with the prometheus-operator installed by TKE monitoring, so KubeSphere's Prometheus has to be disabled or manually uninstalled.

DevOps

In the early stage of development, version iteration was very painful. Developers compiled and packaged locally, then manually uploaded the result to the server for deployment. After suffering from environment differences and learning from manual operation mistakes, the team resolved to change the existing process and build a DevOps system suited to the team itself.

  1. Continuous Integration (CI): CI runs for each code commit to ensure code quality and consistency. It includes unit tests, static code analysis, and the compile and build steps. When CI fails, the developer fixes the code immediately and resubmits.
  2. Continuous Delivery (CD): once the code has passed CI, it is delivered to the testing team, who test it to ensure product quality. In the test environment, custom build nodes on Coding run the automated CI build, and after CI passes a script automatically updates the image version in KubeSphere. In the production environment, because of the release review process, configuration changes, and coordination among business teams, operations staff for now change the application version manually to release.
  3. Monitoring and Alerting: once the code is deployed to production, it is monitored. Monitoring helps the team identify and resolve issues quickly, ensuring product availability and performance.
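
The article does not show the script that updates the image version in the test environment. As a minimal sketch of what such a step could look like, the helper below only builds a `kubectl set image` command line and leaves execution to the CI job. The namespace, deployment, container, and registry names are hypothetical placeholders, and the team's real script may well use Helm or the KubeSphere API instead.

```python
import shlex


def build_set_image_command(namespace, deployment, container, image):
    """Return the kubectl argv that points one container of a Deployment at a new image."""
    fields = {
        "namespace": namespace,
        "deployment": deployment,
        "container": container,
        "image": image,
    }
    for name, value in fields.items():
        # Reject empty values and embedded whitespace before they reach the shell.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid {name}: {value!r}")
    return [
        "kubectl", "set", "image",
        f"deployment/{deployment}",
        f"{container}={image}",
        "--namespace", namespace,
    ]


# Hypothetical names, for illustration only.
command = build_set_image_command(
    namespace="iot-test",
    deployment="device-gateway",
    container="device-gateway",
    image="registry.example.com/iot/device-gateway:1.4.2-test-318",
)
print(shlex.join(command))
```

Returning the argument list instead of running it keeps the sketch safe to execute anywhere; a CI job could pass the same list to `subprocess.run(command, check=True)` on a node that has cluster credentials.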

The current DevOps practice mainly solves the following pain points of the team:

  1. Unified build environment: every project must include a Dockerfile and compile inside the Docker container's build environment. Builds are triggered by code commit events through GitLab Runner rather than run locally on a developer's machine, which isolates the differences between development machines.
  2. Traceable releases: early project versions were managed arbitrarily and named according to the developer's mood, so a problem could not be traced quickly to a build. We therefore agreed that image versions produced by CI must follow a fixed naming format, such as ${VERSION}-${ENV}-${CI_NUMBER}. When a problem occurs in some environment, this format lets us quickly find the CI build that produced the running version.
  3. Smooth iteration: early projects were deployed on a single machine, and services were often briefly unavailable during an iteration, which caused traffic loss. After containerization, Kubernetes probes allow services to be updated smoothly, and an unhealthy service is restarted automatically without manual intervention, which greatly improves service availability.
  4. O&M efficiency: we make full use of the Kubernetes O&M system and cloud native observability practices to reduce the pressure of operating many services across many environments. When a service fails, we find out in time.
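
The naming format from item 2 can be sketched as a pair of helpers that build and parse an image tag. This is an illustration only: the article gives just the ${VERSION}-${ENV}-${CI_NUMBER} pattern, so the assumptions that the version is a three-part number, the environment is a lowercase word, and the CI number is an integer are ours.

```python
import re
from typing import NamedTuple


class ImageTag(NamedTuple):
    version: str
    env: str
    ci_number: int


# Assumed shape: <major>.<minor>.<patch>-<env>-<ci number>, e.g. 1.4.2-test-318.
_TAG_RE = re.compile(r"^(?P<version>\d+\.\d+\.\d+)-(?P<env>[a-z]+)-(?P<ci>\d+)$")


def format_tag(version, env, ci_number):
    """Build a tag and refuse values that would break the agreed format."""
    tag = f"{version}-{env}-{ci_number}"
    if _TAG_RE.match(tag) is None:
        raise ValueError(f"tag does not follow the naming format: {tag!r}")
    return tag


def parse_tag(tag):
    """Split a tag back into its parts so a running image can be traced to a CI build."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"tag does not follow the naming format: {tag!r}")
    return ImageTag(match["version"], match["env"], int(match["ci"]))


tag = format_tag("1.4.2", "test", 318)
print(tag)
print(parse_tag(tag))
```

Validating the tag when it is created, not only when it is read, is what keeps arbitrary names out of the registry in the first place.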

Results

Pipeline Configuration

The pipeline is built on Coding, for the following reasons:

  1. It integrates deeply with WeCom (Enterprise WeChat). Any problem in the CI process can be reported to developers promptly through the IM tool;
  2. Its supporting tools are complete, whereas official Jenkins lags somewhat behind cloud native development: a series of plug-ins must be installed to meet our needs, and the configuration process is cumbersome.

Application Deployment

All services of the "Wending Chuang Intelligent IoT" project are released as Helm applications. During use, we ran into one rather unfriendly behavior in KubeSphere: if an application upgrade failed because of a misconfigured YAML file, the application could not be upgraded again. In a production environment, an application that cannot be upgraded is a serious problem. After discovering the bug, we submitted a fix to the community, which was merged as "fix: can not re-upgrade helm application in a failed state".
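
The behavior that the fix restores can be illustrated with a small state check. This is a hedged sketch, not KubeSphere's code: the state names and the rule that only in-progress operations block an upgrade are assumptions made for illustration.

```python
from enum import Enum


class ReleaseState(Enum):
    # Hypothetical state names; they do not mirror KubeSphere's internal values.
    ACTIVE = "active"
    FAILED = "failed"
    UPGRADING = "upgrading"
    DELETING = "deleting"


# Only an operation that is still running should block a new upgrade.
_IN_PROGRESS = frozenset({ReleaseState.UPGRADING, ReleaseState.DELETING})


def can_upgrade(state):
    """Return True when a new upgrade may start from the given release state."""
    return state not in _IN_PROGRESS


for state in ReleaseState:
    print(state.value, can_upgrade(state))
```

The point of the sketch is that a failed release is a resting state, so a corrected configuration can be applied on top of it, while a release that is still upgrading or being deleted must finish first.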

Cluster Resource Monitoring

KubeSphere's built-in monitoring system is sufficient for operations staff to carry out daily cluster health checks. KubeSphere also provides multi-level monitoring, down to the namespace and the individual service. The team uses service-level monitoring most often, so that developers can see the resource usage of the services they have published.

Future Plans

  1. As a new project explored by the company, "Wending Chuang Intelligent IoT" has been fully containerized and runs on the KubeSphere cluster. We plan to containerize the legacy B2B projects and migrate them to the KubeSphere cluster as well, to improve their maintainability and availability.
  2. Explore Service Mesh solutions to further improve smooth releases and service observability.

Origin blog.csdn.net/zpf17671624050/article/details/130508551