Charging and swapping company Kamax uses best practices to improve online application stability at low cost

Author: Kamax New Energy

Kamax New Energy Technology Co., Ltd. was established on May 16, 2019 and is headquartered in Changzhou, Jiangsu. Its current joint venture shareholders are Volkswagen (China) Investment Co., Ltd., China FAW Co., Ltd., FAW-Volkswagen Co., Ltd. (the capital increase and share subscription are subject to obtaining the appropriate regulatory approvals, including antitrust approval), Wanbang Digital Energy Co., Ltd., and Anhui Jianghuai Automobile Group Holdings Co., Ltd. Kamax combines the strengths of car makers and charging companies. Its business spans the R&D and manufacturing of charging infrastructure through to intelligent software connectivity, serves private charging users as well as semi-public, public, and business users, and runs from the power supply at the source of the industry chain, through the service platform, to the end-user experience, seamlessly connecting the front and back ends of each business format.

Kamax was created for China's new generation of consumers. It focuses on the charging experience of private electric car owners and offers high-end services that make charging convenient, worry-free, smart, and efficient. At the same time, Kamax is committed to providing full-scenario charging services for electric travel. Relying on strong R&D, advanced core technology, and high-quality service, Kamax has won a number of awards in China's new energy vehicle charging field. In 2021, Kamax won the "Best Operation Service Innovation Award in China's Charging Pile Industry". In March 2023, Kamax won the "High-Quality Charging Five-Star Station Award", becoming one of the first outstanding charging operators to receive a five-star rating (five stars being the highest level and the highest station standard). In June of the same year, Kamax won the 2023 Top Ten Influential Operator Brand Award in China's charging and swapping industry. Kamax will continue to speed up the construction of its charging network and to optimize and innovate on the charging user journey, and will focus on the research and development of high-power charging equipment and the exploration of new energy services, thereby promoting green development through the deep integration of new energy and new energy vehicles.

Business stability is a big challenge

In 2023, Kamax remains committed to user-centered integrated innovation and to building smart electric travel. As of the end of May this year, Kamax's charging network covered 180 cities in China, with 1,198 charging stations and 10,490 charging terminals built and more than 1.96 million cumulative users. Having moved from lagging behind in construction to being "moderately advanced", the charging pile industry is expected to grow strongly over the next three years, with a market size in the hundreds of billions. Many cities across the country are upgrading their charging piles and increasing both installation and utilization. As new energy vehicles develop, the demands of charging users are growing rapidly, and this rapid business growth poses unprecedented challenges to the stability and availability of Kamax's architecture.

Kamax develops its applications in the traditional Spring Boot style, and applications call each other through HTTP requests. The simplicity of the Spring Boot architecture is what allowed Kamax's business and its number of microservices to expand rapidly. However, as the number of microservices grew, stability and efficiency problems gradually appeared in both the release and the operation stages. As the number of users increases, requirements increase too, and the business scenarios become more and more complex. At this point, internal testing alone can hardly cover every scenario, so each application release requires thorough testing and thorough grayscale verification. To meet the business requirement of rapid iteration, the key to improving efficiency is being able to run multiple iterations in parallel in the development environment at low cost while keeping each release stable.

At large scale, even a small problem can ripple through the whole system. On the one hand, the traffic we face is random and unpredictable. When a surge of traffic exceeds a service's capacity, it can slow the service down, drive up load, and eventually crash it. On the other hand, a distributed microservice architecture is a complex mesh with intricate call links. When any service (including an external dependency) becomes unstable, for example through slow calls or exceptions, it can bring down its upstream callers and cause a cascading failure. Therefore, microservice governance needs some means of guarding against these instabilities.

As the microservice architecture kept evolving and growing, Kamax's architecture team also realized that it needed to introduce microservice governance capabilities to manage its existing microservices properly and further improve their stability and efficiency. At the same time, the business still demands rapid development. Upgrading the original Spring Boot framework to Spring Cloud and introducing the various advanced service governance capabilities that way would cost too much for Kamax's R&D engineers, who are already busy keeping up with a fast-growing business.

Upgrading the microservice architecture transparently

Is there a way to gain microservice governance capabilities without changing code? For example: full-link grayscale release to avoid the stability risks introduced by changes; rate limiting and degradation to keep the running system stable under unpredictable traffic; and authentication to address the security risks of calls between microservices. It is like replacing the engine of an aircraft in flight to improve its performance and, more importantly, doing so without the passengers noticing.

Abstracting the problem further: how can we give any Java application service governance capabilities without changing its code? In doing so, we also have to account for practical factors such as stability, the efficiency of problem diagnosis, the sustainability of the architecture, and performance.

Technical exploration always serves the business. We discussed Kamax's solution further: can we give users non-intrusive service governance by unifying north-south and east-west traffic management in a single solution?

  1. The MSE cloud-native gateway is a next-generation gateway product compatible with the K8s Ingress standard. It combines the traffic gateway, the microservice gateway, and the WAF security gateway into one, and is highly integrated, easy to use, easy to extend, and supports hot updates. It integrates with multiple service sources such as K8s and Nacos, and improves the stability of the entire link through graceful service startup and shutdown, full-link grayscale, overload protection, fault self-healing, and rate limiting and degradation.

  2. The MSE cloud-native gateway is fully managed. After choosing it, users only need to care about how they use the gateway, not about the operation and maintenance, stability, monitoring, or alerting of the gateway itself. It works out of the box and has a low barrier to entry.

Since the cloud-native gateway can unify traffic management and flow control through routing rules, can Higress also be used to manage the call traffic between services?

Traffic forwarding and management between services

With the approach settled and its stability, security, and cost evaluated, we could quickly start putting it into practice. The first problem is the existing way of calling a K8s Service by domain name: how do we send that traffic to Higress and then have Higress forward it to the real target Pod? Throughout this process we also need to consider the stability of the solution.

  • The first approach that comes to mind is to modify the Service and Endpoints configuration in K8s and rely on CoreDNS to direct the traffic to Higress.
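# Headless Service without a selector, so its Endpoints can be set by hand
apiVersion: v1
kind: Service
metadata:
  name: provider
spec:
  type: ClusterIP
  clusterIP: None
---
# Endpoints with the same name; resolves "provider" to the Higress SLB address
apiVersion: v1
kind: Endpoints
metadata:
  name: provider
subsets:
  - addresses:
      - ip: ${higress-slb}
    ports:
      - port: 80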
  • For production-grade stability, CoreDNS can be replaced by a commercial product of the same type, PrivateZone DNS. CNAME records can then be configured to switch the *.camsnet.com domain names used between services over to the cloud-native gateway in one batch.

At this point, traffic from the caller (the Order service) is forwarded to the internal Higress gateway. Next, we need to configure Higress routing rules to forward the traffic to the real target service.

  • In the MSE cloud-native gateway (the commercial version of Higress), we synchronize the services from the container service to the gateway and configure the corresponding routing rules to forward the traffic.

Once traffic flows through the MSE cloud-native gateway, we can layer more governance capabilities on top of it.

  • We can configure label routing directly to get grayscale release, and combine it with link tracing to get full-link grayscale.
  • We can configure JWT authentication rules on the routes to secure calls between services.
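As a rough illustration of what a JWT authentication rule checks on each request, the sketch below signs and verifies an HS256 token using only the Python standard library. The article does not show the gateway's actual JWT configuration, so the algorithm (HS256), the shared secret, the claim names, and the helper names `sign_hs256` and `verify_hs256` are all assumptions made for illustration; a real gateway is typically configured with an issuer and a key set rather than hand-written code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw):
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment):
    # Restore the padding that JWT strips before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims, secret):
    """Build a compact JWT (hypothetical helper, for the demo only)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token, secret, now=None):
    """Return the claims if the token is valid and unexpired, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError):
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # never trust an unexpected algorithm
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # signature mismatch
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and current >= exp:
        return None  # expired
    return claims
```

A caller would attach the token to each request, and the gateway would reject any request for which the check returns None before it reaches the target service.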

Observability and full-link tracing

By connecting to Application Monitoring in the Application Real-Time Monitoring Service (ARMS), Kamax gets monitoring and diagnosis for its applications without modifying a line of code. It can quickly see the three most critical indicators of an application: response time, throughput, and error rate. When an indicator looks abnormal, the call chain capability makes it possible to trace the request quickly across the entire set of microservices.
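To make the three indicators concrete, here is a minimal sketch of how they can be derived from a window of request records. It is not how ARMS computes them; the `RequestRecord` type, the `summarize` function, and the field names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    duration_ms: float  # how long the request took
    ok: bool            # False if the request ended in an error


def summarize(records, window_seconds):
    """Compute average response time, throughput, and error rate for one window."""
    if not records or window_seconds <= 0:
        raise ValueError("need at least one record and a positive window")
    total = len(records)
    errors = sum(1 for record in records if not record.ok)
    return {
        "avg_response_ms": sum(r.duration_ms for r in records) / total,
        "throughput_rps": total / window_seconds,
        "error_rate": errors / total,
    }
```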

Link tracing also provides the technical foundation on which applications can build full-link grayscale.

Full-link pass-through of traffic labels

We use the Tracing Baggage mechanism to carry the traffic-coloring label across the entire link, because most tracing frameworks, such as OpenTelemetry, SkyWalking, and Jaeger, support the baggage concept. ARMS tracing conforms to this standard as well. We implemented a Higress WASM plug-in that, in the Higress outbound filter, reads the value of the configured pass-through key (for example x-mse-tag) from the Baggage at the location defined by the tracing protocol and inserts it into the HTTP headers, where Higress can use it for routing. This gives us full-link pass-through of custom labels.
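The logic of that plug-in can be sketched in a few lines. The real plug-in is a WASM filter running inside Higress and its source is not shown in this article, so the following Python version is only an illustration. It assumes the baggage travels in a W3C-style `baggage` header (comma-separated `key=value` members with optional `;property` suffixes); the function names `parse_baggage` and `propagate_tag` are hypothetical.

```python
from urllib.parse import unquote


def parse_baggage(header_value):
    """Parse a W3C-style baggage header into a dict of key -> value."""
    entries = {}
    for member in header_value.split(","):
        key_value = member.strip().split(";", 1)[0]  # drop any ;properties
        if "=" not in key_value:
            continue  # skip empty or malformed members
        key, _, value = key_value.partition("=")
        entries[key.strip()] = unquote(value.strip())
    return entries


def propagate_tag(headers, tag_key="x-mse-tag"):
    """Copy the tag from baggage into a plain header so the gateway can route on it."""
    outgoing = dict(headers)
    tag = parse_baggage(headers.get("baggage", "")).get(tag_key)
    if tag is not None and tag_key not in outgoing:
        outgoing[tag_key] = tag  # an explicit header, if present, wins
    return outgoing
```

For example, a request carrying `baggage: userId=alice, x-mse-tag=gray` leaves the filter with an additional `x-mse-tag: gray` header, while a request without the tag is passed through unchanged.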

With full-link pass-through of custom labels in place, we can build a complete full-link grayscale capability. What is full-link grayscale?

In a microservice architecture, some requirements involve changing several microservices on the same call link at once. Usually each microservice has a grayscale environment or group that receives grayscale traffic. We want traffic that enters an upstream grayscale environment to enter the downstream grayscale environments as well, so that a request always stays within the grayscale environment. Even if some microservices on the call link have no grayscale environment, requests leaving those applications should still return to the grayscale environment further downstream. When a release involves multiple microservices on a link, we can then perform a smooth full-link grayscale release without worrying about grayscale traffic straying.

After implementing full-link pass-through of the x-mse-tag label, we can configure label routing rules based on x-mse-tag on the Higress routes, so that traffic carrying a given label stays within the nodes running the corresponding application version. This gives us "traffic swim lanes", that is, full-link grayscale.
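The routing decision behind a traffic swim lane can be sketched as follows. This is an illustration of the idea rather than Higress's implementation: the instance list, the `"base"` version name, and the function `pick_instances` are assumptions. The fallback to the base version reflects the case described above, where a service on the link has no grayscale environment.

```python
def pick_instances(instances, tag, base_version="base"):
    """Choose target addresses for a request.

    instances: list of (address, version) tuples for one service.
    tag: value of the x-mse-tag header, or None if the request is untagged.
    """
    if tag:
        tagged = [address for address, version in instances if version == tag]
        if tagged:
            return tagged  # stay inside the swim lane
    # Untagged traffic, or no instance of this service in the lane:
    # fall back to the base version. The tag itself keeps travelling
    # with the request, so downstream services can re-enter the lane.
    return [address for address, version in instances if version == base_version]
```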

Traffic protection capability

How can we get traffic protection without modifying code? Let us first introduce traffic protection itself, using the two most common forms as examples: flow control and circuit breaking with degradation.

  • Flow control

Traffic is highly random and unpredictable. It may be calm one second and peak the next (as at 0:00 on Double Eleven). Every system and service has an upper limit on the load it can carry. If a burst of traffic exceeds that capacity, requests may go unprocessed, queued requests are handled slowly, CPU and load soar, and the system eventually breaks down. We therefore need to limit such bursts, processing as many requests as possible while making sure the service is not overwhelmed. This is flow control.

  • Circuit breaking and degradation

Modern microservice architectures are distributed and composed of many services that call each other, forming complex call links. The problems described above are amplified along these links. If one hop on a complex link is unstable, the instability can cascade and eventually make the entire link unavailable. We therefore need to circuit-break and degrade unstable, weakly depended-on services, temporarily cutting off the unstable calls so that a local instability does not turn into an overall avalanche.
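The circuit-breaking idea can be illustrated with a small sketch. It is a deliberately simplified breaker that counts consecutive failures, not the algorithm of any particular product; the class name, the thresholds, and the injectable `clock` parameter are all assumptions made for this example.

```python
import time


class CircuitBreaker:
    """Open after N consecutive failures; retry one trial call after a cool-down."""

    def __init__(self, failure_threshold=3, recovery_seconds=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock          # injectable so the behavior can be tested
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.recovery_seconds:
                return fallback()    # open: skip the unstable dependency
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = func()
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # (re)open the circuit
            return fallback()        # degrade instead of propagating the error
        self._failures = 0
        self._opened_at = None       # success closes the circuit
        return result
```

While the circuit is open, the caller gets the degraded fallback immediately and the unstable downstream service receives no traffic, which is what stops a local failure from cascading upstream.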

Kamax obtained these traffic protection capabilities seamlessly by connecting to the traffic protection feature of MSE service governance (Sentinel Enterprise Edition). Compared with the community version, Sentinel Enterprise Edition has certain advantages in usability and functionality.
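Returning to the flow control described earlier, a token bucket is one common way to cap request rate while still tolerating short bursts. The sketch below is a generic illustration and makes no claim about how Sentinel implements its rules; the class name, parameters, and injectable `clock` are assumptions.

```python
import time


class TokenBucket:
    """Admit at most `rate_per_second` requests on average, with bursts up to `capacity`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self._clock = clock              # injectable so the behavior can be tested
        self._tokens = float(capacity)   # start full so an initial burst is allowed
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True                  # request admitted
        return False                     # request should be rejected or queued
```

A request handler would call `allow()` before doing any work and return a fast "too many requests" response when it yields False, so excess traffic is shed before it can pile up.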

More exploration and practice

Without changing code, we were able to build a complete, systematic set of microservice governance capabilities quickly. So far, Kamax has implemented full-link grayscale, full-link tracing and observability, and traffic protection on top of Higress, allowing its current architecture to face the challenges of a rapidly growing business with more confidence.

For Higress, in turn, the implementation of the Kamax solution has brought fresh ideas to the development of the Higress ecosystem. We are also continuing to improve the ease of use and stability of Higress, hoping to bring greater value to more companies.


Origin my.oschina.net/u/3874284/blog/10117925