Service mesh implementation cycle shortened by 50%: Lixun Logistics' cloud-native application management practice based on Alibaba Cloud ACK and ASM

Author: Wang Xining, Liu Qiang, Hua Xiang

Company Profile

Lixun Logistics is a service provider under Belle that focuses on the fashion industry and provides enterprises with professional logistics and supply chain solutions. Its services mainly include urban last-mile delivery, integrated warehousing and distribution, trunk transportation, and customized solutions. Its self-developed intelligent logistics management platform provides comprehensive support for the intensive development of its enterprise partners. At present, Lixun Logistics operates 70+ omni-channel physical cloud warehouses and 6 central e-commerce warehouses across the country, with a total area of 1 million+ square meters. Its services cover 300+ cities and 3000+ business districts, and it provides omni-channel delivery services to many well-known fashion brands and their brand stores.

To reduce operation and maintenance costs across the business and improve logistics service efficiency, Lixun Logistics began migrating in August 2021 from self-built IDC infrastructure to a fully cloud-native architecture on Alibaba Cloud. Alibaba Cloud Container Registry Enterprise Edition (ACR EE) and Alibaba Cloud Container Service for Kubernetes (ACK) serve as the container artifact management and scheduling platform, and Alibaba Cloud Service Mesh (ASM) serves as the distributed governance platform for cloud-native application services. The service governance and traffic control capabilities of the service mesh enable efficient deployment and scaling of applications.

In this article, Lixun Logistics architect Liu Qiang shares his practical experience of accelerating the cloud-native transformation of enterprise business with Alibaba Cloud Service Mesh ASM.

Business pain points

Against the background of architectural transformation and rapid business growth, Lixun Logistics needs to interact with multiple business units and partners, such as supply chain support platforms and R&D platforms, so its business systems are diverse and open. With the market environment and consumer demands changing rapidly, we want to focus on developing our core business. The following business problems and pain points needed to be solved:

  • It is difficult to iterate application versions.
    In the face of rapidly changing customer and business requirements, applications depend on more and more functions. The more complex the business, the more tightly coupled the code, and the cycle for launching new features keeps lengthening, making it increasingly difficult to iterate application versions.

  • Heterogeneous systems cannot be managed uniformly.
    The multi-language, multi-protocol, and multi-framework nature of enterprise IT systems makes unified service integration and service governance difficult. At the same time, because the infrastructure on which IT systems are deployed is complex, the technical challenges of supporting cross-platform deployment and multiple Kubernetes clusters urgently need to be solved.

  • It is difficult to build a unified R&D platform for cloud-native application services.
    Open source microservice frameworks, represented by Spring Cloud, have become the mainstream microservice scaffolding in the industry. These frameworks already provide basic microservice capabilities such as service registration, discovery, and health checks. However, for the advanced service governance needs of enterprise-level applications, such as service access security control, service flow control, routing control, and grayscale release, application teams still have to integrate a large number of third-party open source frameworks themselves. This raises the technical threshold for designing and developing cloud-native business applications, and makes it difficult for enterprises to build a unified R&D platform for cloud-native application services.

  • The operation and maintenance system is complex.
    The existing operation and maintenance system is fairly complex. Compared with a service mesh, which provides a unified set of functions around traffic management, security, and observability, the current system faces challenges in managing application services at scale.

Solution

As the industry's first fully managed Istio-compatible service mesh product, ASM has from the beginning kept its architecture consistent with the development of the community and the industry. The control plane components are hosted on the Alibaba Cloud side and are independent of the user clusters on the data plane side, which keeps the deployment highly available and stable. ASM is customized and built on open source Istio from the community, and provides components on the managed control plane side that support fine-grained traffic management and security management. With the managed model, the life cycle management of Istio components is decoupled from that of the managed K8s clusters, which makes the architecture more flexible and the system more scalable.

Figure: Alibaba Cloud Service Mesh ASM architecture diagram

As an infrastructure for unified management of multiple types of computing services, the managed service mesh ASM provides unified traffic management, unified service security, unified service observability, and unified proxy extensibility based on WebAssembly, which together form its enterprise-level capabilities.

With the exception of the big data analysis system, Lixun Logistics' current systems have been fully integrated into the service mesh, using the following capabilities:

Figure: Lixun Technology business application deployment architecture diagram

  • Authentication and authorization system

When a client initiates a service request, the backend needs to verify that the request is legitimate, for example by checking whether the user has permission to access the requested resource. After the check passes, some information that was not in the original request needs to be added to the result. For example, once the user is authenticated, the business version number, user ID, and similar fields are added to the header.

For these scenarios, ASM provides custom authorization services. An authorization step is added on the ASM gateway to ensure that key services can be accessed only when authorized.

For details, please refer to: https://help.aliyun.com/document_detail/446628.html?spm=a2c4g.476420.0.0.25005e37CV8ta8
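
The decision logic of such an authorization step can be sketched roughly as follows. This is only an illustration in plain Python, not the ASM custom authorization API: the permission table, the token values, and the header names `x-user-id` and `x-business-version` are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical permission table: token -> (user id, allowed path prefixes).
# A real authorization service would query an identity or permission system.
PERMISSIONS: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    "token-alice": ("u-1001", ("/orders", "/shipments")),
    "token-bob": ("u-1002", ("/shipments",)),
}

# Hypothetical business version attached after a successful check.
BUSINESS_VERSION = "v2"


def authorize(path: str, headers: Dict[str, str]) -> Tuple[bool, Dict[str, str]]:
    """Return (allowed, extra_headers) for one incoming request."""
    token = headers.get("authorization", "")
    entry = PERMISSIONS.get(token)
    if entry is None:
        # Unknown caller: reject and add nothing.
        return False, {}
    user_id, prefixes = entry
    # Allow only exact matches or sub-paths of a permitted prefix.
    if not any(path == p or path.startswith(p + "/") for p in prefixes):
        return False, {}
    # Authorized: attach information that was not in the original request.
    return True, {"x-user-id": user_id, "x-business-version": BUSINESS_VERSION}
```

In this sketch the gateway would forward the request only when the first element is `True`, and would copy the returned headers onto the request before it reaches the backend.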

In addition, ASM provides a simple, easy-to-use identity definition for each workload in the service mesh, and offers customization mechanisms for extending the identity system to specific scenarios. It is also compatible with the community SPIFFE standard, and provides a policy-based trust engine as the core for building zero trust.

  • Integration with and migration from the microservice framework to the service mesh

Under the original 2.0 system, microservices obtained the IP and port of the target instance through the Eureka service registry when calling each other. That is, service instances registered themselves with Eureka, and the Eureka client performed load balancing, so that a service could dynamically select an available instance to connect to as needed.

After switching to Kubernetes and the service mesh, the Spring Cloud modules in the microservice applications, including service registration and discovery, are replaced by the Kubernetes equivalents, which are implemented with Kubernetes Service and CoreDNS. In other words, because Kubernetes already maintains the mapping between Services and Endpoints as Pods are scheduled, there is no need to develop a separate naming service for service registration. Converging on Kubernetes service registration and discovery is the best practice.
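
From the application's point of view, discovery through a Kubernetes Service and CoreDNS comes down to resolving a well-known DNS name instead of querying a registry client. The sketch below is a minimal illustration; the service name `order-service` and namespace `logistics` used in the usage note are made up, and `cluster.local` is the default cluster domain, which a given cluster may override.

```python
import socket
from typing import List


def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the cluster-internal DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(service: str, namespace: str = "default", port: int = 80) -> List[str]:
    """Resolve a Service name to IP addresses.

    Only meaningful inside a cluster, where the cluster DNS answers the query.
    """
    infos = socket.getaddrinfo(service_dns_name(service, namespace), port,
                               proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

For example, `service_dns_name("order-service", "logistics")` yields `order-service.logistics.svc.cluster.local`, which a caller can use directly as the host in an HTTP or gRPC client.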

After this simple transformation, services developed in any language and framework can be managed uniformly through ASM. As long as their business protocols are compatible, they can call each other, and the access protocols can be managed by the mesh.

Unified service governance rules can be configured on the control plane. On the data plane, the Sidecar proxy uniformly handles service discovery, load balancing, and other traffic, security, and observability capabilities. During the migration, the registry of the original microservice framework can also be retained in stages, so that in the intermediate state ASM is used together with the other service discovery mechanism and services in the mesh can still access services registered in the microservice registry.

For details, please see: https://help.aliyun.com/document_detail/2527072.html

  • Full-link grayscale

While the production environment is running normally, grayscale upgrades are started for some application services. For example, applications B and D in the figure have grayscale versions. Without modifying application logic, service mesh technology can dynamically route requests to different service versions based on the request source or request header information. For example, when the request header contains tag1, application A calls the grayscale version of B; C has no grayscale version, so the system automatically falls back to its original version.

Figure: Full-link grayscale diagram

When you need to implement full-link grayscale release across multiple services, you can configure TrafficLabel to identify traffic characteristics and divide ingress gateway traffic into normal traffic and grayscale traffic. The grayscale traffic label is passed along between the services on the request call chain, thereby achieving full-link grayscale release.

For details, please see: https://help.aliyun.com/document_detail/377563.html?spm=a2c4g.2362128.0.0.50945dfcNA9kUg
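
The routing rule described above (prefer the version that matches the traffic label, otherwise fall back to the original version) can be sketched in a few lines. This is an illustration of the idea only, not how ASM implements it: the header name `x-traffic-tag`, the version name `base`, and the deployment table are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

BASE_VERSION = "base"  # hypothetical name for the original (non-grayscale) version

# Hypothetical deployment state: B and D have a grayscale version labeled "tag1".
DEPLOYED: Dict[str, Set[str]] = {
    "A": {"base"},
    "B": {"base", "tag1"},
    "C": {"base"},
    "D": {"base", "tag1"},
}


def pick_version(service: str, traffic_tag: Optional[str]) -> str:
    """Choose the tagged version if it exists, otherwise fall back to base."""
    if traffic_tag and traffic_tag in DEPLOYED[service]:
        return traffic_tag
    return BASE_VERSION


def route_chain(chain: List[str], headers: Dict[str, str]) -> List[Tuple[str, str]]:
    """Resolve the version used at every hop, passing the same tag along the chain."""
    tag = headers.get("x-traffic-tag")  # assumed header carrying the traffic label
    return [(service, pick_version(service, tag)) for service in chain]
```

With the table above, a request carrying `x-traffic-tag: tag1` through A, B, C, D uses the grayscale versions of B and D and the original versions of A and C, while a request without the header stays on the original versions throughout.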

  • Unified observability system and linkage analysis

Alibaba Cloud Service Mesh ASM provides a mesh observability center for unified observability and linkage analysis, which covers three dimensions.

The first is log analysis. By collecting and analyzing the access logs on the data plane, especially the ingress gateway logs, you can analyze service request traffic, the ratio of status codes, and so on, and use the results to further optimize the calls between services.
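
As a small illustration of this kind of analysis, the status code ratio can be computed from parsed access log entries as follows. The sketch assumes each entry has already been parsed into a dictionary with an integer `response_code` field; the actual field name depends on the access log format configured for the mesh.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def status_class_ratio(entries: Iterable[Mapping[str, int]]) -> Dict[str, float]:
    """Return the share of 2xx/3xx/4xx/5xx responses among the given entries."""
    # Group codes by their hundreds digit, for example 404 -> "4xx".
    counts = Counter(f"{entry['response_code'] // 100}xx" for entry in entries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status_class: n / total for status_class, n in sorted(counts.items())}
```

A rising share of `5xx` at the ingress gateway, for instance, is a signal to look at the upstream services behind the affected routes.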

The second is distributed tracing. It gives developers of distributed applications tools such as full call chain reconstruction, request volume statistics, call chain topology, and application dependency analysis, which help them quickly analyze and diagnose performance bottlenecks in a distributed application architecture and improve development and diagnostic efficiency in the microservices era.

The third is monitoring. A set of service metrics is generated based on the four golden signals of monitoring (latency, traffic, errors, and saturation) to understand and monitor the behavior of services in the mesh.
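
To make the signals concrete, the sketch below derives three of them (traffic, errors, and latency) from a window of request samples. The `RequestSample` type and the choice of a nearest-rank 95th percentile are assumptions for illustration; saturation is left out because it comes from resource metrics such as CPU or connection usage rather than from individual requests.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestSample:
    latency_ms: float  # time taken to serve the request
    status: int        # HTTP status code of the response


def golden_signals(samples: List[RequestSample], window_seconds: float) -> Dict[str, float]:
    """Compute traffic, error rate, and p95 latency for one time window."""
    if not samples or window_seconds <= 0:
        raise ValueError("need at least one sample and a positive window")
    latencies = sorted(s.latency_ms for s in samples)
    # Nearest-rank percentile: the smallest value covering 95% of the samples.
    rank = max(1, math.ceil(0.95 * len(latencies)))
    server_errors = sum(1 for s in samples if s.status >= 500)
    return {
        "traffic_rps": len(samples) / window_seconds,
        "error_rate": server_errors / len(samples),
        "latency_p95_ms": latencies[rank - 1],
    }
```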

In addition, a mesh topology view gives instant insight into the behavior of the service mesh. Beyond powerful visualization of the mesh traffic topology, it also provides a replay function for viewing traffic from a selected past time period.

  • EnvoyFilter extension capabilities

ASM provides an out-of-the-box EnvoyFilter plug-in marketplace that manages the full life cycle of extension plug-ins.

Based on the built-in templates, users only need to fill in the required parameters to deploy the corresponding EnvoyFilter plug-in. With this mechanism, the data plane becomes an extensible collection of plug-ins.

Product advantages

As a core foundational technology for managing communication between application services, a service mesh brings secure, reliable, fast, and application-agnostic traffic routing, security, and observability to calls between application services.

As shown above, Alibaba Cloud Service Mesh ASM brings important advantages to cloud-native application management, which can be summarized in the following six aspects.

Advantage 1: Unified management of heterogeneous services

  • Multi-language and multi-framework interoperability and governance, dual-mode architecture integrated with traditional microservice system
  • Refined multi-protocol traffic control, unified management of east-west and north-south traffic
  • Automated service discovery for unified heterogeneous computing infrastructure

Advantage 2: End-to-end observability

  • Intelligent operation and maintenance that integrates logging, monitoring, and tracing
  • Intuitive and easy-to-use visual mesh topology, with color-coded health indicators
  • Built-in best practices, self-service grid diagnostics

Advantage 3: Zero Trust Security

  • Globally unique workload identity (Identity), end-to-end mTLS encryption, attribute-based access control (ABAC)
  • One-stop configuration of JWT authentication, integration with custom external authorization systems, and integration with external OIDC identity authentication and authorization systems
  • OPA declarative policy engine, complete audit history and insights based on dashboard

Advantage 4: Performance optimization combining software and hardware

  • The first service mesh platform to use Intel Multi-Buffer technology to accelerate TLS encryption and decryption
  • NFD automatically detects hardware features and adaptively enables features such as the AVX instruction set and QAT acceleration
  • Among the first to pass the Trusted Cloud service mesh platform and performance evaluation at the advanced certification level

Advantage 5: SLO-driven application elasticity

  • Service Level Objective (SLO) Policy
  • Automatic elastic scaling of application services based on observability data
  • Automatic switching and fault recovery under multi-cluster traffic bursts

Advantage 6: Out-of-the-box extensibility and ecosystem compatibility

  • Out-of-the-box EnvoyFilter plug-in marketplace, full life cycle management of WebAssembly plug-ins
  • Unified integration with Proxyless mode, supporting SDK and kernel eBPF methods
  • Compatible with Istio ecosystem, supports Serverless/Knative, AI Serving/KServe

Construction results

After adopting ASM, Lixun Logistics effectively solved the complex operation and maintenance problems of application call chains across multi-language technology stacks, as well as the ease-of-use problems of working with other cloud products, improving operation and maintenance efficiency by 40%. At the same time, with the rich enterprise-level capabilities and complete observability provided by ASM, the implementation cycle for building the service mesh was shortened by 50%.
