Service Mesh: Istio Architecture

What is a service mesh


The term service mesh describes the network of microservices that make up a distributed application and the interactions between them. As a service mesh grows in size and complexity, it becomes increasingly difficult to understand and manage.

Its requirements include service discovery, load balancing, fault recovery, metrics collection, and monitoring. A service mesh often has more complex operational requirements as well, such as A/B testing, canary releases, rate limiting, access control, and end-to-end authentication.

In the diagram, each box is a pod: the green part is the application code and the blue part is the sidecar.

In this way, a service call between applications becomes a network call between sidecars. The "mesh" refers to the way these sidecar-to-sidecar calls turn the entire platform into a mesh structure.

The service mesh provides capabilities such as service discovery, load balancing, and fault recovery.

Why use Istio?


  • Automatic load balancing of HTTP, gRPC, WebSocket and TCP traffic.
  • Fine-grained control over traffic behavior through rich routing rules, retries, failover, and fault injection. (Istio is the control plane; the process running in each sidecar is Envoy, a reverse proxy. Because Envoy is an ordinary application process, adding a capability means changing application code rather than the kernel, so it can offer many advanced traffic-handling and fault-handling features and perform very fine-grained traffic control.)
  • Pluggable policy layer and configuration API with support for access control, rate limiting, and quotas. (Envoy has a flexible plug-in mechanism, so capabilities such as access control and rate limiting can be added as extensions.)
  • Automatic metrics, logs, and traces for all traffic within the cluster, including cluster ingress and egress. (All of this can be reported by Envoy. The service mesh can also guarantee security, including authentication and authorization.)
  • Enable secure inter-service communication across clusters with strong identity-based authentication and authorization.

Overview of Istio features


When we talk about a service mesh, we are not only talking about traffic inside the mesh. Advanced traffic management has to cover not just traffic within the cluster but also how traffic from external clients gets in.

That involves ingress traffic. Once external traffic is inside, it hops between the microservices in the cluster, which is service-to-service (mesh) traffic. Internal services may also call external services, and those calls can go through a unified outbound proxy called the egress gateway.

So although it is called a service mesh, it provides unified control over inbound traffic, mesh traffic, and outbound traffic.

Previously, Kubernetes provided the Ingress object to help manage inbound traffic, but Ingress has many limitations that Istio can overcome. Even if all you need is inbound traffic management, Istio alone can fully meet that need.

Istio integrates unified management and control of inbound traffic, mesh traffic, and outbound traffic.

Traffic management


Istio can do fine-grained traffic management with simple configuration. On top of that, you can run A/B tests, canary deployments, and similar rollouts to support business scenarios.

Connect

· Through simple rule configuration and traffic routing, traffic and API calls between services can be controlled. Istio simplifies configuration of service-level attributes like circuit breakers, timeouts, and retries, and makes it easy to set up important tasks like A/B testing, canary rollouts, and staged rollouts with percentage-based traffic splitting.
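The percentage-based traffic splitting mentioned above can be illustrated with a minimal Python sketch. It is not how Istio or Envoy implement routing; the function name `pick_subset`, the subset names `v1` and `v2`, and the 90/10 split are all hypothetical and stand in for a canary rollout.

```python
import random
from collections import Counter


def pick_subset(weights, rng):
    """Pick a destination subset with probability proportional to its weight.

    `weights` maps a subset name to an integer percentage. The percentages
    must sum to 100, mirroring a percentage-based route rule.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    point = rng.randrange(100)  # integer in 0..99
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return name
    raise AssertionError("unreachable when weights sum to 100")


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical canary: 90% of requests stay on v1, 10% go to v2.
    counts = Counter(pick_subset({"v1": 90, "v2": 10}, rng) for _ in range(10_000))
    print(counts)
```

Because each request is routed independently, the observed split only approaches the configured percentages over many requests.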

Control

· With better visibility into traffic flow and out-of-the-box failure recovery, problems can be caught before they cause damage, making calls more reliable and your network more robust, whatever conditions you face.

You can see where traffic is actually going, and the out-of-the-box fault recovery features help you keep it under control.
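One common fault recovery pattern in this area is the circuit breaker. The following is a minimal sketch of the general pattern, assuming a simple "consecutive failures" rule; Envoy's own mechanism is configured differently (through connection pool limits and outlier detection), and the class name and parameters here are hypothetical.

```python
import time


class CircuitBreaker:
    """Open the circuit after `max_failures` consecutive failures.

    While open, calls fail fast. After `reset_after` seconds one trial call
    is let through (the "half-open" state).
    """

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            # Reset window elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The clock is injectable so the behavior can be exercised without waiting for real time to pass. In a mesh this logic lives in the sidecar, which is why the application code does not need to change.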

Security


A modern architecture should not only be hardened at the cluster boundary; you have to assume that the entire cluster is insecure. Therefore, every application that provides a service needs security hardening, authentication, authorization, and encrypted transport. Istio has this ability built in. It can upgrade connections to TLS and perform identity verification and authorization on top of the TLS protocol.

· Allows developers to focus on application-level security. Istio provides an underlying secure communication channel and manages authentication, authorization, and encryption of service communications at scale. With Istio, service communication is secured by default, allowing policies to be enforced consistently across multiple protocols and runtimes—all with little or no application changes.

· While Istio is platform-agnostic, its benefits are even greater when combined with Kubernetes (or infrastructure) network policies, including the ability to secure inter-Pod or inter-service communication at the network and application layers.
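As a rough illustration of identity-based authorization, the sketch below checks a peer identity against an allow list. It assumes the identity has already been verified by a mutual TLS handshake and is expressed as a SPIFFE-style ID of the form `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`. The function names and the example identities are hypothetical, and this is not Istio's actual policy engine.

```python
from urllib.parse import urlparse


def parse_identity(spiffe_id):
    """Split a SPIFFE-style ID into (trust_domain, namespace, service_account)."""
    parsed = urlparse(spiffe_id)
    parts = parsed.path.strip("/").split("/")
    well_formed = (
        parsed.scheme == "spiffe"
        and len(parts) == 4
        and parts[0] == "ns"
        and parts[2] == "sa"
    )
    if not well_formed:
        raise ValueError(f"not a workload identity: {spiffe_id!r}")
    return parsed.netloc, parts[1], parts[3]


def is_allowed(peer_id, allowed_principals):
    """Allow the request only if the peer's verified identity is listed.

    `allowed_principals` holds strings such as
    "cluster.local/ns/default/sa/frontend" (hypothetical example values).
    """
    trust_domain, namespace, account = parse_identity(peer_id)
    return f"{trust_domain}/ns/{namespace}/sa/{account}" in allowed_principals
```

The decision is based on who the caller is rather than on its IP address, which keeps the policy valid when pods are rescheduled.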

Observability


Istio generates the following types of telemetry data to provide observability across the service mesh:

  • Metrics: Istio generates a set of service metrics based on the four golden signals of monitoring (latency, traffic, errors, saturation). Istio also provides detailed metrics for the mesh control plane, along with a default set of mesh monitoring dashboards built on these metrics.
  • Distributed tracing: Istio generates distributed trace spans for each service, so operators can understand the call flows and service dependencies within the mesh.
  • Access logs: As traffic flows into services in the mesh, Istio can generate a complete record of each request, including source and destination metadata. This information lets operators audit service behavior down to the level of individual workload instances.

All of these features allow for more efficient setting, monitoring and enforcement of SLOs on services, detecting and fixing problems quickly and efficiently.

Inbound traffic, outbound traffic, and mesh traffic all pass through the sidecars. Taken together, the sidecars know how network traffic travels through the entire cluster, including latency, error codes, source and destination IPs and ports, and other information. This makes it possible to capture the call relationships between services in the cluster, the health of those calls, and the distribution of error codes, which gives a very clear view of the health of the whole cluster.
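To make the idea concrete, here is a small sketch of how per-request records reported by sidecars could be rolled up into a service call graph with health statistics. The `RequestRecord` fields and the `summarize` function are hypothetical and much simpler than real Istio telemetry; counting only 5xx responses as errors is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    source: str        # calling service
    destination: str   # called service
    status: int        # HTTP status code
    latency_ms: float  # request duration in milliseconds


def summarize(records):
    """Group records by (source, destination) edge and compute per-edge stats."""
    edges = defaultdict(list)
    for record in records:
        edges[(record.source, record.destination)].append(record)

    summary = {}
    for edge, group in edges.items():
        errors = sum(1 for record in group if record.status >= 500)
        summary[edge] = {
            "requests": len(group),
            "error_rate": errors / len(group),
            "avg_latency_ms": sum(r.latency_ms for r in group) / len(group),
        }
    return summary
```

Each key of the result is one edge of the call graph, so the same data answers both "who calls whom" and "how healthy is that call".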

Istio Architecture Evolution


Data plane

  • It consists of a set of intelligent proxies (Envoy) deployed as sidecars. These proxies mediate and control all network communication between microservices, along with Mixer. (Istio's data plane is itself a reverse proxy called Envoy.)

Control plane

  • Responsible for managing and configuring the proxies to route traffic. In the original architecture, the control plane also configures Mixer to enforce policies and collect telemetry data. (What is called Istio is usually the combination of a data plane and a control plane: Envoy is the data plane, and Istio itself is the control plane.)

Architecture evolution

  • Istio's own architecture has undergone a major adjustment, moving from microservices back to a monolith. In the early days, Istio split its functions into three major modules. The first is traffic management, handled by Pilot. The second is telemetry and policy, that is, status monitoring and policy control, both placed in Mixer. The last is security, packaged in Citadel. Together with some other pods, the whole installation ran more than a dozen pods. Each component had a single responsibility and few logical dependencies on the others, so on paper it was a clean architecture. In practice it ran into problems: non-disruptive upgrades were hard, and troubleshooting was difficult. Although the components were separated, their life cycles turned out to be effectively the same, so the split was somewhat over-designed. The conclusion was that the functions should be merged, which led to the major architectural change.

On the control plane, all capabilities (traffic management, security, and so on) are now consolidated into istiod, and the Mixer component is gone.

Inbound traffic first reaches a proxy and enters the cluster through it. In many cases this is a gateway, which is also an Envoy proxy. Once inside the cluster, traffic hops from proxy to proxy: each sidecar proxy intercepts the traffic and then hands it to its application. All application-to-application calls are mesh traffic that passes through the proxies.

Design goals


Maximize transparency

  • Istio automatically injects itself into all network paths between services, and operators and developers can benefit from it at little cost.
  • Istio uses sidecar proxies to capture traffic and, where possible, automatically programs the network layer to route traffic through these proxies without requiring any changes to deployed application code.
  • In Kubernetes, the proxies are injected into pods, and traffic is captured by programming iptables rules. Once the sidecar proxy is injected and the routing rules are modified, Istio can mediate all traffic.
  • All components and APIs must be designed with performance and scale in mind.

Incrementality

  • The greatest need is expected to be the ability to extend the policy system, integrate other sources of policy and control, and propagate signals about mesh behavior to other systems for analysis. The policy runtime supports a standard extension mechanism for plugging in other services. (For example, consider how endpoint state is pushed to Envoy incrementally. When a single pod changes, many Envoy configurations need to be refreshed. Pushing the full configuration every time would consume too much network bandwidth, so the difference has to be computed and only the increment sent.)
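The incremental push described in the example above boils down to diffing two snapshots and sending only what changed. The sketch below shows that idea with plain sets. It is not Istio's actual xDS implementation, and the function name, the snapshot layout, and the "ip:port" strings are hypothetical.

```python
def endpoint_delta(old, new):
    """Return the endpoints added and removed per service between two snapshots.

    `old` and `new` map a service name to a set of "ip:port" strings.
    Services whose endpoints did not change are omitted from the result,
    so an unchanged service costs nothing to push.
    """
    delta = {}
    for service in old.keys() | new.keys():
        before = old.get(service, set())
        after = new.get(service, set())
        added, removed = after - before, before - after
        if added or removed:
            delta[service] = {"added": added, "removed": removed}
    return delta
```

For instance, if one pod of a hypothetical `cart` service is replaced while `db` stays the same, the delta contains a single entry for `cart` with one added and one removed endpoint, instead of the full endpoint list of every service.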

Portability

  • Migrating an Istio-based service to a new environment should be a breeze, and it should also be possible to deploy a service to multiple environments simultaneously using Istio (e.g., for redundant deployments on multiple clouds).

Policy consistency

  • Applying policy to API calls between services gives a great deal of control over mesh behavior, but it is equally important to apply policy to resources that are not necessarily expressed at the API level.
  • Therefore, the policy system is maintained as a distinct service with its own API, rather than being built into the proxy/sidecar, which allows services to integrate directly with it as needed.

Origin blog.csdn.net/qq_34556414/article/details/131379358#comments_27383663