A brief analysis of microservice architecture

Preface

In the early days of the Internet, and in the early days of most companies, a centralized (monolithic) architecture was the norm: all services and data storage were deployed on one machine.

Such a machine has to meet demanding performance and hardware requirements, so minicomputers from vendors such as HP, Sun, and IBM were usually chosen, and these are relatively expensive. In addition, the machine is a single point of failure, so any outage has a large impact.

In order to reduce the impact of single points of failure, reduce the coupling of services and data storage, and improve development efficiency, the centralized architecture gradually evolved into a layered architecture (SOA).

Services are split along business and functional dimensions and then deployed to different machines. Different services and data stores generally have different machine configuration requirements.

The core of the layered architecture lies in "separation", so that each split service and data storage is independent and can be logically merged together for management.

Solving one problem introduces new ones. The layered architecture lengthens the request path, reduces the stability of the system, makes problems harder to locate, and increases operation and maintenance costs.

In order to reduce operation and maintenance costs, iterate on projects rapidly, and deliver them continuously, the layered architecture evolved into a microservice architecture.

Services are structured along the business dimension, the functional dimension, and the team organization dimension.

Microservice architecture is not a silver bullet, and its introduction also brings new problems. Each architecture has advantages and disadvantages. As long as it can solve the pain points in the company's current business, it is a good architecture.

What are microservices

Personally, I think microservices are a kind of personnel organization and a set of distributed architecture ideas.

Microservices architecture consists of a series of small autonomous services. Each service is independent and implements a single business capability. Here are some characteristics of microservices:

  • Loose coupling: Services are small, independent and loosely coupled.
  • Small, dedicated teams: Each service is a separate codebase (project) managed by a small development team.
  • Independent deployment: Services can be deployed independently so that updates to the service do not require rebuilding and redeploying the entire application.
  • Fault isolation: Each service persists its own data or external state, so a failure in one service or its storage does not spread directly to the others, and a small team is responsible for the entire service.
  • API communication: Services use APIs to communicate with each other, hiding the internal implementation details of each service.
  • Mixed technology stack: Services can be developed using different technology stacks and ultimately combined through API communication.
  • Fine-grained scaling: Services can be scaled independently, using different machine configurations.

Microservice components

API gateway

The API gateway is the main entry point for clients. Client requests do not call services directly.

They first call the API gateway, and then the gateway forwards the call to the corresponding backend service.

It prevents internal sensitive information from being exposed to external clients, adds a line of defense in front of the service layer, centralizes common functions in the gateway layer, and sets standards for communication between services.

An API gateway provides complete API hosting: identity authentication, permission management, request proxying and forwarding, service-level and API-level flow control (dynamic flow control is difficult to implement), and service-level and API-level circuit breaking. Together these protect the APIs and reduce the risk of opening them up. It also provides log collection (including APM logs), API-level monitoring, and A/B testing.
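To make the gateway's role concrete, here is a minimal sketch of the request path described above: authenticate, apply per-route flow control, then forward. It assumes a token-bucket limiter, and the names `TokenBucket`, `Gateway`, and the `handle` signature are hypothetical, chosen only for illustration. Backends are plain callables here rather than real network services.

```python
import time


class TokenBucket:
    """Token-bucket limiter: allows bursts up to `capacity`, refills at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """Hypothetical gateway: authenticate, rate-limit per route, then forward."""

    def __init__(self, routes, valid_tokens, limiters):
        self.routes = routes              # path prefix -> backend handler (callable)
        self.valid_tokens = valid_tokens  # set of accepted credentials
        self.limiters = limiters          # path prefix -> TokenBucket

    def handle(self, path, token, payload=None):
        if token not in self.valid_tokens:
            return 401, "unauthorized"
        for prefix, backend in self.routes.items():
            if path.startswith(prefix):
                limiter = self.limiters.get(prefix)
                if limiter is not None and not limiter.allow():
                    return 429, "rate limit exceeded"
                # Forward the call; the client never sees the backend address.
                return 200, backend(path, payload)
        return 404, "no route"
```

A real gateway would also handle timeouts, retries, and logging. The point of the sketch is only that these cross-cutting checks live in one place instead of in every service.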

Service registration and discovery

There are many independent, autonomous services in a microservice architecture, and microservices require a set of basic CI/CD components (Jenkins, K8s) to be built.

It would be much simpler if each service declared in advance which server it runs on; we could just hardcode the address.

However, it is almost impossible to fully utilize server resources by pre-defining the running server.

It is also difficult to automatically scale services and recover after failures.

To deploy services automatically to servers with relatively idle resources, and so make full use of the machines, we need a place to record each service's IP address, port, and service name.

This data is stored in one place, can be looked up by other services, and is shared with all services. This is what we call service registration and discovery (a highly available configuration center plus a service discovery proxy).
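The idea can be sketched with an in-memory registry. This is only an illustration under stated assumptions: the class name `ServiceRegistry`, the heartbeat-with-TTL scheme, and round-robin selection are choices made for this example, and a production registry would be a highly available, networked component rather than a Python dictionary.

```python
import time


class ServiceRegistry:
    """In-memory registry: instances register with a TTL and must heartbeat to stay listed."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._instances = {}  # service name -> {(ip, port): last heartbeat time}
        self._cursor = {}     # service name -> round-robin position

    def register(self, name, ip, port):
        self._instances.setdefault(name, {})[(ip, port)] = self.clock()

    heartbeat = register  # a heartbeat simply refreshes the timestamp

    def deregister(self, name, ip, port):
        self._instances.get(name, {}).pop((ip, port), None)

    def lookup(self, name):
        """Return live instances, sorted for a stable order."""
        now = self.clock()
        entries = self._instances.get(name, {})
        return sorted(addr for addr, seen in entries.items() if now - seen <= self.ttl)

    def pick(self, name):
        """Round-robin over live instances; raises LookupError if none are live."""
        live = self.lookup(name)
        if not live:
            raise LookupError(f"no live instance of {name!r}")
        i = self._cursor.get(name, 0) % len(live)
        self._cursor[name] = i + 1
        return live[i]
```

A service calls `register` when it starts and `heartbeat` periodically. Callers use `pick` instead of a hardcoded address, so instances that stop sending heartbeats drop out of rotation on their own.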

Call link tracking

Since a microservice architecture is also a layered architecture, it suffers from the same problems: request links are long, and locating a problem after it occurs is complicated. Tracking can be implemented intrusively, inside business methods, or non-intrusively, by the API gateway layer.

Suppose we record the link information (usually a directed acyclic graph) for each interface, from the moment the user initiates a request until the whole call ends normally or is interrupted by an error.

Each interface can then be drawn into a call map at the API level, the service level, or the level of the entire system. Given only the URL of an interface, we can find its upstream and downstream dependent services, and we can see from the call link information which step in the link has the problem.
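The following sketch shows how recorded link information can answer both questions: which services depend on which, and where in the link a failure sits. The `Span` record with `trace_id`, `span_id`, and `parent_id` is an assumed data model borrowed from common tracing practice, not something the article specifies, and the function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one user request
    span_id: str              # unique id of this call
    parent_id: Optional[str]  # span_id of the caller, None for the entry point
    service: str
    ok: bool = True


def downstream_map(spans):
    """Return service -> set of services it calls directly, derived from parent/child spans."""
    by_id = {s.span_id: s for s in spans}
    deps = defaultdict(set)
    for s in spans:
        parent = by_id.get(s.parent_id)
        if parent is not None and parent.service != s.service:
            deps[parent.service].add(s.service)
    return dict(deps)


def failing_path(spans):
    """Return the service path from the entry point to the deepest failed span, or [] if none failed."""
    by_id = {s.span_id: s for s in spans}
    # A failed span whose id is in this set has a failed child, so it is not the deepest failure.
    failed_parents = {s.parent_id for s in spans if not s.ok}
    for s in spans:
        if not s.ok and s.span_id not in failed_parents:
            path = []
            node = s
            while node is not None:
                path.append(node.service)
                node = by_id.get(node.parent_id)
            return path[::-1]
    return []
```

For a request that enters at a gateway, calls an order service, and fails inside a payment call, `failing_path` returns the chain of services down to payment, which is the step to inspect first.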

CI/CD

Continuous integration and continuous deployment are among the main advantages of a microservice architecture, and they greatly shorten the release cycle. Without a good CI/CD process, we cannot realize the flexibility, independent deployment, fine-grained scaling, and other characteristics promised by microservices.

We host the code repository in GitLab and use it to enforce the project's branching conventions.

A few branches with dedicated roles, such as release, develop, and master, can be set to trigger Git hooks, and each corresponds to a different deployment environment.

We use Jenkins for continuous integration builds, which can output a detailed report for each build. Jenkins sets a hook for successful builds and notifies the relevant CI system when a build succeeds. The CI system can then perform the next steps of deployment and automated testing. The Kubernetes cluster we use for CD can be built on a public cloud or a private cloud.
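As a small illustration of branches corresponding to environments, a hook handler might resolve the pushed ref as follows. The specific pairing of branch to environment below is an assumption made for the example, since the article does not say which branch maps to which environment, and `target_environment` is a hypothetical helper.

```python
# Assumed mapping for illustration only; adjust to the team's own convention.
BRANCH_ENVIRONMENTS = {
    "develop": "test",
    "release": "staging",
    "master": "production",
}


def target_environment(ref):
    """Map a Git ref such as 'refs/heads/develop' to an environment, or None if it does not deploy."""
    prefix = "refs/heads/"
    if not ref.startswith(prefix):
        return None  # tags and other refs do not trigger a deployment here
    return BRANCH_ENVIRONMENTS.get(ref[len(prefix):])
```

Feature branches return `None`, so only the fixed branches start a deployment.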

Log Center

Logs are divided according to the direction of the request data flow: gateway logs (access, error, info, etc.), business logs (logs of third-party requests, db access, etc.), machine logs (cpu, memory, processes, number of connections...), and so on. These are the points where anomalies are most likely to occur.

Logs need to be collected and aggregated together. For example, call link tracking can be considered an aggregation application based on business logs.

The storage used for aggregated logs can be Elasticsearch (ES), Alibaba Cloud's SLS, or HDFS.

Choose a storage system, or combine several, according to data volume, data type (structured, semi-structured, unstructured), query volume, and maintenance cost. Because log data is so large, generally only logs from a recent period are retained.

Use a computing engine (Spark, Flink) to filter the raw logs and permanently store the meaningful ones (errors, payment-related logs, logs of important services). Grafana is generally used to visualize the data; it supports multiple data sources and provides a variety of templates and rich charts.
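The filtering rule itself is simple, whatever engine runs it. The sketch below is a plain-Python stand-in for such a job, not Spark or Flink code. It assumes each log line is a JSON object with `level` and `service` fields, which is an assumption about the log format, and `IMPORTANT_SERVICES` is a hypothetical list.

```python
import json

# Hypothetical set of services whose logs are always kept.
IMPORTANT_SERVICES = {"payment"}


def keep_permanently(record):
    """Keep error logs and all logs from important services."""
    return record.get("level") == "ERROR" or record.get("service") in IMPORTANT_SERVICES


def filter_logs(lines):
    """Yield parsed records worth long-term storage; skip lines that are not valid JSON objects."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and keep_permanently(record):
            yield record
```

In a real pipeline the same predicate would run as a filter step in the streaming or batch job, with everything else left to expire along with the short-term store.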

Monitoring and alarms

We generally build monitoring and alarms on top of the log platform. In addition, we also monitor business data. This means modeling the data of important services and using year-on-year and period-over-period comparisons to analyze whether the data trend fluctuates beyond a threshold.
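A threshold check on business data can be as small as the sketch below. It assumes one metric value is compared against two baselines, the previous period and the same period last year. The function names and the 30% default threshold are illustrative assumptions, not values from the article.

```python
def relative_change(current, baseline):
    """Fractional change of current against baseline (0.25 means +25%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


def check_metric(current, previous_period, same_period_last_year, threshold=0.3):
    """Return alert messages for each comparison whose absolute change exceeds the threshold."""
    alerts = []
    baselines = (
        ("period-over-period", previous_period),
        ("year-on-year", same_period_last_year),
    )
    for label, baseline in baselines:
        change = relative_change(current, baseline)
        if abs(change) > threshold:
            alerts.append(f"{label} change {change:+.0%} exceeds {threshold:.0%}")
    return alerts
```

For example, an order count that halves against both baselines produces two alerts, while ordinary day-to-day variation within the threshold produces none.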

The objects of monitoring include gateway logs, business logs and business data, and machines. Reference values for the thresholds of monitoring indicators can be obtained through full-link stress testing.

In the early stage, raise as many alarms as possible, accepting some false alarms, then keep adjusting the thresholds so that the alarms converge to a manageable level. Monitoring tools include Zabbix, Open-Falcon, Prometheus, and third-party cloud services, which can be used in combination or individually.

The alarm can be generated using the monitoring tool itself, or it can be a self-developed alarm tool. Alarm methods include text messages, emails, phone calls, and internal IM.

Monitoring business data requires modeling the data, analyzing it, and drawing conclusions. From a data perspective, business data is of INFO type: unlike gateway logs and business logs, it does not directly contain ERROR or WARNING entries.

For analyzing and modeling business data in Python, we use statsmodels, pandas, numpy, and scipy for data analysis and fitting, and seaborn and matplotlib.pylab for data visualization.

FAQ

Q1: Do microservices need to implement all components?

Whether to implement them all depends on the company's investment in infrastructure, the ability of technical personnel, time costs, and operation and maintenance costs.

Our suggestion is to implement service registration and discovery first; the other components can be introduced gradually.

Q2: How to split the services in microservices?

Based on the existing architecture and the target architecture, sort services into levels and split along those levels, prohibit business services from calling each other directly (decouple them with a message queue), separate out common functions, and split stateful services (move the state into the storage layer).

The purpose of splitting is rapid iteration, continuous deployment, and rapid scaling up and down.

Q3: What are the design principles of microservices?

High cohesion and low coupling (single responsibility, lightweight communication method, contract between services)

High degree of autonomy (independent development, deployment, release, process isolation)

Business-centric (each service represents a specific business, responds quickly to business changes, and organizes small teams around the business)

Flexible design (fault tolerance, service degradation)

Logging and monitoring (log aggregation, monitoring and alarming)

Automation (continuous integration, continuous delivery)
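Of the principles above, flexible design (fault tolerance, service degradation) is the one most easily shown in code. The sketch below is a minimal circuit breaker, one common way to implement it. The class name, the consecutive-failure rule, and the default values are assumptions for illustration, not something the article prescribes.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; allows a trial call after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: degrade immediately without calling the service
            # Half-open: fall through and let one trial call decide.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers receive the degraded fallback result at once instead of waiting on a service that is already failing, which keeps one fault from dragging down its upstream services.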


Origin blog.csdn.net/m0_60259116/article/details/134816952