Microservice deployment: blue-green release, rolling release, grayscale release, canary release

Foreword

Every project iteration eventually has to go live. Going live means deploying or redeploying, deployment means change, and change means risk.

1. Blue/Green Deployment

① Definition
In a blue-green deployment, the old version keeps running while the new version is deployed alongside it and tested. Once the new version is confirmed to work, traffic is switched to it, and the old environment is then upgraded to the new version as well (or its resources are released).

②Features
Blue-green deployment requires no downtime and carries relatively little risk.

③Deployment process
  • Deploy version V1 of the application (initial state)
    All external request traffic goes to this version.

  • Deploy version V2 of the application
    The code of version V2 differs from version V1 (new features, bug fixes, etc.).

  • Switch traffic from version V1 to version V2

  • If version V2 tests as normal, delete the resources (such as instances) used by version V1 and officially run on version V2.
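
The steps above can be sketched as a toy router, to show that the cut-over and the rollback are the same single pointer change. This is a minimal in-memory illustration, not a real load balancer: the `BlueGreenRouter` class, its method names, and the environment names `blue` and `green` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BlueGreenRouter:
    """Toy router (hypothetical) that sends all traffic to exactly one environment."""

    environments: Dict[str, str] = field(default_factory=dict)  # name -> version
    live: str = ""  # name of the environment currently receiving traffic

    def deploy(self, name: str, version: str) -> None:
        # Deploying never touches the environment that is currently live.
        if name == self.live:
            raise ValueError("refusing to overwrite the live environment")
        self.environments[name] = version

    def switch_to(self, name: str) -> str:
        # The cut-over is one pointer change; the previous live name is returned.
        if name not in self.environments:
            raise KeyError(f"unknown environment: {name}")
        previous, self.live = self.live, name
        return previous

    def retire(self, name: str) -> None:
        # Releasing the idle environment also gives up the ability to roll back.
        if name == self.live:
            raise ValueError("cannot retire the live environment")
        del self.environments[name]

    def handle(self, request: str) -> str:
        return f"{self.environments[self.live]} handled {request}"


router = BlueGreenRouter()
router.deploy("blue", "V1")
router.switch_to("blue")          # initial state: all traffic on V1
router.deploy("green", "V2")      # V1 keeps serving while V2 is prepared
old = router.switch_to("green")   # full cut-over to V2
print(router.handle("GET /orders"))
router.switch_to(old)             # rollback is the same pointer change
print(router.handle("GET /orders"))
```

Only `retire` is irreversible in this sketch, which mirrors the point that rollback stays available for as long as the old version's resources are kept.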

④Summary

The process shows that the application stays online throughout the deployment. Nothing in the old version is modified while the new version is being launched, so the state of the old version is unaffected and the risk is very small. As long as the old version's resources have not been deleted, we can in theory roll back to the old version at any time.

⑤ Precautions for blue-green release

When traffic is switched between the blue and green environments, both in-flight transactions and new transactions must be handled properly. If the database backend cannot cope with this, it becomes a difficult problem.
Microservice applications and traditional-architecture applications may need to be switched at the same time. If the two are not well coordinated during the blue-green deployment, service may still be interrupted.
Plan in advance how database migrations and rollbacks will stay in sync with the application deployment.
Blue-green deployment requires infrastructure support.
When blue-green deployment is performed on non-isolated infrastructure (VMs, Docker, etc.), both the blue and the green environment are at risk of being damaged or destroyed.

⑥ Advantages and disadvantages

  • Advantages
    Upgrade switchover and rollback are very fast.
  • Disadvantages
    The switch is all-or-nothing: if version V2 has a problem, every user is affected directly. It also requires twice the machine resources.

⑦ Applicable occasions

Scenarios that can tolerate some impact on user experience.
Environments where spare machine resources exist or can be allocated on demand (AWS cloud, or a self-built container cloud).

2. Grayscale release

① Grayscale release definition

Grayscale release is a release method that transitions smoothly between black and white, that is, between the old version and the new one. A/B testing is one form of grayscale release: some users continue to use A while others start using B. If users have no objection to B, the scope is gradually expanded until all users have been migrated to B. Grayscale release keeps the overall system stable, because problems can be found and fixed during the initial grayscale stage, which limits their impact.

[Figure: grayscale release structure diagram]
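
One common way to keep some users on A while moving others to B, and then to expand the scope gradually, is to hash each user ID into a fixed bucket. The sketch below shows that idea only; the function name `in_grayscale`, the 100-bucket scheme, and the `user-N` IDs are assumptions for illustration, not something the original article prescribes.

```python
import hashlib


def in_grayscale(user_id: str, rollout_percent: int) -> bool:
    """Return True if this user should be served version B."""
    # Hashing the user ID keeps each user in the same bucket on every request.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < rollout_percent


users = [f"user-{i}" for i in range(1000)]
for percent in (0, 5, 50, 100):
    on_b = sum(in_grayscale(u, percent) for u in users)
    print(f"{percent:3d}% rollout -> {on_b} of {len(users)} users on B")
```

Because a user's bucket never changes, raising the percentage only adds users to B; nobody who already sees B is moved back to A unless the percentage is lowered.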

②A/B Testing

A/B testing is a method for measuring how well an application's features perform, for example in usability, popularity, and visibility. It is usually applied to the front end of an application, but it needs back-end support.

A/B testing and blue-green release differ in purpose. A/B testing aims to reach representative, trustworthy experimental conclusions through sound experimental design, representative samples, traffic segmentation, and small-traffic tests. Blue-green release aims to ship new versions of an application safely and stably, and to roll back when necessary.
Blue-green release and canary release are release strategies: their goal is to keep the newly launched system stable, and their focus is on the bugs and hidden dangers of the new system.
A/B testing is an effect test. Several versions of the service run at the same time, and all of them have been tested thoroughly and already meet the standard for going online.

③Canary Deployment

What we usually call canary deployment is also a form of grayscale release. While the original version remains available, a new version of the application is deployed on a "canary" server to test its functionality and performance, so that problems can be found and fixed as early as possible while the overall system stays stable.

Canaries in the mines: around the turn of the 20th century, British miners began carrying canaries underground because the birds are far more sensitive than humans to toxic mine gases such as carbon monoxide. Even a very small amount of gas in the air makes a canary stop singing, and once the concentration passes a certain limit the canary is poisoned long before humans notice anything. With the relatively simple mining equipment of the time, workers took a canary down the mine on every shift as a gas detector, so that they could evacuate in case of danger.

Grayscale release/canary release consists of the following steps:

  • Prepare the artifacts for each deployment stage: build artifacts, test scripts, configuration files, and deployment manifest files.
  • Remove the "canary" server from the load balancer list.
  • Upgrade the "canary" application (drain its existing traffic, then deploy the new version).
  • Run automated tests against the application.
  • Add the "canary" server back to the load balancer list (checking connectivity and health).
  • If the "canary" passes the test under live traffic, upgrade the remaining servers; otherwise roll back.
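
The steps above can be sketched as follows. This is a minimal in-memory model, assuming a hypothetical `LoadBalancer` class and two caller-supplied checks (`automated_tests_pass` and `online_check_passes`); a real release would call the load balancer's and the deployment tool's own APIs instead.

```python
from typing import Callable, Dict, List


class LoadBalancer:
    """Hypothetical in-memory stand-in for a real load balancer's server list."""

    def __init__(self, servers: List[str]) -> None:
        self.pool = list(servers)

    def remove(self, server: str) -> None:
        if server in self.pool:
            self.pool.remove(server)

    def add(self, server: str) -> None:
        if server not in self.pool:
            self.pool.append(server)


def canary_release(
    lb: LoadBalancer,
    versions: Dict[str, str],
    canary: str,
    new_version: str,
    automated_tests_pass: Callable[[str], bool],
    online_check_passes: Callable[[str], bool],
) -> bool:
    """Upgrade the canary first; upgrade the rest only if both checks pass."""
    old_version = versions[canary]

    lb.remove(canary)                 # take the canary out of rotation
    versions[canary] = new_version    # upgrade it while it receives no traffic
    if not automated_tests_pass(canary):
        versions[canary] = old_version
        lb.add(canary)
        return False

    lb.add(canary)                    # back in rotation, now serving real traffic
    if not online_check_passes(canary):
        lb.remove(canary)             # roll the canary back
        versions[canary] = old_version
        lb.add(canary)
        return False

    for server in list(versions):     # canary is healthy: upgrade everyone else
        if server != canary:
            lb.remove(server)
            versions[server] = new_version
            lb.add(server)
    return True


versions = {"app-1": "V1", "app-2": "V1", "app-3": "V1"}
lb = LoadBalancer(list(versions))
ok = canary_release(lb, versions, "app-1", "V2", lambda s: True, lambda s: True)
print(ok, versions)
```

If either check fails, only the canary was ever on the new version, so the rollback touches a single server.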

In addition, grayscale release can use routing weights, adjusting them dynamically to shift traffic between the old and new versions while the new one is being verified. For example, weighted routing in Istio can be used to verify and release new versions alongside old ones.
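
To show what a routing weight does, here is a small simulation of weighted version selection. It is not Istio configuration (in Istio the weights are declared in routing resources such as a VirtualService, not in application code); the `pick_version` helper and the version names `v1` and `v2` are assumptions for the example.

```python
import random
from collections import Counter
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Choose a version for one request, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(42)  # fixed seed so repeated runs give the same split
for weights in ({"v1": 90, "v2": 10}, {"v1": 50, "v2": 50}, {"v1": 0, "v2": 100}):
    hits = Counter(pick_version(weights, rng) for _ in range(10_000))
    print(weights, "->", dict(hits))
```

Moving the weights from 90/10 through 50/50 to 0/100 is the "dynamic adjustment" described above: the new version takes a growing share of requests until it serves all of them.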

④ Advantages and disadvantages

  • Advantages
    The impact on user experience is small, and problems in the grayscale release process only affect a small number of users.
  • Disadvantages
    The release is not automated enough, and service interruptions can occur during the release.

3. Rolling Update Deployment

Rolling release is a further optimization of canary release: a highly automated release method that gives users a relatively smooth experience. It is currently the mainstream release method in mature technical organizations.

① Definition

Rolling release: one or more servers at a time are taken out of service, updated, and put back into use. This is repeated until every instance in the cluster runs the new version.

②Features

Compared with blue-green deployment, this method is more resource-efficient: it does not need two clusters or twice the number of instances. The upgrade can be done in parts, for example by taking only 20% of the cluster out for upgrade at a time.

③Deployment process

  • A rolling release generally starts with 1 server, or a small percentage such as 2% of the servers, mainly for traffic verification, similar to canary testing.
  • Rolling release requires more complex release tools and an intelligent LB that supports smooth version replacement and can pull traffic in and out.
  • For each batch, traffic to the old version V1 is first removed from the LB, the old version is cleared, the new version V2 is deployed, and the LB then routes traffic to the new version. This keeps the impact on user experience as small as possible.
  • A rolling release generally consists of several batches, and the size of each batch is usually configurable (for example, through a release template): say 1 server (the canary) in the first batch, 10% in the second, 50% in the third, and 100% in the fourth. There is an observation interval between batches, during which manual verification or monitoring feedback confirms that nothing is wrong before the next batch goes out. The overall rolling release is therefore relatively slow. The canary's observation time is generally longer than that of later batches, for example 10 minutes for the canary and 2 minutes for each subsequent interval.
  • Rollback is the reverse of release: traffic to the new version is removed from the LB, the new version is cleared, the old version is redeployed, and the LB routes traffic to the old version again. Like the release itself, rollback is generally slow.
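
The batching described above can be sketched as below. The sketch assumes that the percentages (10%, 50%, 100%) are cumulative shares of the cluster, which the text does not state explicitly, and it collapses the "remove from LB, replace, re-add" steps into a single assignment. The names `plan_batches`, `rolling_release`, and `healthy` are hypothetical.

```python
import math
from typing import Callable, Dict, List


def plan_batches(servers: List[str], cumulative_percents: List[int]) -> List[List[str]]:
    """One canary first, then batches that reach each cumulative percentage."""
    if not servers:
        return []
    batches = [servers[:1]]  # batch 1: a single canary server
    done = 1
    for percent in cumulative_percents:
        target = max(done, math.ceil(len(servers) * percent / 100))
        if target > done:
            batches.append(servers[done:target])
            done = target
    return batches


def rolling_release(
    versions: Dict[str, str],
    batches: List[List[str]],
    new_version: str,
    healthy: Callable[[str], bool],
) -> bool:
    """Upgrade batch by batch; on an unhealthy batch, restore every updated server."""
    previous = dict(versions)
    updated: List[str] = []
    for batch in batches:
        for server in batch:
            versions[server] = new_version  # stands in for drain, replace, re-add
            updated.append(server)
        # Observation step between batches (manual check or monitoring in practice).
        if not all(healthy(server) for server in batch):
            for server in updated:
                versions[server] = previous[server]
            return False
    return True


servers = [f"app-{i}" for i in range(20)]
batches = plan_batches(servers, [10, 50, 100])
print([len(batch) for batch in batches])

versions = {server: "V1" for server in servers}
print(rolling_release(versions, batches, "V2", lambda server: True))
```

A real tool would also wait for the observation interval between batches; the `healthy` callback marks where that check would sit.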

④ Advantages and disadvantages

  • Advantages
    The impact on users is small and the transition is smooth.
  • Disadvantages
    -- Release and rollback are slow.
    -- The release tooling is more complex, and the LB must be able to remove and restore traffic smoothly.

4. Feature Flag (Function Switch) Release

A feature flag release uses a function switch (Feature Flag/Toggle/Switch) in the code to control the release logic. It generally does not need complex release tools or a smart LB, which makes it a relatively low-cost and simple release method. It also fits modern DevOps practice: developers can customize and carry out releases themselves. The principle of the function switch is shown in the figure below:

① Deployment process

  • A feature flag release needs a supporting service such as a configuration center or switch center. Ctrip's Apollo configuration center and the open source FF4J both support switch-based release, and there are dedicated feature flag SaaS services such as LaunchDarkly. Through the configuration center, operations or R&D staff can change the value of a switch dynamically at runtime. Feature flag release is only one use of a configuration center, which can also support many other dynamic configuration scenarios.
  • The feature flag service generally provides a client-side SDK that developers can integrate easily. At runtime the client SDK synchronizes the latest switch values, which can be implemented by push, by pull, or by a combination of the two.
  • The new function (V2 new feature) and the old function (V1 old feature) live in the same code base, with the new function hidden behind the switch. If the switch is off, the old code path runs; if it is on, the new code path runs. Technically, this can be understood as simple if/else logic.
  • When the application first goes live, the switch is off. Operations or R&D staff then turn on the new function through the switch center. If traffic verification shows no problems with the new function, the release is complete; if there is a problem, the old logic can be restored through the switch center at any time.
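
The if/else principle described above can be sketched in a few lines. The `FlagStore` class is a hypothetical in-memory stand-in for a configuration or switch center (it is not the Apollo, FF4J, or LaunchDarkly API), and the flag name `new-checkout` is likewise made up for the example.

```python
from typing import Dict


class FlagStore:
    """Hypothetical in-memory stand-in for a configuration/switch center."""

    def __init__(self) -> None:
        self._flags: Dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str, default: bool = False) -> bool:
        # Unknown flags fall back to the default, so new code stays hidden.
        return self._flags.get(name, default)


def checkout_v1(order_id: str) -> str:
    return f"order {order_id}: old checkout flow"


def checkout_v2(order_id: str) -> str:
    return f"order {order_id}: new checkout flow"


def checkout(order_id: str, flags: FlagStore) -> str:
    # Old and new logic live side by side; the flag decides at runtime.
    if flags.is_enabled("new-checkout"):
        return checkout_v2(order_id)
    return checkout_v1(order_id)


flags = FlagStore()
print(checkout("1001", flags))      # flag off after launch: old logic
flags.set("new-checkout", True)     # operator opens the switch
print(checkout("1001", flags))      # new logic, no redeployment
flags.set("new-checkout", False)    # rollback is flipping the switch back
print(checkout("1001", flags))
```

Defaulting unknown flags to off is a deliberate choice in this sketch: a newly deployed build behaves like the old version until someone explicitly opens the switch.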

②Advantages and disadvantages

  • Advantages
    Upgrade switchover and rollback are very fast.
    Compared with complex publishing tools, the implementation is relatively simple and the cost is relatively low.
    R&D can flexibly customize release logic and support DevOps self-release.
  • Disadvantages
    The switch is all-or-nothing: if version V2 has a problem, every user is affected directly.
    The approach is intrusive to the code: the switch logic makes the code more complicated, and the old version's logic has to be cleaned up regularly, which increases maintenance cost.

Reference: https://cloud.tencent.com/developer/article/1449209

Origin blog.csdn.net/lovedingd/article/details/130561877