What is continuous integration (CI)/continuous deployment (CD)?


When talking about software development, the terms Continuous Integration (CI) and Continuous Delivery (CD) come up often. But what do they really mean? In this article, I will explain the ideas behind these and related terms, such as Continuous Testing and Continuous Deployment.

Overview

The assembly line in a factory produces consumer goods from raw materials in a fast, automated, repeatable way. Similarly, a software delivery pipeline produces releases from source code in a fast, automated, repeatable way. The overall design for how this is done is called "continuous delivery" (CD). The process that kicks off the assembly line is called "continuous integration" (CI). The process that ensures quality is called "continuous testing", and the process that makes the final product available to users is called "continuous deployment". The experts who make all of this work simply, smoothly, and efficiently are known as DevOps practitioners.

What does "continuous" mean?

"Continuous" is used to describe practices that follow many of the different processes I mention here. This does not mean "always running", but "ready to run". In the field of software development, it also includes several core concepts/best practices. these are:

**Frequent releases:** The goal behind continuous practices is to be able to deliver quality software at frequent intervals. The frequency here is variable and can be defined by the development team or the company. For some products, delivering once a quarter, month, week, or day may be frequent enough. For others, multiple deliveries a day may be required. "Continuous" also has an "occasional, on-demand" aspect. The end goal is the same: deliver software updates of high quality to end users in a repeatable, reliable process. Often this can be done with little or no interaction from, or even knowledge of, the users (think device updates).

**Automated processes:** The key to enabling this kind of frequency is having automated processes that handle nearly all aspects of software production. This includes building, testing, analysis, versioning, and, in some cases, deployment.

**Repeatable:** If the automated processes we use always behave the same way given the same inputs, the process is repeatable. In other words, if we take some historical version of the code as input, we should get the same set of deliverables as output. This also assumes we use the same versions of external dependencies (that is, dependencies that this version of the code uses but that we do not build ourselves). Ideally, this also means that the processes in the pipeline themselves can be versioned and re-created (see the DevOps discussion later).

**Fast iteration:** "Fast" is a relative term here, but regardless of the frequency of software updates/releases, continuous processes are expected to turn source code into deliverables efficiently. Automation takes care of most of the work, but an automated process can still be slow. For example, for a product that needs release-candidate updates multiple times a day, a round of integration tests that takes more than half a day would be too slow.

What is a "continuous delivery pipeline"?

The multiple different tasks and jobs that convert source code into a releasable product are usually connected in series to form a software "pipeline". When one automated process completes successfully, the next process in the pipeline starts. These pipelines go by many different names, such as continuous delivery pipeline, deployment pipeline, and software development pipeline. Generally, an orchestrating application manages the definition, running, monitoring, and reporting of the different parts of the pipeline as it executes.

How does the continuous delivery pipeline work?

The actual implementation of a software delivery pipeline can vary greatly. There are many applications that can be used in a pipeline for source code tracking, building, testing, metrics gathering, version management, and other aspects. But the overall workflow is usually the same. A single orchestration/workflow application manages the entire pipeline, and each process runs as an independent job or is managed in stages by that application. Typically, these independent jobs are defined in a syntax and structure that the orchestrating application understands and can manage as a workflow.

These jobs are used for one or more functions (building, testing, deploying, etc.). Each job may use a different technology, or multiple technologies. The key is that the jobs are automated, efficient, and repeatable. If a job succeeds, the workflow manager triggers the next job in the pipeline. If a job fails, the workflow manager alerts developers, testers, and others so they can correct the problem as quickly as possible. Because the process is automated, errors can be found much faster than by running a set of processes manually. This rapid detection of errors is called "fail fast" and is just as valuable for reaching the end of the pipeline quickly.
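To make the flow concrete, here is a minimal sketch (not tied to any particular CI/CD tool) of a workflow manager running jobs in sequence and failing fast; the stage names and the use of Java are illustrative assumptions only.

```java
import java.util.List;
import java.util.function.Supplier;

public class PipelineSketch {

    // Each stage has a name and a job that reports success (true) or failure (false).
    record Stage(String name, Supplier<Boolean> job) {}

    public static void main(String[] args) {
        // Hypothetical stages; a real pipeline would invoke actual build/test/deploy tools here.
        List<Stage> pipeline = List.of(
                new Stage("build",        () -> run("compile sources")),
                new Stage("unit-test",    () -> run("run unit tests")),
                new Stage("package",      () -> run("package artifact")),
                new Stage("deploy-stage", () -> run("deploy to staging")));

        for (Stage stage : pipeline) {
            System.out.println("Starting stage: " + stage.name());
            if (!stage.job().get()) {
                // Fail fast: stop the pipeline and alert the team immediately.
                System.err.println("Stage '" + stage.name() + "' failed - notifying the team");
                return;
            }
        }
        System.out.println("Pipeline finished: the artifact is releasable");
    }

    // Stand-in for calling a real tool; always "succeeds" in this sketch.
    private static boolean run(String description) {
        System.out.println("  ... " + description);
        return true;
    }
}
```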

What does "fail fast" mean?

One of the pipeline's jobs is to process changes quickly. Another is to monitor the different tasks/jobs that create the release. Since code that fails to compile or fails a test can hold up the pipeline, it is important that users be notified of such situations quickly. Failing fast refers to discovering problems as early as possible in the pipeline process and notifying users quickly, so that the problem can be corrected in time and the code resubmitted to let the pipeline run again. Often, the pipeline process can check the history to determine who made the change and notify that person and their team.

Should all continuous delivery pipelines be automated?

Nearly all parts of the pipeline should be automated. For a few parts, however, human intervention/interaction may make sense. One example is user-acceptance testing (letting end users try out the software and make sure it does what they want/expect). Another case is deployment to production environments, where organizations want more human control. And of course, manual intervention is required if the code is incorrect or broken.

With this background on what "continuous" means, let's look at the different types of continuous processes and what they mean in the context of a software pipeline.

What is "continuous integration"?

Continuous integration (CI) is the process of automatically detecting, pulling, building, and (in most cases) unit testing source code as it changes. Continuous integration is the link that starts the pipeline (although certain pre-validations, often called "pre-flight checks", are sometimes performed before continuous integration).

The goal of continuous integration is to quickly ensure that the new changes submitted by developers are good and suitable for further use in the code base.

How does continuous integration work?

The basic idea of continuous integration is to have an automated process monitor one or more source code repositories for changes. When changes are pushed to a repository, the process detects them, pulls down a copy, builds it, and runs any associated unit tests.

How does continuous integration monitor changes?

These days, the monitoring program is usually an application like Jenkins, which also orchestrates all (or most) of the processes running in the pipeline; monitoring for changes is just one of its functions. The monitoring program can watch for changes in several different ways. These include:

Polling: The monitoring program repeatedly asks the source management system, "Is there anything new in the repositories I'm interested in?" When the source management system has new changes, the monitoring program "wakes up", pulls the new code, and builds/tests it.

Periodic: The monitoring program is configured to kick off a build periodically, regardless of whether the source code has changed. Ideally, if nothing has changed, nothing new is built, so this adds no extra cost.

Push: This is the inverse of the monitoring program checking the source management system. In this case, the source management system is configured to "push" a notification to the monitoring program when a change is committed to the repository. Most commonly, this is done in the form of a webhook: a hook program that sends a notification to the monitoring program over the internet when new code is pushed. For this to work, the monitoring program must have an open port that can receive the webhook information over the network, as in the sketch below.
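As an illustration of the push model, here is a minimal sketch of a listener that accepts webhook notifications, using only the JDK's built-in HTTP server. The port 8080 and the /webhook path are arbitrary assumptions; a real CI server such as Jenkins provides this capability out of the box.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class WebhookListener {
    public static void main(String[] args) throws Exception {
        // Open a port and wait for the source management system to POST a notification.
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/webhook", exchange -> {
            try (InputStream body = exchange.getRequestBody()) {
                String payload = new String(body.readAllBytes(), StandardCharsets.UTF_8);
                System.out.println("Change notification received: " + payload);
                // This is where a real CI server would check out the new code,
                // build it, and run the unit tests.
            }
            exchange.sendResponseHeaders(200, -1); // acknowledge the webhook, no response body
            exchange.close();
        });
        server.start();
        System.out.println("Waiting for push notifications on port 8080 ...");
    }
}
```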

What is "pre-inspection" (also known as "pre-inspection")?

Additional validations can be performed before code is introduced into the repository and triggers continuous integration. These follow best practices such as test builds and code reviews. They are usually built into the development process before the code is introduced into the pipeline, but some pipelines may also include them as part of their monitoring process or workflow.

For example, a tool called Gerrit allows formal code review, validation, and test builds after a developer pushes code but before it is allowed into the (remote Git) repository. Gerrit sits between the developer's workspace and the remote Git repository. It "receives" pushes from developers and can perform pass/fail validations to ensure the changes pass checks before being allowed into the repository. This can include detecting the proposed change and kicking off a test build (a form of CI). It also allows developers to do formal code reviews at that point. This provides an additional level of confidence that nothing will be broken when the change is merged into the code base.

What is "unit testing"?

Unit tests (also known as "commit tests") are small, focused tests written by developers to ensure that new code works in isolation. "In isolation" here means not relying on or calling other code that isn't directly accessible, and not relying on external data sources or other modules. If such dependencies are needed for the code to run, those resources can be represented by mocks. Mocking refers to using code stubs that look like the resources and can return values, but do not implement any functionality.

In most organizations, developers are responsible for creating unit tests to prove that their code is correct. In fact, one model, known as test-driven development (TDD), requires that unit tests be designed first, as a basis for clearly verifying what the code should do. Because such tests are run frequently and with every code change, they must also execute quickly.

As it relates to the continuous integration workflow, a developer writes or updates code in their local working environment and uses unit tests to ensure that the newly developed function or method is correct. Typically, these tests take the form of assertions that a given set of inputs to a function or method produces a given set of outputs. They often also test that error conditions are correctly flagged and handled. There are many useful unit testing frameworks, such as JUnit for Java development.
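Here is a minimal JUnit 5 sketch of what such tests can look like. The class under test, its tests, and the stubbed tax-rate source are hypothetical names invented for this example; they live in one file only to keep it self-contained.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

class PriceCalculatorTest {

    // A tiny mock (stub) standing in for an external tax-rate service,
    // so the unit test does not depend on a real data source.
    interface TaxRateSource { double rateFor(String region); }

    // The (hypothetical) code under test.
    static class PriceCalculator {
        private final TaxRateSource taxRates;
        PriceCalculator(TaxRateSource taxRates) { this.taxRates = taxRates; }

        double grossPrice(double netPrice, String region) {
            if (netPrice < 0) throw new IllegalArgumentException("negative price");
            return netPrice * (1 + taxRates.rateFor(region));
        }
    }

    @Test
    void addsTaxToNetPrice() {
        PriceCalculator calc = new PriceCalculator(region -> 0.20); // stubbed rate
        assertEquals(120.0, calc.grossPrice(100.0, "EU"), 0.0001);  // given inputs -> expected output
    }

    @Test
    void rejectsNegativePrices() {
        PriceCalculator calc = new PriceCalculator(region -> 0.20);
        assertThrows(IllegalArgumentException.class, () -> calc.grossPrice(-1.0, "EU"));
    }
}
```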

What is "continuous testing"?

Continuous testing refers to the practice of running automated tests of expanding scope as code moves through the continuous delivery pipeline. Unit testing is usually integrated with the build process as part of the continuous integration stage and focuses on testing code in isolation from other code that interacts with it.

Beyond that, there can and should be various other forms of testing. These can include:

Integration testing verifies that groups of components and services all work together properly.
Functional testing verifies that the results of executing functions in the product are as expected.
Acceptance testing verifies certain characteristics of the product against acceptable criteria, such as performance, scalability, robustness under stress, and capacity.

Not all of these may be present in a given automated pipeline, and the boundaries between some of the different types of testing are not always clear. But the goal of continuous testing in a delivery pipeline is always the same: to prove, through successive levels of testing, that the code is of a quality suitable for the release in progress. Building on the continuous-integration principle of being fast, a secondary goal is to find problems quickly and alert the development team. This is often referred to as failing fast.

In addition to testing, what other types of verification can be performed on the code in the pipeline?

Besides whether tests pass or fail, some applications can also tell us how many lines of source code are exercised (covered) by the test cases. This is an example of a metric that measures how much of the code is tested. The metric is called code coverage and can be computed by tools such as JaCoCo for Java code.

There are many other kinds of metric counts, such as the number of lines of code, complexity measures, and comparative analyses of code structure. Tools such as SonarQube can examine source code and compute these metrics. Beyond that, users can set thresholds for the ranges of values they will accept as "passing" for these metrics. Checks against those thresholds can then be set up in the pipeline, and if the results are not within the acceptable ranges, the process is terminated. Applications such as SonarQube are highly configurable and can be set to check only what a team is interested in.
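The sketch below shows the idea of such a quality gate: a small check run by the pipeline that exits with a non-zero status when a measured value falls below a team-defined threshold. In practice the coverage number would come from a report produced by a tool such as JaCoCo or SonarQube; passing it on the command line here is an assumption made purely for illustration.

```java
public class CoverageGate {
    public static void main(String[] args) {
        double measuredCoverage = Double.parseDouble(args[0]); // e.g. 0.78, taken from a coverage report
        double requiredCoverage = 0.80;                        // team-defined "passing" threshold

        if (measuredCoverage < requiredCoverage) {
            System.err.printf("Coverage %.0f%% is below the required %.0f%% - failing this pipeline run%n",
                    measuredCoverage * 100, requiredCoverage * 100);
            System.exit(1); // a non-zero exit code stops the pipeline (fail fast)
        }
        System.out.println("Coverage gate passed");
    }
}
```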

What is "continuous delivery"?

Continuous delivery (CD) generally refers to the entire chain of processes (the pipeline) that automatically detects source code changes and runs them through building, testing, packaging, and related operations to produce a deployable release, essentially without human intervention.

The goals of continuous delivery in the software development process are automation, efficiency, reliability, repeatability and quality assurance (through continuous testing).

Continuous delivery includes continuous integration (automatically detecting source code changes, running the build process, and running unit tests to validate the changes), continuous testing (running various kinds of tests on the code to ensure its quality), and, optionally, continuous deployment (automatically making the releases produced by the pipeline available to users).

How to identify/track multiple versions in the pipeline?

Versioning is a key concept for continuous delivery and pipelines. Continuous means being able to frequently integrate new code and make updated releases available. But that does not mean everyone always wants the "latest and greatest". This is especially true for internal teams that want to develop or test against a known, stable version. So it is important that the pipeline creates versioned objects that can be easily stored and accessed.

Objects created from source code in a pipeline can often be referred to as artifacts. Artifacts should have versions applied to them when they are built. The recommended strategy for assigning version numbers to artifacts is called semantic versioning. (This also applies to versions of dependent artifacts imported from external sources.)

A semantic version number has three parts: major, minor, and patch. (For example, 1.4.3 reflects major version 1, minor version 4, and patch version 3.) The idea is that a change in one of these parts represents the level of update in the artifact. The major version is incremented only for incompatible API changes. The minor version is incremented when functionality is added in a backward-compatible manner. The patch version is incremented for backward-compatible bug fixes. These are recommended guidelines, but teams are free to vary the approach, as long as they do so consistently and understandably across the organization. For example, a number that is incremented for each build done for a release could be placed in the patch field.
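Here is a minimal sketch of semantic versioning in code: parsing a "major.minor.patch" string and bumping the part that matches the kind of change. The class and method names are hypothetical and exist only for this example.

```java
public record SemVer(int major, int minor, int patch) {

    // Parse a version string such as "1.4.3".
    static SemVer parse(String version) {
        String[] parts = version.split("\\.");
        return new SemVer(Integer.parseInt(parts[0]),
                          Integer.parseInt(parts[1]),
                          Integer.parseInt(parts[2]));
    }

    SemVer nextMajor() { return new SemVer(major + 1, 0, 0); }         // incompatible API change
    SemVer nextMinor() { return new SemVer(major, minor + 1, 0); }     // backward-compatible feature
    SemVer nextPatch() { return new SemVer(major, minor, patch + 1); } // backward-compatible bug fix

    @Override public String toString() { return major + "." + minor + "." + patch; }

    public static void main(String[] args) {
        SemVer current = parse("1.4.3");
        System.out.println(current.nextPatch()); // 1.4.4
        System.out.println(current.nextMinor()); // 1.5.0
        System.out.println(current.nextMajor()); // 2.0.0
    }
}
```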

How to "distribute" artifacts?

Teams can assign a promotion level to an artifact to indicate that it is suitable for testing, production, or other environments or uses. This can be done in many ways: with applications such as Jenkins or Artifactory, or with something as simple as a naming scheme that appends a label to the end of the version string. For example, -snapshot can indicate that the artifact was built from the latest version (snapshot) of the code. Various promotion strategies or tools can then be used to "promote" the artifact to other levels, such as -milestone or -production, as markers of the artifact's stability and readiness for release.

How to store and access multiple artifact versions?

Versioned artifacts built from source code can be stored by an application that manages an artifact repository. An artifact repository is like a version control tool for built artifacts. Applications like Artifactory or Nexus can accept versioned artifacts, store and track them, and provide ways to retrieve them.

Users of the pipeline can then specify which versions they want to use and have the pipeline pull in those versions.

What is "continuous deployment"?

Continuous deployment (CD) refers to the idea of automatically providing release versions from the continuous delivery pipeline to end users. Depending on how users install the software, this could mean automatic deployment in a cloud environment, an app upgrade (such as an application on your phone), an updated website, or simply an updated list of available versions.

An important point here is that just because continuous deployment is possible does not mean that every set of deliverables from the pipeline is always deployed. It actually means that each set of deliverables through the pipeline has been proven to be "deployable." This is largely done by successive levels of continuous testing (see the continuous testing section in this article).

Whether or not the releases produced by the pipeline are actually deployed can be gated by a manual decision, or by various methods of "trying out" a release before deploying it fully.

Are there any ways to test the deployment before it is fully deployed to all users?

Since having to roll back/undo a deployment to all users can be a costly situation (both technically and in users' perception), numerous techniques have emerged to allow new deployments to be "tried out" and easily "undone" if problems are found. These include:

Blue/green testing/deployment

In this approach to deploying software, two identical hosting environments are maintained: one "blue" and one "green". (The colors themselves are not significant; they serve only as identifiers.) At any point, one of them is the "production" deployment and the other is the "pre-release" (staging) deployment.

In front of these instances sits a routing system that serves as the customer "gateway" to the product or application. By pointing the router at the blue or the green instance, customer traffic can be directed to the desired deployment. In this way, switching over to a different deployment instance (blue or green) is fast, easy, and transparent to users.

When a new release is ready for testing, it can be deployed to the non-production environment. After it has been tested and approved, the router settings can be changed to direct incoming production traffic to it (so it becomes the new production site). The instance that was previously in production is now available for staging the next release candidate.

In the same way, if a problem is found with the latest deployment and the previous production instance is still available, a simple change can redirect customer traffic back to the previous instance, effectively taking the problematic instance "offline" and rolling back to the previous version. The problematic new instance can then be fixed while it is out of service.
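A minimal sketch of the blue/green switch follows. The environment names, internal URLs, and the simple in-memory router are illustrative assumptions; in practice this role is played by a load balancer or DNS/router configuration.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class BlueGreenRouter {
    // Two identical hosting environments (URLs are made up for the sketch).
    private final Map<String, String> environments = Map.of(
            "blue",  "http://blue.internal:8080",
            "green", "http://green.internal:8080");

    // Which environment currently serves production traffic.
    private final AtomicReference<String> live = new AtomicReference<>("blue");

    String routeRequest() {
        // All customer traffic goes to whichever environment is currently live.
        return environments.get(live.get());
    }

    void promote(String colour) {
        // After the new version has been tested on the idle environment,
        // flipping this reference makes it the production site.
        live.set(colour);
    }

    public static void main(String[] args) {
        BlueGreenRouter router = new BlueGreenRouter();
        System.out.println("Serving from: " + router.routeRequest()); // blue
        router.promote("green");                                      // cut over to the new release
        System.out.println("Serving from: " + router.routeRequest()); // green
        router.promote("blue");                                       // roll back if problems appear
        System.out.println("Serving from: " + router.routeRequest()); // blue again
    }
}
```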

Canary testing/deployment

In some cases, switching over the entire deployment via a blue/green release may not be feasible or desirable. Another approach is the canary test/deployment. In this model, a portion of customer traffic is redirected to the new deployment. For example, a new version of a search service could be deployed alongside the current production version, and 10% of search queries could be routed to the new version to test it under production conditions.

If the new version shows no problems serving that traffic, more traffic can gradually be diverted to it. If no problems arise, then over time the new version can take on more and more of the load until 100% of the traffic is routed to it. This effectively retires the previous version of the service and puts the new version into effect for all customers.
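The sketch below illustrates the canary idea: a configurable share of requests goes to the candidate version while the rest stays on the current production version. The service names and the random split are assumptions made for this example; real systems usually implement this in a load balancer or service mesh.

```java
import java.util.concurrent.ThreadLocalRandom;

public class CanaryRouter {
    private volatile double canaryShare = 0.10; // start by sending 10% of traffic to the new version

    String chooseBackend() {
        return ThreadLocalRandom.current().nextDouble() < canaryShare
                ? "search-service-v2"   // candidate release (the "canary")
                : "search-service-v1";  // current production release
    }

    void setCanaryShare(double share) {
        // If no problems show up, gradually shift more traffic to the new version;
        // at 1.0 the new version has effectively replaced the old one.
        canaryShare = Math.min(1.0, Math.max(0.0, share));
    }

    public static void main(String[] args) {
        CanaryRouter router = new CanaryRouter();
        for (int i = 0; i < 10; i++) {
            System.out.println("request " + i + " -> " + router.chooseBackend());
        }
        router.setCanaryShare(0.5); // later: send half of the traffic to the candidate
    }
}
```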

Feature toggles

For new functionality that may need to be easily switched off (if a problem is found), developers can add feature toggles. These are if-then software switches in the code that activate the new code only when a data value is set. That data value can live in a globally accessible location, which the deployed application checks to determine whether it should execute the new code. If the data value is set, the new code is executed; if not, it isn't.

This provides developers with a remote "kill switch" to turn off new features when problems are discovered after deployment to the production environment.
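Here is a minimal sketch of a feature toggle. The flag name, the search example, and the in-memory map standing in for a shared flag store (in practice a database or configuration service) are all assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FeatureToggles {
    // Stand-in for a globally accessible flag store the deployed application can check.
    private static final Map<String, Boolean> FLAGS =
            new ConcurrentHashMap<>(Map.of("new-search-ranking", true));

    static boolean isEnabled(String feature) {
        return FLAGS.getOrDefault(feature, false);
    }

    static String search(String query) {
        if (isEnabled("new-search-ranking")) {
            return "results for '" + query + "' using the NEW ranking code";
        }
        return "results for '" + query + "' using the existing ranking code";
    }

    public static void main(String[] args) {
        System.out.println(search("pipelines"));
        FLAGS.put("new-search-ranking", false); // the "kill switch": turn the new feature off remotely
        System.out.println(search("pipelines"));
    }
}
```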

Dark launch

In a dark launch, the code is gradually tested/deployed to the production environment, but users will not see the changes (hence the word dark in the name). For example, in the production version, some parts of the web page query may be redirected to a service that queries a new data source. Developers can collect this information for analysis without exposing any information about interfaces, transactions or results to users.

The idea is to get real information about how a candidate version performs under production load without affecting users or changing their experience. Over time, more of the load can be routed to it, until either a problem is found or the new functionality is deemed ready for everyone. Feature flags can in fact be used as the mechanism for this kind of dark launch.
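The sketch below illustrates shadowing traffic for a dark launch: each live query also goes, in the background, to the candidate service so its behavior under production load can be recorded, while the user only ever sees the result from the current service. The service methods are hypothetical stand-ins.

```java
import java.util.concurrent.CompletableFuture;

public class DarkLaunch {

    static String currentService(String query)   { return "current result for " + query; }
    static String candidateService(String query) { return "candidate result for " + query; }

    static String handleQuery(String query) {
        // Shadow the traffic: exercise the new code path asynchronously and only record the outcome.
        CompletableFuture.runAsync(() -> {
            long start = System.nanoTime();
            String shadowResult = candidateService(query);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("[dark] " + shadowResult + " (" + micros + " us)"); // collected for analysis only
        });
        // The user-visible answer still comes from the current production version.
        return currentService(query);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleQuery("continuous delivery"));
        Thread.sleep(200); // give the shadow call time to finish in this small sketch
    }
}
```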

What is "Operation and Maintenance Development"?

DevOps is a set of ideas and recommended practices for making it easier for development and operations teams to work together on developing and releasing software. Historically, development teams created products but did not install/deploy them in a regular, repeatable way, as customers would. That set of install/deploy tasks (along with other support tasks) was left to the operations teams late in the cycle. This often resulted in a lot of confusion and problems, since the operations team got involved only at a late stage and had to complete its work in a short time frame. Likewise, development teams were often at a disadvantage: because they had not sufficiently tested the product's install/deploy functionality, they could be surprised by problems that emerged during that process.

This often led to a serious disconnect and lack of cooperation between development and operations teams. The DevOps ideal advocates that development and operations work together, comprehensively and collaboratively, throughout the entire development cycle, just as continuous delivery does.

How does continuous delivery intersect with DevOps?

The continuous delivery pipeline is an implementation of several DevOps ideas. The later stages of a product's development, such as packaging and deployment, can be done in every run of the pipeline, rather than waiting until a specific point in the product development cycle. Likewise, from development through deployment, both development and operations can clearly see when things work and when they don't. For a continuous delivery pipeline cycle to be successful, it must pass not only through the processes associated with development but also through the processes associated with operations.

Going further, DevOps recommends that the infrastructure that implements the pipeline is also considered code. In other words, it should be automatically configured, trackable, easy to modify, and trigger a new run when the pipeline changes. This can be done by implementing the pipeline as code.

What is "pipe as code"?

Pipeline-as-code is the general term for creating pipeline jobs/tasks by writing code, just as developers write application code. The goal is to have the pipeline implementation expressed as code so it can be stored with the code, reviewed, tracked over time, and easily re-created if something goes wrong and the pipeline must be rebuilt. Several tools allow this, such as Jenkins 2.

How does DevOps affect the production software infrastructure?

Traditionally, each of the hardware systems used in pipelines had accompanying software (operating system, applications, development tools, etc.). In the extreme case, each system was set up and customized by hand. This meant that when a system had problems or needed updating, that was often a custom task as well. This approach goes against the basic continuous delivery ideal of having environments that are easily reproducible and trackable.

Over the years, many applications have been developed to standardize delivering (installing and configuring) systems. Likewise, virtual machines (VMs) were developed: programs that emulate computers running on top of other computers. These VMs require a hypervisor to run on the underlying host system, and they require their own copy of an operating system to run.

Then came containers. Although containers are conceptually similar to VMs, they work differently. They simply use certain existing operating system constructs to carve out isolated spaces, and do not require a hypervisor or their own copy of an operating system. As a result, they behave like VMs in providing isolation, but without the heavy overhead.

VMs and containers are created according to configuration definitions, so they can be easily destroyed and rebuilt without affecting the host system on which they are running. This allows the system running the pipeline to be rebuilt. In addition, for containers, we can track changes to their build definition files - just like the source code.

Therefore, if we encounter problems in VMs or containers, we can destroy and rebuild them more easily and quickly, instead of trying to debug and repair them in the current environment.

This also means that any changes to the pipeline code can trigger a new run of the pipeline (via CI), just like changes to the code. This is one of DevOps' core concepts about infrastructure.

Origin: blog.csdn.net/ichen820/article/details/115211978