With 12,000 stars gained in a year, can Dapr lead the future of cloud-native middleware?

1.png

Author | Xiaojian Ao, Senior Technical Expert of Alibaba Cloud, Dapr Maintainer

Dapr is a distributed application runtime open sourced by Microsoft in October 2019, and the official v1.0 release arrived only this February. Although it has been public for just a year and a half, Dapr has developed very rapidly and has already earned 12,000 stars on GitHub. Alibaba is a deep participant in and early adopter of the Dapr open source project, and took the lead in bringing it to production: more than a dozen applications inside the group are using Dapr. Alibaba currently has two Dapr members and is the company with the most code contributions to the Dapr project other than Microsoft.

Although Dapr draws a lot of attention abroad, it is still little known in China, and the small amount of Dapr material available leans toward news and brief introductions, lacking in-depth interpretation. With Dapr v1.0 released, I hope this article helps you form an accurate understanding of Dapr: grasp the development context of the project, understand its core values and vision, and comprehend the "Tao" behind the Dapr project: cloud native.

Review: Service Mesh principle and direction

1. Definition of Service Mesh

First, let us quickly review the definition of "Service Mesh", which is the beginning of the Dapr story.

The following is an excerpt from my speech "Service Mesh: Next Generation Microservices" at QCon Shanghai in October 2017:

Service Mesh is an infrastructure layer that handles communication between services. Modern cloud-native applications have complex service topologies, and the service mesh is responsible for the reliable delivery of requests across these topologies. 
In practice, the service mesh is usually implemented as a set of lightweight network proxies, which are deployed alongside the application and are transparent to it.

2.png

This definition briefly describes the key features of Service Mesh:

  • Positioning: the infrastructure layer;

  • Function: communication between services;

  • Deployment: the Sidecar mode;

  • Special emphasis: non-intrusive and transparent to applications.

Those who are familiar with Service Mesh must be familiar with the following picture:

3.png

2. Sidecar mode

Compared with the traditional RPC framework, the innovation of Service Mesh lies in the introduction of the Sidecar mode:

4.png

After the Sidecar is introduced, communication between services is taken over by the Sidecars, which are uniformly managed by the control plane. Inter-service communication capability thus sinks into the infrastructure, greatly simplifying the application.

Let's quickly review the basic ideas of Service Mesh:

5.png

  • Before the Sidecar is introduced: business logic and non-business logic are mixed in one process; the application contains both business logic and various non-business functions (reflected in the various client SDKs).

  • After the Sidecar is introduced: the client SDK's functionality is stripped out, the business process focuses on business logic, and most of the SDK's features are split into an independent process running in Sidecar mode.

By introducing the Sidecar mode, Service Mesh successfully achieves its two major goals: separation of concerns and independent maintenance.

3. Development Trend of Service Mesh

Taking the Istio project as an example, I have summarized the development trends of Service Mesh over the last year or two (note that this is not the focus of this article; skim it for a general impression):

6.png

1) Protocol support

Istio's communication protocol support is mainly HTTP and gRPC, and various vendors are adding support for more protocols, including Dubbo, Thrift, and Redis. There are also community efforts, such as Zhao Huabing's Aeraki project.

2) Virtual machine support

Virtual machine support has recently become an important focus of Istio:

  • Istio 0.2: Mesh Expansion
  • Istio 1.1: ServiceEntry
  • Istio 1.6: WorkloadEntry
  • Istio 1.8: WorkloadGroup and smart DNS proxy
  • Istio 1.9: Virtual machine integration

3) Ease of use

  • Istio 1.5: Single control plane, combining multiple components into istiod (this is one of the biggest architectural adjustments since Istio was open sourced).

  • Istio 1.7: Mainly promotes the Operator installation method, enhances the istioctl tool, and supports starting the application container after the Sidecar is started.

  • Istio 1.8: Improved upgrade and installation; introduced istioctl bug-report.

4) Observability

  • Istio 1.8: Officially removed Mixer, re-implementing Mixer's functionality in Envoy based on wasm (one of Istio's biggest architectural adjustments).

  • Istio 1.9: Remote fetching and loading of wasm modules.

5) External integration

Mutual access with non-service mesh systems to achieve smooth migration of applications between the two systems.

  • Istio originally planned to provide a unified solution through the MCP protocol.

  • Istio 1.7: The MCP protocol was abandoned in favor of MCP over XDS.

  • Istio 1.9: Kubernetes Service API support (alpha), exposing services to the outside world.

From the list above, we can see that Istio has been working very hard to improve itself over the last year or two. The process has been somewhat tortuous and full of back-and-forth (stubbornly insisting on Mixer before finally heeding the community's call to abandon it completely; starting virtual machine support, largely giving it up, then re-emphasizing it recently; introducing Galley and then abandoning Galley; introducing MCP and then replacing it with MCP over XDS), but overall Istio is still moving in the general direction of being Product Ready.

Note: Of course, the community is still very dissatisfied with the evolution speed of Istio and the actual status of Product Ready, so that this slogan appeared: Make Istio Product Ready (Again, and Again...).

4. Service Mesh review summary

We have quickly reviewed the definition of Service Mesh and the principle of the Sidecar mode, and roughly listed the development trends of Service Mesh over the past year or two, mainly to convey one message:

Although Service Mesh has flourished, its core elements have not changed.

From the definition of Service Mesh given in 2016 by William Morgan, CEO of Linkerd, to Istio's release of version 1.9 in 2021, Service Mesh has changed a great deal over these years, but the following three core elements have remained unchanged:

7.png

  • Positioning: Service Mesh is always positioned as the infrastructure layer providing inter-service communication, covering HTTP and RPC: HTTP/1.1 (REST), HTTP/2 (gRPC), and the TCP protocol, along with some small attempts such as Redis and Kafka support.

  • Deployment: Service Mesh supports Kubernetes and virtual machines, but it is always deployed in Sidecar mode, rather than alternatives such as per-Node or centralized deployment.

  • Principle: Service Mesh works by forwarding the original protocol; in principle the protocol content is not changed (usually only the header is slightly modified). To achieve zero intrusion, traffic-hijacking technologies such as iptables have also been introduced.

Evolution: Cloud-native distributed application runtime

Having completed the quick review of Service Mesh, we begin the second part of this article: when the Sidecar model is pushed further and the above three core elements change, how will the Sidecar model evolve?

1. Practice: More Mesh forms

I used to work on Service Mesh in the middleware team of Ant Financial; many friends may have come to know me from that time. Back then, Ant not only built a Service Mesh but also extended the Sidecar mode of Service Mesh to other middleware fields, successively exploring more mesh forms:

8.png

This picture is an excerpt from my keynote speech "Poetry and Distance: Deep Practice of Ant Financial Service Mesh" at Shanghai QCon in October 2019. At that time, we shared a variety of mesh forms including message Mesh, database Mesh, etc.

2. Theory sublimation: the concept of Multi-Runtime is put forward

In recent years, more and more projects have introduced the Sidecar mode, and the pattern has gradually been recognized and accepted. Then, in 2020, Bilgin Ibryam proposed the concept of Multi-Runtime, providing a practical summary and theoretical elevation of the various product forms based on the Sidecar model.

First, an introduction to Bilgin Ibryam: he is the author of the book "Kubernetes Patterns" and a committer of the Apache Camel project, and currently works at Red Hat.

In early 2020, Bilgin Ibryam published the article "Multi-Runtime Microservices Architecture", formally proposing the multi-runtime microservice architecture (alias Mecha, a very cool name). In this article, he first summarized the four types of requirements of distributed applications as the theoretical starting point of Multi-Runtime:

9.png

Among the four types of requirements, lifecycle management is mainly met by PaaS platforms such as Kubernetes, while Service Mesh mainly covers point-to-point communication over the network; other communication patterns, such as the pub-sub messaging model, are not covered. In addition, most of the state and binding requirements have little to do with Service Mesh.

The theoretical derivation of Multi-Runtime goes roughly as follows: starting from the above four categories of requirements, if you imitate Service Mesh and set out from the traditional middleware model, there are generally two steps:

10.png

  • Step 1: Move the distributed capabilities required by the application into various runtimes. At this point there will be a large number of sidecars and proxies, such as the Istio, Knative, Cloudstate, Camel, and Dapr listed above.

  • Step 2: These runtimes are gradually consolidated, leaving only a few, or even just one or two. A runtime that provides multiple distributed capabilities in this way is called a Mecha.

After step 2 is completed, each microservice consists of at least one Mecha runtime plus the application runtime, i.e. each microservice has multiple (at least two) runtimes, which is the origin of the name Multi-Runtime / Mecha.

3. Multi-Runtime and cloud-native distributed applications

Here is how the Multi-Runtime / Mecha concept applies to cloud-native distributed applications:

11.png

  • Capabilities: Mecha is a general-purpose, highly configurable, reusable component that provides distributed primitives as ready-made capabilities.

  • Deployment: Mecha can be deployed alongside a single Micrologic component (Sidecar mode) or shared by several (e.g. Node mode).

  • Protocol: Mecha makes no assumptions about the Micrologic runtime. It works with polyglot microservices, and even monoliths, using open protocols and formats (such as HTTP/gRPC, JSON, Protobuf, CloudEvents).

  • Configuration: Mecha is configured declaratively in a simple text format (such as YAML or JSON), indicating which features to enable and how to bind them to the Micrologic endpoints.

  • Integration: Rather than relying on multiple agents for different purposes (such as network proxies, cache proxies, binding proxies), a single Mecha provides all of these capabilities.

4. Features and Differences of Multi-Runtime

Although it likewise uses the Sidecar mode, Multi-Runtime has its own characteristics compared with Service Mesh:

  • Scope and manner of providing capabilities: Multi-Runtime provides distributed capabilities, embodied as the various distributed primitives an application needs, and is not limited to a simple network proxy for point-to-point inter-service communication.

  • Runtime deployment: The deployment model of Multi-Runtime is not limited to the Sidecar mode; Node mode may be a better choice in some scenarios (such as Edge/IoT or Serverless FaaS).

  • Interaction with the application: The interaction between Multi-Runtime and the application is open and follows API standards. The "protocol" between the Runtime and the Micrologic is embodied in the API, not in a native TCP communication protocol. In addition, Multi-Runtime does not insist on non-intrusiveness; SDKs in various languages are provided to simplify development.

The difference between Multi-Runtime and Service Mesh is summarized as shown in the figure below:

12.png

5. The essence of Multi-Runtime

So far I have introduced the origin of the Multi-Runtime architecture, and I believe readers now understand its characteristics and how it differs from Service Mesh. To deepen that understanding, let me share my personal take on Multi-Runtime:

The essence of Multi-Runtime is an abstraction layer of distributed capabilities for cloud-native applications.

13.png

What is the "distributed capability abstraction layer"?

As shown in the figure above, on the left are the four types of requirements of distributed applications: lifecycle, network, state, and binding. On the requirement side, Multi-Runtime should provide distributed applications with the specific distributed capabilities listed under these four categories. Providing these capabilities to applications in Sidecar mode is easy to understand; the key lies in how Multi-Runtime provides them. Unlike Service Mesh, which forwards the original protocol, the Multi-Runtime approach is:

  • Abstract capabilities as APIs: Many distributed capabilities have no industry-wide protocol analogous to HTTP. Therefore, Multi-Runtime abstracts these capabilities into APIs that are independent of any communication protocol; the API only describes the application's requirements and intent for a distributed capability, avoiding binding to any particular implementation.

  • Provide multiple implementations for each capability: Capabilities in Multi-Runtime generally come with multiple implementations, including open source products and public cloud commercial products.

  • At development time: Here we introduce the concept of "capability-oriented programming", analogous to "program to an interface, not an implementation" in programming languages. Multi-Runtime advocates that application developers program against the abstracted distributed capability primitives, rather than against the concrete implementations underneath.

  • At runtime: Configuration selects the concrete implementation, without affecting the definition of the abstraction-layer API or applications developed under the "capability-oriented programming" principle.
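To make "capability-oriented programming" concrete, here is a minimal Python sketch of the idea (the names `StateStore`, `InMemoryStateStore`, and `build_store` are illustrative, not part of any real SDK): the application codes only against an abstract capability, and configuration selects the implementation at runtime.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StateStore(ABC):
    """Abstract distributed capability: key/value state management.
    Application code depends only on this interface."""

    @abstractmethod
    def save(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...


class InMemoryStateStore(StateStore):
    """One concrete implementation, e.g. for local development; a Redis-
    or cloud-backed implementation would plug in the same way."""

    def __init__(self) -> None:
        self._data = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


# Runtime configuration (not application code) picks the implementation;
# swapping "memory" for another registered component touches no app logic.
REGISTRY = {"memory": InMemoryStateStore}


def build_store(config: dict) -> StateStore:
    return REGISTRY[config["component"]]()


store = build_store({"component": "memory"})
store.save("order-1", "paid")
print(store.get("order-1"))  # -> paid
```

Swapping the implementation means changing only the `config` dict, which mirrors how a Mecha runtime rebinds a capability without redeploying the application.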

Remarks: A universal standard API for distributed capabilities will be key to the success or failure of Multi-Runtime, and Dapr's API has encountered great challenges in design and practice. I will elaborate on this topic in a separate article later.

Introduction: Dapr for Distributed Application Runtime

After a quick review of Service Mesh and a detailed introduction to the multi-runtime architecture, we have laid a good foundation for understanding Dapr. Now we can finally begin the main content of this article: let us get to know the Dapr project.

1. What is Dapr?

14.png

Dapr is an open source project initiated by Microsoft. The following is an authoritative introduction from Dapr's official website:

Dapr is a portable, event-driven runtime that makes it easy for any developer to build resilient, stateless and stateful applications that run on the cloud and edge and embraces the diversity of languages and developer frameworks.

Referring to and comparing the definition of Service Mesh, our analysis of the above definition of Dapr is as follows:

15.png

  • Positioning: Dapr defines itself as a runtime, not a proxy as in Service Mesh.

  • Function: Dapr provides applications with various distributed capabilities to simplify application development. The key points in the definition above are resilience, support for both stateful and stateless applications, and being event-driven.

  • Multi-language: Support for multiple languages is a natural advantage of the Sidecar model, and Dapr is no exception. Considering how many distributed capabilities Dapr provides to applications, this may be even more valuable than Service Mesh, which only provides inter-service communication. And thanks to Dapr's language SDKs, Dapr can easily integrate with the mainstream development frameworks of various programming languages, such as Java and the Spring framework.

  • Portability: Dapr's applicable scenarios include various clouds (public, private, and hybrid) and the edge. Key Multi-Runtime features such as capability-oriented programming, standard APIs, and runtime-configured implementations give Dapr excellent cross-cloud and cross-platform portability.

We will expand on these features of Dapr in the introduction that follows. Before we start, here is a little tidbit: the origin of the name of the "Dapr" project:

16.png

2. The function and architecture of Dapr Sidecar

Similar to Service Mesh, Dapr is also based on the Sidecar model, but the functions and usage scenarios it provides are richer than those of Service Mesh, as shown in the following figure:

17.png

In addition to supporting inter-service communication like Service Mesh (currently the HTTP/1.1 REST and gRPC protocols), Dapr's Sidecar supports more functions, such as state (state management), pub-sub (message communication), and resource binding (including input and output).

Each function has multiple implementations. In the figure above, I briefly list the common implementations of these capabilities; you can see both open source products and public cloud commercial products among them. Note that this is only a small subset of what Dapr currently supports: there are more than 70 implementations (called components in Dapr, introduced below), and the number is still growing.
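As a small illustration of how an application talks to the Dapr Sidecar, the sketch below builds a request for Dapr's documented state API (`POST http://localhost:3500/v1.0/state/<store>`, where 3500 is the sidecar's default HTTP port); actually sending the request of course requires a running sidecar, so this helper only constructs it.

```python
import json

DAPR_HTTP_PORT = 3500  # default port of the Dapr sidecar's HTTP API


def save_state_request(store: str, key: str, value) -> tuple:
    """Build the URL and JSON body for Dapr's state save API.
    The body is a list of key/value entries, per the Dapr HTTP API."""
    url = f"http://localhost:{DAPR_HTTP_PORT}/v1.0/state/{store}"
    body = json.dumps([{"key": key, "value": value}])
    return url, body


url, body = save_state_request("statestore", "order-1", {"status": "paid"})
print(url)   # http://localhost:3500/v1.0/state/statestore
print(body)  # [{"key": "order-1", "value": {"status": "paid"}}]
```

Note that the URL names a component ("statestore"), not a product: whether that component is backed by Redis, a cloud database, or something else is decided by configuration, not by this code.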

Dapr's architecture has three main concepts: API, Building Blocks, and Components, as shown in the following figure:

18.png

  • Dapr API: Dapr provides two APIs, HTTP/1.1 (REST) and HTTP/2 (gRPC), which are functionally equivalent.

  • Dapr Building Blocks: The basic units through which Dapr exposes capabilities; each building block provides one distributed capability to the outside.

  • Dapr Components: The capability implementation layer of Dapr; each component implements the capability of a specific building block.

In order to help you understand the architecture of Dapr, let's review the essence of Multi-Runtime highlighted earlier:

19.png

The essence of Multi-Runtime is an abstraction layer of distributed capabilities for cloud-native applications.

Combining the Multi-Runtime concept, let's understand the architecture of Dapr Runtime:

20.png

  • Dapr Building Blocks provide the "capabilities".

  • The Dapr API provides the "abstraction" of distributed capabilities, exposing the capabilities of the Building Blocks to the outside.

  • Dapr Components are the concrete "implementations" of the Building Blocks' capabilities.

3. Dapr's vision and current capabilities

The following picture is from Dapr official, which summarizes Dapr's capabilities and hierarchical structure more comprehensively:

21.png

  • The blue box in the center is the Dapr Runtime, containing the building blocks Dapr currently provides.

  • The Dapr Runtime exposes its capabilities through remote calls; currently there are an HTTP API and a gRPC API.

  • Thanks to the natural advantages of the Sidecar model, Dapr supports various programming languages, and Dapr officially provides SDKs for mainstream languages (typically Java, Go, C++, Node.js, .NET, Python). These SDKs encapsulate the interaction with the Dapr Runtime over the HTTP or gRPC API.

  • At the bottom are the cloud platforms and edge environments that can host Dapr. Since each capability can be fulfilled by different components, in theory, as long as Dapr's component support is complete enough, Dapr can run on any platform, with usable components always available, whether open source products or cloud vendors' commercial products.

Combining the above points, Dapr proposed such a vision:

Any language, any framework, anywhere

That is: it can be developed in any programming language, integrated with any framework, and deployed on any platform. The following figure briefly describes Dapr's existing building blocks and the capabilities they provide:

22.png
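To see how the same pattern covers different building blocks, here is a small Python sketch (the helper names are my own) constructing the documented HTTP paths for two of them, service invocation and pub/sub; an application's language SDK normally hides these details behind method calls.

```python
# Each building block is exposed under a uniform URL prefix on the
# local sidecar; these helpers just construct the documented paths.
BASE = "http://localhost:3500/v1.0"


def invoke_url(app_id: str, method: str) -> str:
    """Service invocation building block: call `method` on the
    application registered as `app_id`."""
    return f"{BASE}/invoke/{app_id}/method/{method}"


def publish_url(pubsub: str, topic: str) -> str:
    """Pub/sub building block: publish an event to `topic` through
    the pub/sub component named `pubsub`."""
    return f"{BASE}/publish/{pubsub}/{topic}"


print(invoke_url("orders", "create"))
# http://localhost:3500/v1.0/invoke/orders/method/create
print(publish_url("pubsub", "order-created"))
# http://localhost:3500/v1.0/publish/pubsub/order-created
```

Either URL works the same whatever component sits behind it, which is exactly the "abstraction layer over distributed capabilities" described above.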

4. Dapr's control plane

Similar to the architecture of Service Mesh, Dapr also has the concept of a control plane:

23.png

Dapr's control plane components are:

  • Dapr Actor Placement
  • Dapr Sidecar Injector
  • Dapr Sentry
  • Dapr Operator

Interestingly, in order to simplify operations, Istio merged the components of its microservice-style control plane, returning to a traditional monolithic model, whereas Dapr's control plane is still a microservice architecture; whether Dapr will follow Istio's example in the future remains to be seen.

Remarks: To keep this article to a reasonable length, I do not expand on Dapr's building blocks and control plane in detail here; a separate in-depth article is planned, so readers interested in Dapr can stay tuned.

5. Dapr's development history and Alibaba's participation

Dapr is a very young open source project; it has been developed for only about a year and a half, but community attention is quite high (mainly abroad), and it currently has close to 12,000 stars on GitHub (for comparison: Envoy 16,000, Istio 26,000, Linkerd 7,000). The main milestones of the Dapr project are:

  • October 2019: Microsoft open sourced Dapr on GitHub and released version 0.1.0.

  • February 2021: Dapr v1.0 was released.

Alibaba is deeply involved in the Dapr project: it not only became an early adopter of Dapr as an end user, but also, by fully participating in Dapr's open source development, became one of the main contributing companies to the project, second only to Microsoft:

24.png

  • Mid-2020: Alibaba began participating in the Dapr project, trialing it internally and contributing code.

  • End of 2020: Alibaba piloted Dapr internally on a small scale; more than a dozen applications are now using it.

Note: For Dapr's practice in Alibaba, please refer to our article "How Alibaba is using Dapr" just published on Dapr's official blog.

At present, we already have two Dapr Committers and one Dapr Maintainer. In 2021 we expect to invest more in the Dapr project, with more open source code contributions and production practice, to personally promote the development of the project. We welcome more domestic contributors and companies to join the Dapr community.

6. Dapr quick experience

Dapr's official documentation provides installation and quickstart content that can help you quickly install Dapr and experience its capabilities and usage.

To make trying Dapr even quicker and more convenient, we provide a very simple introductory Dapr tutorial through Alibaba Cloud Knowledge Lab; in about ten minutes you can experience the Dapr development and deployment process: https://start.aliyun.com/course?id=gImrX5Aj

Interested students can actually experience it.

Outlook: The future shape of applications and middleware

In the last part of this article, let us look at the future shape of applications and middleware.

1. Cloud-native era background

The first thing to state is that everything explained here rests on one big premise: cloud native.

The picture below is an excerpt from my speech "Poetry and Distance: Deep Practice of Ant Financial Service Mesh" at the QCon Conference in October 2019:

25.png

At that time (2019) we had just completed the exploration and large-scale implementation of Kubernetes and Service Mesh, and had started a new exploration of Serverless. In that talk, I summarized our cloud-native implementation and gave suggestions on whether to adopt Service Mesh, roughly as follows (quoting the original):

  • One thing is very clear to us: Meshing is a key step for cloud native landing.

  • If cloud native is your poetry and the distance, then Service Mesh is the only way.

  • Kubernetes / Service Mesh / Serverless are the troika of cloud-native practice; they complement and reinforce one another.

Today, two years later, looking back at that judgment on the general direction of cloud-native development, I have many feelings. I have slightly adjusted the picture above, adding Multi-Runtime, containers, multi-cloud, and hybrid cloud, resulting in the picture below:

26.png

Compared with 2019, the concept of cloud native has been widely recognized and adopted: Multi-cloud and hybrid cloud have become the mainstream direction of future cloud platforms; Service Mesh has more landing practices, and more companies use Service Mesh; Serverless has also developed rapidly in the past two years.

The historical trend of cloud native is still in progress, but where will applications and middleware go in the context of cloud native?

2. Application expectations are the direction of middleware

Let us imagine a business application in its most ideal state in the cloud-native era; it is a sweet dream:

  • Applications can be written in any favorite and suitable language, and can be developed quickly and iterated quickly.

  • All the capabilities required by the application can be provided through standard APIs, without the need to care about the underlying implementation.

  • The application can be deployed to any cloud, whether it is a public cloud, a private cloud or a hybrid cloud, without platform and vendor restrictions, and without code modification.

  • The application can scale elastically with traffic, withstanding peak pressure and releasing resources when idle.

  • ……

My personal view on the future form of cloud-native applications: Serverless will be the ideal form and mainstream direction of cloud-native applications, and multi-language support, cross-cloud portability, and application lightweighting will be their three core demands.

27.png

The application's expectation of cloud native is the way forward for middleware!

In the past few years, middleware has been groping forward driven by the beautiful goal of cloud native, and it will continue to do so in the years ahead. Service Mesh pioneered the Sidecar model, and Dapr has extended it to a larger domain:

  • Complete multi-language support and the demand for lightweight applications push middleware to separate more capabilities from applications.

  • The Sidecar model will extend to more areas, and more and more middleware products will begin to mesh and merge into the Runtime.

  • The natural aversion to vendor lock-in will intensify the pursuit of portability, which will further push the distributed capabilities sinking into the Runtime to offer standard, industry-wide APIs.

  • API standardization and community recognition will be the biggest challenge to the Runtime's popularization, but they will also push middleware products to improve their own implementations and to converge with the community's standard APIs.

Driven by cloud-native demand, multi-language support, cross-cloud portability, and lightweight applications are expected to become the breakthrough point and key development direction of middleware products in the next few years, as shown in the following figure:

28.png

In the current cloud-native field, the Dapr project is a very eye-catching new force. Dapr is a pathfinder, starting a new exploration of the Multi-Runtime concept, and this must be a difficult process. I look forward to having more individuals and companies join us in the Dapr community, explore together and grow together!

Note: On the topic of Dapr API standardization and the challenges Dapr has encountered in defining and implementing APIs, there was a lively discussion at the event. I will write a separate article later, going deeper through the in-depth practice of the state API and the design process of the new configuration API; stay tuned.

Conclusion

At the end of this article, let us summarize the full text with this paragraph:

Dapr further expands the use scenarios of the Sidecar mode beyond Service Mesh. On one hand, it offers a naturally multi-language solution, meets cloud-native applications' needs for distributed capabilities, and helps applications become lightweight and serverless. On the other hand, it provides applications with a distributed capability abstraction layer and standard APIs, giving excellent portability across multi-cloud and hybrid cloud deployments and avoiding vendor lock-in.

Dapr will lead the future of applications and middleware in the cloud-native era.

Appendix: Reference Materials

Related references for this article are as follows:

  • Dapr official website and documentation: parts of the Dapr introduction and some pictures are excerpted from the official Dapr website.

  • Multi-Runtime Microservices Architecture: the Multi-Runtime content and pictures are quoted from this article by Bilgin Ibryam.

About the Author

Xiaojian Ao, veteran coder, microservice expert, Service Mesh evangelist, Dapr maintainer. Focused on infrastructure; a Cloud Native supporter, agile practitioner, and architect who stays on the front line of development, polishing his craft. Currently working at Alibaba Cloud, responsible for Dapr development on the cloud-native application platform.

Origin blog.51cto.com/13778063/2677704