A truly cloud-native big data platform gives Kubernetes a new role

As an open source container orchestration engine, Kubernetes has been popular with developers since its launch in 2014, though few expected it to succeed on this scale. Today, amid the wave of cloud-native technology, Kubernetes is the de facto standard for container orchestration and a cornerstone project of the cloud-native ecosystem.

 1 

Kubernetes arrives: reshaping the container orchestration market

When it comes to Kubernetes, you can't get around Docker.

In 2010, a company called dotCloud developed a set of internal tools based on the Go language launched by Google; these tools were later named Docker. As an open platform for developing, shipping, and running applications, Docker became popular so quickly that even giants such as Google, Microsoft, and Amazon embraced it.

However, as a business grows and the number of containers multiplies, a series of new problems arises. How should these containers be coordinated and scheduled? How can an application be upgraded without interrupting service? How can application health be monitored? How can the programs running in containers be restarted in batches?

Solving these problems requires container orchestration technology, which abstracts many machines into a single pool, deploys, manages, and monitors multiple containers, and serves as a real PaaS platform on which users can deploy their own containerized applications.
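To make the idea of "abstracting many machines" concrete, here is a minimal sketch of the simplest possible placement policy: put each container on the first machine that still has room. The `Node` class, the `schedule` function, and the CPU figures are all hypothetical names chosen for illustration; real orchestrators such as Kubernetes use far richer filtering and scoring logic than this first-fit toy.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A machine in the cluster with a fixed CPU capacity (in cores)."""
    name: str
    cpu_capacity: float
    containers: list = field(default_factory=list)
    cpu_used: float = 0.0

    def fits(self, cpu_request: float) -> bool:
        """Return True if the node still has room for the request."""
        return self.cpu_used + cpu_request <= self.cpu_capacity


def schedule(nodes, requests):
    """Place each (container_name, cpu_request) on the first node with room.

    Returns a dict mapping container name to node name, or to None when
    no node has enough free CPU (the container stays pending).
    """
    placement = {}
    for container, cpu in requests:
        target = next((n for n in nodes if n.fits(cpu)), None)
        if target is not None:
            target.containers.append(container)
            target.cpu_used += cpu
        placement[container] = target.name if target else None
    return placement


if __name__ == "__main__":
    # Hypothetical two-node cluster and four container requests.
    cluster = [Node("node-a", 4.0), Node("node-b", 2.0)]
    jobs = [("web", 3.0), ("cache", 2.0), ("batch", 1.0), ("etl", 2.0)]
    print(schedule(cluster, jobs))
```

The user only states what each container needs; which machine it lands on is the orchestrator's decision, and a request that fits nowhere simply stays pending until capacity frees up.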

It turns out that the container itself holds little "value"; the value lies in container orchestration.

Therefore, around 2014, Docker, Mesos, and Google each released their own PaaS platforms in quick succession, and the battle for container orchestration officially began.

Given its great success in the container engine market, it was only natural for Docker to move into container orchestration. Docker released Swarm in early 2015. Swarm's strength is seamless integration with the Docker ecosystem, which lets users transition at low cost.

In 2014, Mesos became one of the first container orchestration frameworks to support Docker containers. Its biggest advantage was its maturity in running critical workloads: it was more mature and reliable than other container technologies, so it was adopted by companies such as Twitter, Apple, and Netflix.

In June 2014, Google released Kubernetes as an open source successor to Borg, the internal system that Google had kept strictly secret for more than ten years. In other words, Kubernetes started from a height that others found hard to reach. Almost every core feature derives from the design of, and experience with, the Borg/Omega systems that had run inside Google for years, and was then implemented in the open source community. Thanks to contributions from the whole community, many defects and problems left over from Borg were also improved and fixed.

More importantly, Kubernetes does not directly extend Borg. It was designed from scratch on the basis of that experience, adopting the most advanced design concepts without any historical baggage.

This is where Kubernetes' distinctive sophistication and completeness come from. Compared with the "immature" Docker technology stack and the "aging" Mesos community, Kubernetes debuted late but enjoyed a clear latecomer advantage.

In May 2015, Google search interest in Kubernetes far surpassed that of Mesos and Docker Swarm and kept soaring from there. The era of a three-way contest among container orchestration engines was over.

In September 2017, Mesos announced support for Kubernetes.

In October 2017, Docker officially supported Kubernetes.

In March 2018, Kubernetes officially graduated from the CNCF, cementing its position as the leader in container orchestration.

More recently, the cloud-native big data field saw two more landmark events: in March 2021, Apache Spark supported Kubernetes, and in May of the same year Kafka also publicly supported Kubernetes, marking the point at which the core big data components support K8s.

 2 

Kubernetes market status: more than half (53%) of enterprises are migrating big data applications to it

Many people may think of Kubernetes as complex software that is difficult to monitor and manage, but over the past few years it has developed at an incredible pace. As more and more enterprises put it into production, it has gone from a research topic to a mainstream technology in the IT industry, and its benefits far outweigh its drawbacks.

One of the clearest signs that Kubernetes is becoming mainstream is the rapid growth in the number of clusters deployed, according to new research from Dimensional Research. When the question was asked in 2020, 30% of companies had five or fewer clusters and only 15% had more than 50. In the 2022 survey, only 12% of companies had five or fewer clusters, while 29% had more than 50, and respondents' plans suggest even faster growth in the coming year.

In the Pepperdata report, more than half (53%) of respondents said they were "migrating big data applications to Kubernetes to reduce overall spending." Most of their big data applications are moving to Kubernetes, and about 10% of respondents indicated that they would migrate all of their applications there.

In addition, according to a spring 2022 survey by research firm Clearpath Strategies, 83% of respondents attribute more than 10% of their revenue to running data on Kubernetes, and one-third of companies have seen their productivity double.

Enterprises are embracing containers because they make better use of resources.

According to the Dimensional Research report, 99% of respondents said they see advantages in deploying Kubernetes. The top two advantages are unchanged: improved resource utilization (59%) and simpler application upgrades and maintenance (49%). Third is enabling migration to the cloud (42%), and fourth is enabling a hybrid cloud model (40%). The share of respondents choosing reduced public cloud costs (34%) also rose by 6 points from last year.

About one-third of respondents selected two options added this year: enabling operations team members to work and apply their skills more efficiently (32%) and eliminating the inefficiencies of previously siloed teams (28%). Kubernetes reduces the friction that can slow down operations, helps maximize IT resource utilization, and enables teams to work together more closely and efficiently.

 3 

Kubernetes has a lot to offer: Transforming traditional big data platforms

The history of Kubernetes and its current market status leave no doubt that K8s is widely used in enterprises. At present, however, most domestic enterprises use K8s for cloud computing-related scheduling. For big data, they are still managing another complex system: the traditional big data platform.

In the early days, an enterprise that wanted a big data platform had to purchase at least a dozen servers and hire specialists to install each big data component. After installation, it still needed a development platform and an operation and maintenance platform, and had to buy various tools. The costs of construction and use, the barrier to entry, and the decision-making risk were all relatively high.

Over time, the shortcomings of the traditional big data platform became apparent. Multiple departments share clusters without resource isolation or limits and interfere with one another. The system relies on complex manual deployment, so operation and maintenance costs are high. Resource utilization is low, and integration with existing digital systems is difficult. There is no standard release process for big data components, so customers cannot build up independent data capabilities.

Migrating such a big data platform to K8s can solve the problems above. It is against this background that the term Data on Kubernetes emerged.
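One of the problems listed above, departments sharing a cluster without isolation or limits, is the kind of thing Kubernetes addresses with namespaces and per-namespace resource quotas. The plain-Python sketch below only illustrates the admission idea behind a quota; `TenantQuota`, `QuotaExceeded`, and the department names and core counts are hypothetical and do not reflect any Kubernetes API.

```python
class QuotaExceeded(Exception):
    """Raised when a request would push a tenant past its CPU limit."""


class TenantQuota:
    """Tracks CPU cores used per tenant against a fixed per-tenant limit."""

    def __init__(self, limits):
        # limits: mapping of tenant name -> maximum cores (hypothetical values)
        self.limits = dict(limits)
        self.used = {tenant: 0 for tenant in self.limits}

    def admit(self, tenant, cores):
        """Reserve cores for a tenant, or raise if the quota would be exceeded."""
        if tenant not in self.limits:
            raise KeyError(f"unknown tenant: {tenant}")
        if self.used[tenant] + cores > self.limits[tenant]:
            raise QuotaExceeded(
                f"{tenant} requested {cores} cores, "
                f"only {self.limits[tenant] - self.used[tenant]} left"
            )
        self.used[tenant] += cores

    def release(self, tenant, cores):
        """Give cores back when a workload finishes (never below zero)."""
        self.used[tenant] = max(0, self.used[tenant] - cores)


if __name__ == "__main__":
    quota = TenantQuota({"marketing": 8, "finance": 4})
    quota.admit("marketing", 6)
    try:
        quota.admit("marketing", 4)  # would exceed marketing's 8-core limit
    except QuotaExceeded as err:
        print("rejected:", err)
    quota.admit("finance", 4)  # finance is unaffected by marketing's usage
    print(quota.used)
```

Because each department is checked against its own limit, one team's heavy job is rejected at admission time instead of starving its neighbors.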

KubeCon North America wrapped up in October last year. For the first time, it included a dedicated session on Data on Kubernetes, covering how to run data applications on Kubernetes, which indirectly shows how much attention this emerging field is receiving. The industry report compiled by the DoK community asks enterprises using Data on Kubernetes why they migrate these applications to Kubernetes; the top two reasons are consistent management and simpler management.

This is the context in which a cloud-native big data platform that customers often describe as "alive" and "pure" has emerged. Independently developed by Zhiling Cloud (Zhilingyun), the Kubernetes Data Platform (KDP for short) is the first containerized cloud-native big data platform on the market that can be deployed entirely on Kubernetes.

 4 

Kubernetes transformation approach: rebuild every component on K8s to create a truly cloud-native big data platform

Many companies in the industry are building cloud-native big data platforms; the main difference lies in the route they take. KDP can be called a pure K8s cloud-native big data platform because it is currently the first publicly available containerized big data platform on the market built entirely on K8s.

Big data platforms built entirely on K8s are already in use in Silicon Valley, and the trend is clear. Many domestic vendors, however, are still dealing with the original problems of traditional big data platforms. Generally speaking, domestic users prefer a smooth migration from their existing architecture, and vendors are more cautious.

For example, some vendors develop big data platforms on top of their own scheduling and distributed computing systems. Although they are also adapting to K8s and moving part of the scheduling work onto it, most of their components still run on the original big data platform, so they have not truly delivered a data platform on a cloud-native architecture.

So although most big data companies have done a lot of work on K8s, what sets Zhiling Cloud apart is that it has built the first real K8s cloud-native big data platform. The word "real" is emphasized because every component in the platform, not just some of them, has been rebuilt as containers and brought under standard K8s management.

The value of this is obvious. Even across different environments, as long as the underlying infrastructure is a K8s environment, the big data platform can be deployed smoothly without repeatedly reconfiguring the physical infrastructure and without modifying code.

In addition, the "cloud-native big data platform" is underpinned by a globally shared resource pool. Users can migrate existing systems into the pool to achieve higher resource utilization. At the same time, the cloud-native architecture separates storage from compute, so hot and cold data can be managed separately: for different application scenarios, users can choose different storage media, such as hard disk drives, solid-state drives, and object storage, to reduce storage costs.
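As a rough illustration of hot and cold data tiering, the sketch below picks a storage medium from how recently a dataset was accessed. The article does not describe how KDP makes this decision, so the `choose_tier` function, the 7-day and 90-day thresholds, and the tier names are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: a real platform would tune these per workload.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=90)


def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Map a dataset's last access time to a storage medium.

    Recently used (hot) data goes to SSD, less recent data to HDD,
    and cold data to object storage, the cheapest of the three.
    """
    age = now - last_accessed
    if age <= HOT_WINDOW:
        return "ssd"
    if age <= WARM_WINDOW:
        return "hdd"
    return "object-storage"


if __name__ == "__main__":
    now = datetime(2023, 2, 1)
    for name, accessed in [
        ("clickstream_today", datetime(2023, 1, 30)),
        ("orders_last_quarter", datetime(2022, 12, 1)),
        ("logs_2020", datetime(2020, 6, 15)),
    ]:
        print(name, "->", choose_tier(accessed, now))
```

Because compute no longer lives on the same machines as the data, a dataset can move between tiers like this without the jobs that read it being rescheduled.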

KDP also lets users remove their dependence on Hadoop entirely and run all workloads directly in the K8s environment, which unifies resource management, makes multi-tenant billing easier, and greatly reduces operation and maintenance costs.

In short, cloud-native big data platforms are much talked about, and this time there is finally a real, working one to see.

- FIN -       

Origin blog.csdn.net/LinkTime_Cloud/article/details/128979427