Big data should not miss out on the dividends of cloud-native technology

In August 2021, Databricks received US$1.6 billion in Series H financing, just 7 months after the US$1 billion Series G financing. After this round of financing, Databricks was worth US$38 billion and became a super unicorn.

Databricks is a big data software company that provides Spark-based cloud services for data integration, data cleansing, data management, and more. Its success has given great confidence to many Chinese entrepreneurs on the other side of the ocean, among them Peng Feng, CEO and co-founder of Zhilingyun.

Databricks' work at the data layer has greatly encouraged Peng Feng, who is very optimistic about the company. He worked on big data for many years at Silicon Valley companies such as Twitter and Ask.com. In 2016, he returned from Silicon Valley and founded Zhilingyun, which mainly provides big data products and services.

In May of the same year that Databricks achieved great success, Zhilingyun optimized and upgraded its product, added support for Kubernetes, and launched the Kubernetes Data Platform (KDP for short), a cloud-native data platform based on open source technology. KDP greatly simplifies the deployment of big data applications and paves the way for them to embrace cloud-native technologies. This is an innovative and bold move. There are many open source big data components, and the technical threshold is very high. Kubernetes is also known for its complexity. When big data meets Kubernetes, the challenges are easy to imagine. Why launch such a platform? And in such a complex situation, how did Zhilingyun build one that is reliable and easy to use? Peng Feng answered these questions.

The cloud-native big data platform arrives

There is no doubt that cloud native is the direction in which technology is developing. Thanks to advantages such as rapid deployment and cross-platform support, cloud-native technology has spread rapidly in recent years. According to the CNCF 2022 survey report, 44% of organizations have deployed cloud-native technologies on a large scale in production, 35% have partially deployed them in production, and fewer than 10% have not tried cloud-native technologies at all.

Different types of applications have embraced cloud-native technology at different speeds. Stateless applications (such as web services) were the first to move, while stateful applications (such as databases and big data) have been relatively slower. The reason is that early Kubernetes did not support stateful applications such as databases and big data very well.

"But this trend is certain. Whether it is a stateless web application or a stateful database or big data application, it will eventually become cloud-native." Peng Feng said.

Driving this trend are the many benefits that cloud-native technologies bring. Once an enterprise has moved part of its business to cloud native and tasted the benefits, making the data platform cloud-native as well is the natural next step. Peng Feng believes that running a big data platform on Kubernetes brings four benefits:

First, unified management. All tools are unified, and all policy scheduling is also unified.

Second, efficient use of resources. In the past, the big data platform and other business platforms had separate resources, and capacity had to be reserved for both. Now the resources are pooled and workloads are deployed together, so resources can be shared, which greatly improves utilization.

Third, elastic expansion. Traditional big data platforms rely mainly on manual work, and expanding them also requires human involvement. With Kubernetes, deployment and expansion can be automated.

Finally, simpler operation and maintenance and a more stable system overall. Because all applications run on Kubernetes, the operation and maintenance tools built for Kubernetes can also be used for big data applications.
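The resource-sharing argument can be illustrated with a small calculation. The Python sketch below compares two separately sized platforms with one shared pool. The hourly demand figures are invented for illustration (an online workload that peaks by day and a batch workload that peaks at night) and are not measurements from KDP or any real cluster.

```python
# Hypothetical hourly CPU demand (cores) over one day. The numbers are invented
# for illustration and are not measurements from KDP or any real cluster.
ONLINE = [20, 20, 20, 20, 30, 40, 60, 80, 90, 100, 100, 100,
          100, 100, 100, 90, 90, 80, 80, 60, 40, 30, 20, 20]   # busiest by day
BATCH = [100, 100, 100, 100, 90, 60, 30, 20, 10, 10, 10, 10,
         10, 10, 10, 10, 20, 20, 30, 40, 60, 80, 100, 100]     # busiest at night


def utilization(total_demand, capacity, hours):
    """Average share of the reserved capacity that is actually used."""
    return total_demand / (capacity * hours)


hours = len(ONLINE)
total_demand = sum(ONLINE) + sum(BATCH)

# Separate platforms: each one reserves enough capacity for its own peak.
separate_capacity = max(ONLINE) + max(BATCH)

# Shared pool: capacity only has to cover the peak of the combined demand.
combined = [o + b for o, b in zip(ONLINE, BATCH)]
shared_capacity = max(combined)

separate_util = utilization(total_demand, separate_capacity, hours)
shared_util = utilization(total_demand, shared_capacity, hours)

print(f"separate: {separate_capacity} cores, {separate_util:.0%} utilized")
print(f"shared:   {shared_capacity} cores, {shared_util:.0%} utilized")
```

The gain depends entirely on how well the peaks of the two workloads interleave. If both peak at the same hour, the shared pool needs as much capacity as the two separate platforms and pooling saves nothing.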

According to Peng Feng, Zhilingyun KDP is the first publicly available containerized big data platform based on Kubernetes, and also the first true Kubernetes cloud-native big data platform on the market. The word "true" is emphasized because every component in KDP has been rebuilt as a container and brought under the standard Kubernetes management system.

Why now?

Kubernetes has become the de facto standard of cloud computing and the best partner of cloud native. Adding Kubernetes support and upgrading the product, which makes KDP the first containerized cloud-native big data platform on the market that can be deployed entirely on Kubernetes, therefore follows the direction of the technology. Two reasons lie behind Zhilingyun's decision. First, Kubernetes' support for stateful applications is maturing, and many databases and big data software packages have begun to support Kubernetes. In 2021 in particular, two landmark events took place in the big data field: Apache Spark supported Kubernetes as of March, and Kafka publicly supported it as of May, which means that the core big data components now support Kubernetes.

Second, market acceptance of Kubernetes has reached a stage at which running big data on Kubernetes makes sense. Many top customers are now eagerly looking for such a solution.

KDP supports all mainstream open source big data components, such as HDFS, HBase, Spark, Flink, and Kafka, and enterprises can choose among them according to their needs. Peng Feng said that one of Zhilingyun's most important pieces of work was to build a unified middle layer between Kubernetes and these open source components, connecting the components with one another and managing and scheduling them in a unified way.

This is not easy. Standardizing big data components, unifying resource management, and running all workloads in a Kubernetes environment are all relatively complex tasks. For example, sharing user accounts across Hadoop, Hive, and Spark requires complex manual configuration on a traditional big data platform. KDP, being based on Kubernetes, can unify and standardize user management, and it integrates easily with existing systems.

Peng Feng explained that what Zhilingyun provides is in effect a middle management layer. Such a layer used to be hard to build because Hive, Spark, Kafka, and the rest each have their own release methods; big data components were not standardized, so unified management was very difficult. With Kubernetes, all release management can be standardized, which makes a middle management layer possible, and KDP is that layer. To use an analogy, everyone has used the Windows resource manager (File Explorer); KDP is like a resource manager for big data components. It manages all of them and makes them more convenient to use, which greatly improves operating efficiency and reduces operation and maintenance costs.
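The idea of a middle management layer can be sketched in a few lines of Python. This is a toy model and not KDP's actual design: the `ComponentSpec` schema, the `ManagementLayer` class, and the image names are all hypothetical. The point it shows is that once every component is described in one uniform format, a single set of operations can manage all of them.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentSpec:
    """Uniform description of a big data component (hypothetical schema)."""
    name: str
    image: str          # container image; the values used below are placeholders
    replicas: int = 1


@dataclass
class ManagementLayer:
    """Toy registry that applies the same operations to every component."""
    components: dict = field(default_factory=dict)

    def deploy(self, spec: ComponentSpec) -> None:
        # Every component is registered the same way, whatever it is.
        self.components[spec.name] = spec

    def scale(self, name: str, replicas: int) -> None:
        # One scaling operation serves Hive, Spark, Kafka, and the rest.
        self.components[name].replicas = replicas

    def status(self) -> dict:
        return {name: spec.replicas for name, spec in self.components.items()}


layer = ManagementLayer()
for spec in (
    ComponentSpec("hive", "example/hive:placeholder"),
    ComponentSpec("spark", "example/spark:placeholder", replicas=2),
    ComponentSpec("kafka", "example/kafka:placeholder", replicas=3),
):
    layer.deploy(spec)

layer.scale("spark", 4)
print(layer.status())
```

In a real system the uniform description would be a Kubernetes resource and the operations would be carried out by controllers; the in-memory dictionary here only stands in for that machinery.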

Figure: KDP management interface diagram

Specifically, KDP does four things. It standardizes configuration management: a unified Kubernetes configuration-file format is used for all big data components, which simplifies integrating them with a Kubernetes cluster. It uses resources efficiently: cluster resources form a shared pool in which real-time and offline jobs are deployed together, raising cluster resource utilization to 60%, compared with 30% on a traditional big data platform. It scales elastically: Kubernetes' elastic scaling is used to expand computing and cluster resources dynamically when compute jobs hit performance bottlenecks. And it simplifies operation and maintenance: following the standard Kubernetes Operator pattern, a unified operations interface handles deployment, upgrade, expansion, backup, and other operations for big data components, improving operational efficiency.
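Elastic expansion can be made concrete with the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The Python sketch below implements that rule. The article does not say whether KDP relies on this autoscaler, so treat that as an assumption; the function name, the replica bounds, and the sample numbers are illustrative only, and the autoscaler's tolerance band is left out for brevity.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count suggested by the Horizontal Pod Autoscaler rule.

    The metric can be anything measured per pod, such as average CPU
    utilization in percent. The bounds are illustrative defaults.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp the result to the configured minimum and maximum.
    return max(min_replicas, min(max_replicas, raw))


# Four workers running at 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))

# The same workers at 30% CPU: scale in.
print(desired_replicas(4, 30, 60))
```

Because the rule is proportional, a workload that doubles its per-pod load asks for roughly twice the replicas, up to the configured maximum.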

KDP arrives at the right time

Hadoop was the big data platform of choice for most enterprises for a long time. With the gradual decline of Hadoop, especially after Cloudera announced that it would no longer maintain and upgrade its Hadoop distributions CDH and HDP, big data platforms have diversified, which makes the choice harder for users. On top of that comes the complexity of big data technology itself. On open source big data platforms in particular, each component has its own installation process and operation and maintenance mechanism, and installing and operating them demands a lot from technical staff. Moving these components to Kubernetes presents considerable challenges.

The pain points of users are also opportunities for entrepreneurs. While many people were still worried about the maturity of the technology, Zhilingyun, an innovative company rooted in the big data market, followed the technology trend and built a cloud-native big data platform. The step can be described as bold and resolute.

Peng Feng said that, compared with other big data platform products on the market, the biggest highlight of KDP is that it is a containerized cloud-native big data platform that can be deployed entirely on Kubernetes. It brings big data components and data applications into the Kubernetes management system and standardizes system management. At the same time, KDP can be connected and adapted quickly and easily to customers' existing systems and architectures.

"We do not want to provide enterprises with an independent big data technology architecture or basic capabilities, but to add data capabilities to their existing development data platform, which is lighter and more in line with the trend toward cloud-native integration and unified management." Peng Feng said.

Since its launch, Zhilingyun KDP has received positive responses from leading customers in the market. Peng Feng said that KDP will be open-sourced in the future, with a business model based on consulting and service fees. His goal is to enable more companies to migrate their data platforms to Kubernetes through KDP while providing enterprise-level security, operations management, and development tool support, and ultimately to turn KDP into a platform for releasing solutions.

"If usage of this platform is large enough, we can provide users with semi-finished AI modules and big data modules. Users can combine AI and data capabilities into the business applications they need in a very simple way. This is our ultimate goal." Peng Feng said.

The migration of big data applications to cloud-native platforms is also an obvious trend. Gartner forecasts that the share of data applications deployed on cloud-native platforms will rise from 30% in 2021 to 95% in 2025. CNCF survey data corroborates this: according to CNCF's 2022 market survey, 71% of organizations run databases on Kubernetes, a year-on-year increase of 48%, and 35% run big data workloads, a year-on-year increase of 36%. Judging from these figures, Zhilingyun KDP has arrived at the right time.

About LinkTimeCloud

Zhilingyun is the innovative leader of cloud-native big data technology in China. It provides enterprise customers with a series of cloud-native DataOps products built on its cloud-native big data platform, including a cloud-native data integration and development platform and a cloud-native data asset operation platform. Through its products and services, Zhilingyun helps enterprises build data and AI middle platforms, close the loop on business data capabilities, establish a digital operation system, and ultimately complete data-driven digital transformation.

Zhilingyun has served many well-known enterprises at home and abroad in energy, education, healthcare, the Internet of Things, finance, and other fields. It also works closely with many partners in the cloud-native ecosystem, combining their respective strengths to provide enterprise customers with more valuable cloud computing and big data products and technical services.

- FIN -


Origin blog.csdn.net/LinkTime_Cloud/article/details/130418103