The road to a breakthrough for the first Chinese-led Apache top-level open source project!

[CSDN Editor's Note] Open source has surged in popularity in recent years, and open source entrepreneurship along with it. In a business direction that has been especially hot over the past two years, how can open source entrepreneurs ride the waves of a red ocean and secure a place in the industry? Apache Kylin, the first top-level project contributed to the Apache Software Foundation (ASF) under Chinese leadership, has explored this question effectively. The author of this article, Li Yang, is co-founder and CTO of Kyligence and a co-creator and PMC member of Apache Kylin. Here he shares in depth Kyligence's thinking and practice in exploring the "non-functional value" of open source entrepreneurship.

Author | Li Yang Editor | He Miao

Produced | "New Programmer" editorial department

If one word could describe the current open source market, it would be "craze". As a member of the industry, I am very happy to witness this enthusiasm. New technologies such as the Internet, cloud computing, big data, the Internet of Things, and artificial intelligence continue to develop and are gradually converging with open source, supporting a wide variety of application scenarios. As an innovation engine of the software industry, open source keeps growing into a powerful model of technological innovation. Today, finance, retail, manufacturing, telecommunications, and other industries have embraced open source, and it has become an important channel for technological innovation. This article draws on the entrepreneurship and practice behind the open source project Apache Kylin and its commercial counterpart Kyligence to share some experience that I hope will be helpful.

Open source development has changed from a blue ocean to a red ocean

Apache Kylin started early. When it graduated from the Apache Incubator in 2015, it became the first top-level project at the Apache Software Foundation (ASF) contributed under Chinese leadership. So far, more than 1,500 companies around the world are using Kylin. At its core, Kylin is a multidimensional database and a specialized OLAP engine. We hope that, through intelligent technology and products, enterprises can put valuable data to work in their digital transformation, and so realize our vision of changing the way people use data.

Because we are on the front lines of open source and its commercialization, we feel changes in the open source market very directly. With the explosive growth of open source projects, the number of open source contributors has risen rapidly, and the market for commercial open source companies is more active than ever. The AI & DATA Landscape compiled by Matt Turck, a partner at FirstMark, the largest venture capital firm in New York, is shown in part in Figure 1. It shows that open source projects in many vertical tracks have surged, turning a blue ocean into a red one.

Figure 1 Part of AI & DATA panorama

In the open source OLAP field that Apache Kylin focuses on, the number of new projects grew exponentially over the three years from 2019 to 2021. I often joke that when Kylin graduated to a top-level Apache project in 2015, there seemed to be no competitors in the industry, and we were the only ones solving this problem. In just a few years, many good start-ups have emerged in the United States and China.

In addition, looking at China's overall environment, the good news is that policy is actively and strategically encouraging enterprises to embrace open source. The "14th Five-Year Plan for National Economic and Social Development of the People's Republic of China and Outline of Long-term Goals for 2035" included open source in its top-level design for the first time, supporting the construction of underlying digital technologies and continuing to cultivate new momentum for digital development.

According to the "2021 China Open Source Development Blue Book", open source contributions from Chinese developers, enterprises, and research institutions continue to grow worldwide and are earning more respect and recognition, and the reputation of Chinese open source is improving year by year. China's overall position in the global open source ecosystem will rise accordingly, and it will gradually take a leading position in some areas of strength. More importantly, open source projects and commercial products built on them are gradually being deployed in important industries. This means not only that open source has moved from open technology to open industry, but also that the market's acceptance of open source has greatly improved, which is of far-reaching significance. It also reminds me that technological improvements and changes must ultimately land in application scenarios. This is the "destiny" of technological development.

"Data is the oil of the future" is surely a familiar saying. Using data to drive business growth will be the main force behind fine-grained enterprise operations in the future. However, because data sources are complex and technologies and platforms are hard to integrate, the road to enterprise data management and analysis is a tortuous one. At present, users of the open source project Apache Kylin come mainly from finance, retail, Internet, manufacturing, telecommunications, and other enterprises in China and abroad, and financial and Internet enterprises invest at least tens of millions, and up to 100 million yuan, a year in data infrastructure.

Given the industry needs and pain points around data-driven business growth, data will be used ever more heavily. When data volumes increase dramatically, how should companies use technology to process massive data? How can IT costs be optimized? How should the IT organization be restructured so that employees can access and use data easily? Many technical difficulties still lie behind these questions.

Exploring the "non-functional value" of open source entrepreneurship

Today, the advantages of open source in technological innovation, efficiency, and cost reduction are increasingly prominent, and it has become the technical foundation in many fields. At the same time, the explosion of digital scenarios in China is widening the demand gap for information technology stacks. As an engine of technological innovation, open source will continue to drive technological development in various fields and meet all kinds of users' needs for "innovative technology + agile iteration". Although discussion of open source is in full swing, emerging technologies and emerging fields often face a talent shortage, whether at the technical, market, or product level.

The obstacles to open source adoption can be viewed from another angle. First, the talent problem may not be a problem of people as such, but a problem of cost. Should an enterprise rely on its own technical staff and bear the cost of running open source software itself, or should it obtain stable and reliable service by purchasing enterprise-grade commercial open source software? This is one major choice. Second, another major obstacle to the penetration of open source into enterprises is technology selection. As mentioned earlier, there are many kinds of open source projects and competition is fierce. It is no exaggeration to say that there are nearly twenty open source alternatives in the field of data analysis alone. Each may have both an open source edition and an enterprise edition, so enterprises often spend a great deal of effort selecting technologies and evaluating the results. These two major choices are the "enterprise dilemmas" that we have encountered in practice.

Open source and the commercialization of open source are both established paths in the market. As entrepreneurs, we are not anxious about this. We only need to define the boundary between the two to find our own footing.

If a business is built on open source, what is the foundation of its technical development? Safety, reliability, and stability.

Can you imagine hardware being open source as well? In fact, hardware has its own open source market. Could a complete vehicle be fully open source, from hardware design to the underlying software architecture? If such a car existed and could be 3D printed, would you print one for your own use? I suspect that almost no one would. Why? Because it does not meet the hard requirements of safety, reliability, and stability. Returning to the open source supply chain, what will end customers pay for? In my view, they are not paying for a feature. In the field of data analysis, functionally equivalent alternatives already exist. Enterprise users ultimately pay for the security, stability, and reliability of the system, that is, for the non-functional parts.

Enterprise procurement also has to weigh "non-functional value". Beyond technology selection, talent support, and features, enterprises value "safety, stability, and reliability". Complexity itself is the enemy of "safety, stability, and reliability." At this new level, vendors that can solve non-functional problems will enjoy greater profit margins.

In the cloud native era, requirements for data usage and management are changing dramatically. For enterprises, a platform that cannot move to the cloud will find it increasingly difficult to adapt to drastic changes that may occur in the external environment at any time. Meeting the needs of enterprise data asset management, fixed and self-service analysis, and data services has become more urgent. As a result, the threshold for using data keeps falling, and elastic, flexible cloud-native architectures are in high demand. So how do open source start-ups meet this kind of demand? We take our experience serving one cloud company as an example and analyze its scenarios and pain points, hoping to offer a useful reference to SaaS companies.

The company is a large provider of website-building SaaS services with over one million users. Its case is a typical website traffic analysis scenario: the business model is relatively stable, but the technical challenges are considerable. As shown in Figure 2, the company began using Apache Kylin as early as 2017 to build a tool called Analytics Platform. Its capabilities cover classic analysis scenarios and models for customer traffic and website behavior, including clickstream analysis, page views (PV), unique visitors (UV), access devices, traffic sources, and retention. Because the company has a large number of customers worldwide and consumer-side users have very little tolerance for slow queries, most queries must return within one or two seconds. This is a common challenge for To-C SaaS providers offering data services.

Figure 2 Analysis of SaaS enterprise pain points and demands

In addition, once a user has finished building a website, Analytics Platform, the backend data query and reporting service, becomes an important touchpoint for improving user retention. Since the users are mainly non-technical, they need analysis tools that are easy to use and tightly integrated with the product. Third-party analysis tools are often more complex and costly to learn, so users depend heavily on the Analytics Platform built into the product. Operating such an analysis service is also very demanding. To keep the service uninterrupted, it must be maintained around the clock, 24/7. To ensure user satisfaction and retention, the platform must keep its data services highly stable. With open source Kylin, the reliability of tools and services depends more on the company's own technical capabilities, and the company has to keep optimizing the total cost of ownership (TCO). That means accounting not only for the cost of cloud resources but also for the cost of big data engineers, and under a traditional siloed ("chimney-style") build-out, many data engineers are needed.

After evaluation and testing by the Kyligence service team, the company decided to migrate to the Kyligence Cloud platform. Its non-functional value advantage is shown in Figure 3.

Figure 3 Comparison of the scenario architecture before and after migration

  • Unleash IT productivity. Models can be optimized automatically based on SQL queries. At any time while a model is in use, its design can also be adjusted manually, for example by adding or removing related tables, or analysis dimensions and indicators.

  • Cost optimization. The previous deployment was Hadoop + Kylin on the cloud. After migration, the main source of the overall reduction in operating cost is the optimization of the Hadoop cluster: the traditional Hadoop big data layer is replaced by a cloud-native architecture, which cuts substantial hardware costs as well as operation and maintenance costs.

  • Effective support for high concurrency. The pre-computation capability behind Kyligence Cloud's multidimensional model provides stable support. Because query calculations are completed in advance, the amount of computation during online serving remains stable and is almost independent of the volume of raw data.
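
The pre-computation idea in the last point can be illustrated with a toy sketch. This is not how Kylin or Kyligence Cloud is implemented; the event fields and the `build_cuboids` and `query` helpers are hypothetical names chosen for illustration. The sketch only shows why serving cost stays flat: aggregates for every combination of dimensions are built once offline, and each online query becomes a dictionary lookup that never touches the raw events.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical raw clickstream events: (country, device, page_views).
EVENTS = [
    ("US", "mobile", 3),
    ("US", "desktop", 5),
    ("CN", "mobile", 7),
    ("CN", "mobile", 2),
    ("DE", "desktop", 4),
]
DIMENSIONS = ("country", "device")


def build_cuboids(events, dimensions):
    """Offline step: pre-aggregate page views for every subset of dimensions."""
    cuboids = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(range(len(dimensions)), size):
            totals = defaultdict(int)
            for event in events:
                key = tuple(event[i] for i in dims)
                totals[key] += event[-1]  # last field is the measure
            cuboids[tuple(dimensions[i] for i in dims)] = dict(totals)
    return cuboids


def query(cuboids, **filters):
    """Online step: answer SUM(page_views) with a lookup, scanning no raw events."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return cuboids[dims].get(key, 0)


CUBOIDS = build_cuboids(EVENTS, DIMENSIONS)
print(query(CUBOIDS, country="CN"))
print(query(CUBOIDS, country="US", device="mobile"))
print(query(CUBOIDS))  # grand total, read from the empty-dimension cuboid
```

The trade-off in this sketch is the usual one for pre-computation: the number of cuboids grows with the number of dimension combinations, so storage and build time are spent up front in exchange for query work that depends on the size of the aggregates rather than on the raw data.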

To sum up, beyond meeting functional requirements, open source start-ups need to pay special attention to one non-functional value point: giving enterprises the ability to build digital models of their business and delivering automated data services and management.

Finding the right position is key

The development of open source technology must break through numerous technical obstacles, while open source entrepreneurship requires establishing capability boundaries and finding a precise positioning.

Finding the right positioning has two parts. One is identifying your own strengths, and the other is identifying the target customers and market. We discussed the talent issue earlier. In fact, potential customers fall into two categories. The first is technology-based industries, such as the Internet and automotive sectors. Companies in these industries have their own technical backbone and see themselves as technology companies, so they avoid buying technology from outside unless absolutely necessary. The second is traditional industries such as finance, energy, and retail, whose focus is on solving industry problems. Their value lies in the business, so for them technology is a form of support, a kind of infrastructure. As long as a technology truly solves the problems of safety, stability, and reliability, they are willing to pay for it. Therefore, a start-up needs to establish the most valuable non-functional part of its offering; that is, it needs to find the right positioning and the value-added advantage that comes with it.

Since its inception, Kylin has had relational database capabilities and is often compared with other relational OLAP engines, but what really sets it apart is its multidimensional model and multidimensional database capabilities. In 2022, we conducted an in-depth review covering Kylin's capabilities and strengths, the positioning and goals of the open source and commercial editions, and industry trends and needs. As shown in Figure 4, considering Kylin's nature and its wide range of future business uses (not only technical ones), the team clearly positioned Kylin 5 as a large-scale data analysis platform that is unified, flexible, high-performance, scalable, and cloud native. On it, users can carry out many kinds of data analysis, and connect, support, or replace multiple data sources, query interfaces, and compute engines. Kylin will also become a solid and reliable foundation for enterprises' massive data analysis and indicator management, making big data understandable and affordable for ordinary people and ultimately realizing data democratization.

Figure 4 The orange area is the focus of Apache Kylin (Picture source: Apache Kylin)

In addition to product and technology positioning, customer service is also very important for a start-up. Kyligence, the commercial open source edition, demands "stability first and security zero." Whenever a new security vulnerability appears, the company raises a top-level red alert, and the entire product and R&D organization mobilizes to resolve the issue immediately and to inform each customer whether the vulnerability affects their current production environment. If there is no direct impact, we still conduct multiple reviews and prepare contingency plans to nip problems in the bud. If there is an impact, we respond and resolve it immediately.

To sum up, open source entrepreneurs need to think harder about two questions: "What is the core value of the enterprise? What problem does it help customers solve?" The most common misconception is that one's core value lies in providing customers with a technology that is not yet available elsewhere. That may be true, but only briefly. With the full collaboration and information exchange of open source, technology advances rapidly, and any new technology may quickly be matched. It is worth thinking deeply about your own value in the entire open source software ecosystem. The value that gets users to pay is usually not a functional point but a non-functional one. If you find this non-functional value, your open source business may become easier.


Origin blog.csdn.net/CrisAppleYan/article/details/128663132