Hackers and Customers: Can Open Source Software Be Commercialized?

After burning my free time for the past week, I have finally finished customizing my own Linux desktop environment.

Although Linux is often compared with Windows and Mac OS X, Linux itself is just an operating system kernel. To serve as a daily desktop system, it needs many additional functions. For example, the Linux kernel only provides the ability to recognize a network card and communicate with it. How to establish and maintain a network connection, how to connect automatically at boot, and how to set up a network proxy are all left to the user to configure, whereas Windows and Mac OS X offer mature, friendly graphical management interfaces for these tasks. The same is true for partitioning the hard disk, formatting file systems, setting up the bootloader, installing graphics drivers, and configuring an input method: Linux users have to customize all of these themselves.

This customization is not done in one single way that gives everyone the same experience. Because the UNIX philosophy separates mechanism from policy, and because much of the software in the Linux ecosystem is free software, users often find that "there is more than one way to solve a problem". In the extreme case, there is even more than one "framework for solving the problem". Take input methods as an example: the Linux ecosystem has at least two input method frameworks, IBus [1] and Fcitx5 [2]. A framework alone is obviously not enough to actually type, so you also need a concrete input method implementation, integration with the different graphical interfaces (toolkits), and supporting pieces such as dictionaries that extend the input method's capabilities.

Such an environment gave birth to two major characteristics of the Linux ecosystem.

The first is that software in the Linux ecosystem tends to be open source. This is not only a legacy of the ideals behind the birth of GNU/Linux, but also the best fit for such a diverse ecosystem. For example, the Sogou input method, which is popular on other desktop and mobile systems, supports only Ubuntu-family systems in the Linux ecosystem (support that was driven by Xinchuang, China's domestic IT innovation push), ships only deb packages, and is compatible only with the Fcitx input method framework. When Ubuntu upstream tried to change the default input method framework from Fcitx to Fcitx5, Sogou as a company had no manpower to support the change, nor had it opened its source code so that the community could, and it ended up delaying the upstream update plan [3]. By contrast, the open source Rime input method got immediate support from community enthusiasts, who jointly completed the integration with the new framework.

The second is that Linux has a huge number of distribution providers. As mentioned earlier, building a complete desktop environment or server on top of the Linux kernel requires a series of customizations. This customization gives users great freedom, so that almost every problem has more than one solution. But much of the time, users just want an integrated system like Windows or Mac OS X that meets their personal needs on the desktop or their production needs on the server. The main job of a Linux distribution is to package a set of applications and toolchains and give users an out-of-the-box "operating system".

Among all Linux distributions, Arch Linux [4] and Ubuntu [5] are two of the major ones.

The Ubuntu distribution is commercially backed by Canonical. It provides a user-friendly graphical installer and offers dedicated out-of-the-box paths for people trying it out, personal desktop users, and enterprise server users. Because Ubuntu itself customizes a considerable part of the application frameworks and applications, the other distributions in the Ubuntu family are more like different PC brands that pre-install Windows and then add some of their own bundled software.

In contrast, the entirely community-maintained Arch Linux distribution is minimalist. It provides only the most basic installer and the AUR as a central software repository, supplemented by a comprehensive wiki that helps users customize the Linux system themselves. Apart from sharing the AUR (and even there, Manjaro Linux has chosen to build its own repositories), distributions in the Arch Linux family can differ greatly, from the graphical interface down to the choice of software for each function.

From these examples, we can see that users of open source software clearly fall into two categories. One kind uses the Arch Linux distribution, values the freedom to use and modify software, and has both the willingness and the ability to customize it; we can call them Hackers. The other kind uses the Ubuntu distribution and expects the provider to deliver a packaged solution directly, or needs a vendor who will take responsibility at the commercial level; we can call them Customers.

The commercialization of open source software mainly targets the needs of customers, and revenue comes from satisfying those needs. Hackers mostly contribute to the prosperity of the ecosystem and do not take part directly in commercialization.

Hackers' needs

Both the free software movement and the open source movement share two basic ideas: users may freely use and distribute the software for any purpose; and users may obtain the source code, make their own derivative improvements, and release modified versions.

For hackers, the first point is the precondition for using free or open source software at all, which goes without saying. The second point, being able to obtain the source code and use a modified version, is just as crucial.

For example, when @fkxxyz used @VOID001's ssf2fcitx [6] to apply Sogou Pinyin skins to an input method under the Fcitx framework, he found some bugs, and ssf2fcitx did not support the new Fcitx5 framework. So he wrote his own version, ssfconv [7], to meet his own needs.

It is worth noting that the README of ssfconv contains this passage:

So I read his source code and found the logic quite simple. Then I looked at the various formats of fcitx custom skins and planned to work it out myself. In the end, I decided to write one myself in Python, using this project as a reference. Since fcitx5 also supports themes, conversion to fcitx5 themes is finally implemented as well!

Such motivation and the freedom to modify software to suit one's needs are typical of hackers.

The same pattern appears inside companies. Apache Doris was originally derived by a Baidu R&D team from the code of Apache Impala, and Apache Kvrocks (Incubating) began as a persistent version of Redis rewritten in C++ by the infrastructure software team at Meitu. Xiaomi ran into various practical needs while using Apache HBase; most of the code written to address them went upstream in a vendor-neutral form, and a small part containing company-specific logic was kept as an internal implementation. An e-commerce company built an internal system with message queue semantics by modifying Apache Kafka to support its business needs.

This kind of secondary development by internal R&D teams on top of open source software also fits the hacker's motivation. For the upstream community, production use and secondary development by enterprises greatly benefit the software's reputation, bug fixing, and ecosystem growth. However, once an enterprise chooses to employ a hacker team to develop and maintain software based on open source, it becomes far less likely to buy services that other providers build on the same upstream software. When I worked for a company that built a stream computing platform on Flink, I took part in a technology selection discussion with a user company. It turned out that their platform R&D team alone had more people than our whole company, support staff included. With that organizational structure, purchasing a provider's services was simply not going to happen.

Customers' needs

Unlike hackers, who are full of exploratory spirit and hands-on ability, customers usually want something simple: a direct solution to their problem. Of course, a person or organization can perfectly well be a hacker in one respect and a customer in another.

For example, although the e-commerce company mentioned above maintains a team for secondary development of its messaging system, it still uses a cloud vendor's RDS service directly for its database. Likewise, many hackers who have spent a lot of time tinkering with Linux customization also pay for Windows or Mac OS X to meet specific needs in life and work, or simply to save trouble. Wang Yin, who once enthusiastically wrote "Working completely with Linux" [8], later stopped tinkering and, in the article "Talking about Linux, Windows and Mac" [9], laid out the "user-friendliness" that customers need most.

When customers choose between open source and proprietary software, they do not care much whether they can obtain the source code or whether they have the right to modify and redistribute it. What the customer needs is a piece of software, or a complete set of services, that directly solves the problem. Free is best, but paid is also acceptable, as long as the vendor solves the problem once the money is paid.

For example, in a chat about database selection, a front-line decision-maker summed up the advantage of TiDB as "open source + easy": it directly handles data volumes beyond what stand-alone MySQL can hold and covers basic AP (analytical) query needs. I asked him in passing whether what mattered in "open source" was that the source is open or that it is free for commercial use. He answered without hesitation that free commercial use is what matters. Another participant added that "what matters is who takes the blame".

This leads to the first commercialization model aimed at customer needs: providing consulting and subscriptions.

The well-known company Red Hat started with a subscription model. Since Linux is released under GPLv2, the kernel modifications in Red Hat's Linux-based distributions have to be open as well. And since Red Hat is only one participant in the Linux ecosystem, while the core productive force of Linux is upstream, Red Hat can only follow and cooperate with the upstream community on major technical directions. Under these conditions, the business model Red Hat chose is to sell subscription services aimed at enterprise needs, covering the version stability, after-sales support, and maintenance commitments that enterprises value but the upstream community does not provide.

Later, Canonical followed suit, tailoring one-click startup paths and after-sales support both for individuals curious about the Linux operating system and for enterprises that need to deploy Linux servers in production. From the Ubuntu installation page, you can try Ubuntu from a live USB and get a first-hand experience without touching the host system, or you can choose a configuration meant for enterprise production environments. Ubuntu has also been specially customized to meet the compliance requirements of various countries and regions. Several domestic Xinchuang Linux distributions, such as Ubuntu Kylin, Deepin, and Tongxin UOS, derive from Ubuntu or its Debian lineage, which reflects the effort Ubuntu has put into enterprise use.

Let's go back to Red Hat. In addition to the Linux subscription service it started with, in recent years it has also launched OpenShift, which can be regarded as a Kubernetes distribution. OpenShift likewise focuses on out-of-the-box use and comprehensive after-sales support for enterprise users. Some customers want to catch the cloud-native wave brought by Kubernetes but cannot afford to maintain a top-tier technical team, and they want a vendor who will take full care of any problems that arise in use. For them, OpenShift or another Kubernetes distribution is an option worth considering. The same reasoning applies to customers who want to get on board with Istio and choose Tetrate's Istio distribution, or even its complete cloud-native application networking framework.

These last two cases belong half to consulting and subscription, and half to the second path of open source commercialization: providing industry solutions based on open source software.

Providing industry solutions based on open source software means that the open source software is no longer the core of what is sold. At most it is a brand with a solid reputation and a foundation the product depends on deeply. The most successful example in the industry is probably the series of solutions Databricks has built on Apache Spark. At first, Databricks partnered with Microsoft Azure, which had been slow to enter the cloud market, and bundled a performance-optimized distribution of Apache Spark on Azure. Databricks then took AI as its entry point, launched integrations such as SparkR and PySpark in the Spark ecosystem, and on the commercial side offered standardized computing resources and packaged solutions for typical AI workloads. Later, the explorations of Alluxio and JuiceFS showed Databricks the huge market for storage that lies behind Spark's computing power, so it developed Delta Lake in-house and proposed the lakehouse concept to sell new solutions. As the company gained a firm footing, Databricks also had more resources to invest in the now-popular AIOps concept on its earlier AI front, as well as in a series of peripheral products, including Delta Live Tables, needed to complete the lakehouse story.

Although Databricks has grudgingly opened up Delta Lake's code bit by bit in response to pressure from open source data lake software, the company's overall trajectory shows that the more commercially driven a product or service is, the less it has to do with open source software. Today, Apache Spark is just a basic building block buried deep inside the most profitable products and services. It is an efficient computing framework, but whether users pay for the commercial products and solutions no longer depends simply on whether Apache Spark is used.

Advantages of open source

It should be said that when customers make a choice, they do not favor a product because it is built on open source software. If the product is merely a thin repackaging of open source software, enterprise customers are more likely to question why they should pay at all. After all, zipping up VS Code and selling it on CSDN may fool some individual buyers, but the same trick hardly works on enterprises.

Commercial offerings based on open source software also compete with proprietary software. Customers lay out all the possible options and weigh the pros and cons before deciding. Customers generally do not care whether source code is available; they expect the vendor to solve the problem. Yet because open source software is free for commercial use, sales teams often struggle to answer why customers should pay. On the other hand, there are businesses that really do make money on open source software. So what advantages does the open source approach bring to commercialization?

Users can try open source software for free, which lowers the barrier to entry and increases the share of long-tail users. The logic is similar to the freemium business model, so I won't repeat it here.

The unique core advantage of open source is the formation of a huge ecosystem.

Cloudera's main commercial product, CDH, deploys and maintains a complete Hadoop-based big data suite with one click, solving the complex configuration and version compatibility problems among multiple big data components. But why would customers want a Hadoop-based big data solution in the first place? This goes back to the beginning, when Apache Hadoop appeared as open source software, attracted a large number of hackers, and gradually grew a complete ecosystem. The openness of this ecosystem incubated peripheral technologies such as HBase, Hive, and Impala. Even Spark and Kafka were originally designed for data stored on Hadoop. Vendors offering services similar to CDH abound, and almost every cloud vendor except Microsoft and Google has built one. In contrast, GFS and MapReduce, the technical origins of Hadoop, have always been tied to Google's platform. Their evolution depends on Google alone, and the only potential customers are companies determined to go with Google.

Open source is an important condition for a prosperous ecosystem. The emphasis on source code in the free software and open source movements is fundamentally a demand to share the logic by which software operates, that is, the knowledge contained in the software. The Free Software Foundation's explanation of free software [10] says that users should have the freedom to study how the software works, and the freedom to change it and distribute modified versions. Access to the source code is a precondition for both.

Imagine that Hadoop were not open source. The key data storage formats and network protocols would have to be guessed. Proprietary software usually prohibits reverse engineering, which puts developers who actually do it at legal risk. And the company could quietly change the format at any time, or introduce provider-bound encryption or verification logic. The cost for third parties to maintain ecosystem integrations or extensions would rise sharply, and they would eventually give up. Even if the vendor of the proprietary software never does any of this, the mere fact that it has the power and ability to do so is a risk large enough to deter third parties.

A prosperous ecosystem is an important factor in customers' choices. Customers choose MySQL-based RDS services because there is an ample supply of people familiar with MySQL. By staying compatible with MySQL, TiDB has gradually spread operational experience with TiDB through the DBA community. Open source code allows a large number of operations and integration tools to be freely created and shared as hackers exchange ideas, which consolidates TiDB's ecosystem and strengthens customers' trust in TiDB as a reliable solution.

StreamNative's investment in Apache Pulsar is another prime example. In the course of serving customers, StreamNative has implemented a number of common requirements as open source, including Helm charts for deploying Pulsar clusters, integrations with a range of other systems, and support for the AMQP, RocketMQ, and Kafka client protocols built on the message protocol handler framework. After the source code was released, a large number of users tried the software and gave feedback. The hacker-type users among them can report defects precisely, or even fix them and contribute the fixes upstream. In this way, multiple parties jointly polish the Pulsar ecosystem's ability to solve all kinds of problems. For a Pulsar service provider, this is at the very least a huge advantage in persuading customers to choose Pulsar.

It is worth mentioning that opening source code is not always in a company's interest, so when a company has the choice, whether to open its source code deserves careful thought.

In StreamNative's case, the main goal is to give Pulsar a firm foothold through a prosperous ecosystem. Market share and customers' confidence in Pulsar are what matter, and users currently rarely pay for integrations, so open source is a net benefit. For customers who must use a given integration, subscribing to StreamNative's services is also the most economical option.

Let's look at another example, from data integration. At first, Airbyte also relied on being fully open source to build market momentum. But after it had won part of the market, it found other vendors copying the Airbyte source code to offer equivalent services, and it promptly changed Airbyte's license to Elastic License 2.0 to prohibit such competition. However, Airbyte still licenses the code of its data integration modules and command-line tool under the MIT license. In other words, Airbyte prevents commercial competition by prohibiting other vendors from copying its server and UI code, while keeping the integration modules open to encourage all kinds of software to integrate with it as data sources and data sinks.

As mentioned above, one reason users choose CDH is to avoid being locked in to a single provider. From the provider's point of view, however, locking users in to its own products and services is optimal. We are gradually seeing that companies with the ability to change a software license, such as those behind CockroachDB and Akka, will almost certainly change it to prohibit commercial competition once their business is threatened. Considering that open source software is usually the result of a community's joint effort, such behavior amounts to cashing in on the reputation the community built. For a detailed discussion, see my earlier article "The Bait-and-Switch Pseudo-Open-Source Strategy".

On the other hand, companies like Red Hat, Confluent, and StreamNative, which build their business models on open source software, have no ability to change the license of the upstream software. And because the software was not originally created after the company was founded, the company is itself a beneficiary of open source, so there is not much of a dilemma for them.

In the future, "open source commercialization" companies will basically fall into two groups: companies that provide commercial products and services on top of existing open source software, and companies that fully own the intellectual property and selectively open their source code.

Consider first the case where the company owns the software's intellectual property. After the demonstrations by companies such as MongoDB and Elastic, far fewer companies should be willing to compete with the owner head-on. Without direct competition, the owner has no reason to replace the open source license it has advertised with one that prohibits commercial competition. The founding company as the sole commercial vendor, developing the software together with hackers and with users who do not compete commercially, will be the ultimate form in which this type of company and software survives.

The other case is where the intellectual property is held by a neutral third party, typically one of the open source software foundations. Companies then operate in a more prosperous ecosystem where customers worry less about vendor lock-in, but also in a more intensely competitive business environment. A typical path looks like this. While the company's influence on the upstream is weak (Red Hat), or before a technology gap has formed (Databricks), it survives by offering subscription services and distributions. Then, if it can open up its own line of enterprise solutions, the open source software becomes an excellent technical base and a continuing source of reputation, while the proprietary software in those solutions, or the only expert team on the market, becomes a barrier that competitors cannot cross. If a technology gap is hard to establish, the company provides more comprehensive consulting and subscription services around the ecosystem (IBM).

References

[1] IBus: https://wiki.archlinux.org/title/IBus
[2] Fcitx5: https://wiki.archlinux.org/title/fcitx5
[3] Update plan: https://bugs.launchpad.net/ubuntu/+source/language-selector/+bug/1928360
[4] Arch Linux: https://archlinux.org/
[5] Ubuntu: https://ubuntu.com/
[6] ssf2fcitx: https://github.com/VOID001/ssf2fcitx
[7] ssfconv: https://github.com/fkxxyz/ssfconv
[8] "Working completely with Linux": https://dywang.csie.cyut.edu.tw/dywang/download/pdf/linux-wangyin.pdf
[9] "Talking about Linux, Windows and Mac": https://www.yinwang.org/blog-cn/2013/03/07/linux-windows-mac
[10] Explanation of free software: https://www.gnu.org/philosophy/free-sw.en.html

Author | tisonkun

Editor | Wang Mengyu

Origin blog.csdn.net/kaiyuanshe/article/details/129002087