Talking about open source (1)

Author Shen Li

In front of the source code, there are no secrets. -- Hou Jie

Foreword

For many people, "open source" is a fashionable, feel-good term, and many companies treat open source as a KPI or as a way to promote their technology. In our view, though, most people do not do open source well, and most open source projects are poorly maintained; the discussion about Tengine on Weibo some time ago is one example. An excellent open source project takes more than publishing the source code. It also takes sustained effort to maintain: setting a roadmap, developing new features, communicating with the community, promoting adoption of the project in the community, providing some level of support for users, and so on.

At present, we have not seen any particularly good articles in China on how to run an open source project, or how to build a top-tier one. It has been more than two years since the TiDB project was founded. From the very start of development we have firmly followed the open source route, successively open sourcing the three core components: TiDB, TiKV, and PD. They have attracted widespread attention and have appeared on the front page of GitHub Trending many times. Over these two years we have accumulated some experience and lessons. Here we share some of what we have learned along the way, together with the right way to participate in open source projects (at least TiDB-related ones).

What is open source

Open-source software (OSS) is computer software with its source code made available with a license in which the copyright holder provides the rights to study, change, and distribute the software to anyone and for any purpose. -- From Wikipedia

The open source discussed in this article means open source software. In short, open source means that whoever owns the copyright to the source code allows others to access the code and use it for their own purposes, within the scope of a license. The most basic requirement is that other people can access the source code. Beyond that, what may be done with the code once it is obtained is governed by a license (which you can write yourself or adopt from someone else). A license generally covers matters such as modifying the code, adding new code, whether derivative work must also be open sourced, and patents. OK, let's write a main.py containing the single line print "Hello World!", throw it on GitHub together with a license file, and we have an open source project that meets the minimum requirements.
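
As a minimal sketch of that smallest possible project, here is what main.py might look like in present-day Python. The one-liner above is written as a Python 2 print statement; the greeting() helper and the __main__ guard are additions for illustration only, and a license file still has to be committed alongside the script.

```python
# main.py: the entire "project". A LICENSE file committed next to it
# is what actually grants other people the right to use the code.


def greeting() -> str:
    """Return the text this program prints (hypothetical helper)."""
    return "Hello World!"


if __name__ == "__main__":
    print(greeting())
```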

Why open source

Many people think that code is a software company's most valuable asset. What do you gain by letting others have that asset for free? What if a competitor takes your code and sets up a rival product? What if users simply take the source code and run it in their own environment; how do you make money then? For a technology company, the most valuable asset is actually its people. For an open source project, the core assets are an active open source community and the recognition the project has earned. Let's look at the impact of open source from these two angles.

  • Branding: Open source is clearly a very effective means of PR and branding, and most large companies do open source for this reason. It lets them publicize the company name and build a technology-driven corporate image at almost zero cost, and a well-known, positive image helps in every respect. For example, there is a well-known technology news site abroad called Hacker News. Our products have reached its front page many times and drawn a lot of attention. None of those posts were ours; they were submitted by other people who follow our products.

  • Talent acquisition: The hardest part of recruiting is judging a candidate's real ability: can they actually do the work, or did they merely pass the interview by drilling practice questions? If you could work with the person for a while and see how they handle day-to-day tasks, you would understand their abilities far better. The traditional way to get there is to somehow find someone who has worked with the candidate and ask for their opinion. That depends on luck to begin with; sometimes it takes several layers of connections to find such a person, and the answer you get is not necessarily accurate or honest. But if the candidate has contributed code to your project, the code quality is high, and communication during the contribution went smoothly, then on the one hand the person has demonstrated good soft and hard skills, and on the other hand it shows they are interested in what you are doing. A large number of TiDB's full-time employees and interns started out as contributors, to the point that we worry that if we keep hiring them there will be no community left :).

  • Community contribution: It is fair to say that without the open source community, the Internet would not be what it is today. Imagine a world without Linux, MySQL, GCC, Hadoop, or Lucene: the basic technology stack of the entire Internet would not exist (there would certainly be some other set of tools, but probably not as complete as the open source one). Countless contributors have each done their part to sustain this mutually beneficial community and to support social and technological progress. We have also received a great deal of support from the open source community, including questions, suggestions, and code from more than 140 contributors around the world. As the project grows, we believe the share of code contributed by the community will keep increasing.

  • Improving project quality: When a project is run as open source, the quality of its code is the face of the project. Everyone becomes very careful when submitting code or commenting on other people's PRs, because every move is visible to the whole world. After all, no one wants to embarrass themselves in front of others, right?

  • Significance for infrastructure software: For infrastructure software such as a database, the most important properties are correctness, stability, and performance, and the first two matter most. Ensuring them requires, on the one hand, raising quality as far as possible during development and testing and, on the other hand, wide real-world use. Only when enough people try your product, or even run it in production, do you get enough feedback and product suggestions. The tests developers can run are limited, and many scenarios, environments, and workloads are beyond what we can imagine. Feedback from real users helps us improve quality, and their suggestions help us improve usability. Only infrastructure software that has run in production for a long time can be considered up to standard.

Therefore, we believe open source is the general trend for infrastructure software. Well-known products such as Hadoop, MySQL, and Spark, and major organizations such as the Linux Foundation, the Apache Foundation, and the CNCF, all bear this out. The popular open source projects from large Chinese companies are likewise concentrated in infrastructure software, such as Baidu's Brpc, Palo, and Tera, and Tencent's PaxosStore.

What projects does PingCAP open source

Here is a brief description of what each of our open source repos does:

  • TiDB: the SQL layer of the database
  • TiKV: the distributed storage engine of the database
  • PD: the management node of the cluster
  • Docs: the English documentation of the project
  • Docs-cn: the Chinese documentation of the project

You can browse our code on GitHub and see our complete development process.

Development process in open source mode

A typical day for Xiaoshen, a PingCAP engineer:

8:00 Wake up and log in to Slack to check whether the test jobs scheduled to run last night went well, then look through the Slack channels, WeChat groups, and email for any important news

9:00 After washing up and eating breakfast, tease my lovely daughter for a while (or maybe get teased by her), then head to work

9:30 Arrive at the company and start working.

  • Open your computer and see what's new on GitHub
  • Check whether anyone has commented on your PRs. If so, address the comments as soon as possible; if nobody has looked yet, @ the relevant colleagues and ask for a review
  • See whether anyone else's PR needs your review, especially those where you have been @-mentioned
  • Put on your headphones and start writing some code
  • Someone @-mentions you on Slack: reply promptly
  • People are discussing a problem in a Slack channel you follow and it sounds interesting, so join in for a while
  • A colleague is planning a new feature and has written a design document; open it, read it through, and leave a few comments

12:00 Embarrassingly hungry; round up colleagues for lunch and chat about technology and gossip on the way

13:00 Back from lunch; check email, Slack, and WeChat messages, and deal with anything urgent

13:30 Take a nap

14:00 Nap over; grab a cup of coffee and start the afternoon's work, keyboard clattering...

15:30 Participate in the design review meeting of colleagues, discuss the design plan with remote colleagues through the video conference system, and start working after making a decision

16:30 Take a break, then continue coding and reviewing PRs

18:00 Most of my colleagues have gone to dinner, I am going to drive home for dinner

20:30 Dinner and cleanup done, nothing else on; open the laptop for a while to read email, issues, and PRs

22:30 Rest for a while, get ready to take a bath and sleep

How to run an open source project

First of all, you need to choose an open source license that fits your goals, business model, and so on. The common ones are GPL, BSD, Apache, and MIT. The differences between these licenses are explained very clearly in a blog post by Mr. Ruan Yifeng, which I recommend everyone read.

Once the license is chosen, pick a code hosting platform. The standard choice today is GitHub. After registering a GitHub account and creating an Organization, you can start using it. If you don't need private repos, you don't need to pay anything.

Start developing, submit your first commit, and write a README (a good README really matters).

Subsequent development should go through pull requests; it is best not to push directly to master. A serious project should mark master as a protected branch and prohibit direct pushes.

To make sure every subsequent submission keeps the code working, it is best to integrate at least one CI service with GitHub. Commonly used ones are Travis CI and CircleCI (CircleCI seems to have had frequent problems recently). Then, in the repository's settings, require each PR to pass CI before it can be merged.

When people run into problems while trying the project, they will report them through issues, so you need to watch the issues and reply as soon as possible. It is also good practice to classify issues with labels so that everyone can search and sort them quickly. For example, we mark some of the simpler issues as Help Wanted. For newcomers who join the community and want to start contributing code, these issues are a good starting point.

As more and more people get involved, some will start contributing code, and the maintainers need to review their PRs to make sure they meet the project's code quality requirements and follow its coding style.

Finally, a good project needs thorough documentation to help everyone use it, including the architecture, a brief introduction, a detailed introduction, an FAQ, usage examples, interface documentation, installation and deployment guides, and best practices. This is what most projects neglect.

How to get involved in open source projects

Try it out

The easiest way to participate is to try an open source project, which is also one of the biggest benefits of open source. Anyone can try it at any time, which amounts to having many people help the project's authors with testing. If only the authors test, the environments, scenarios, and usage patterns they cover will be limited, and problems will always turn up in places they never imagined. So every problem uncovered during a trial is valuable, and we will evaluate and respond as quickly as possible.

Report an issue

While trying the project you may run into all kinds of problems, especially ones the documentation does not mention. The best way to report a problem is to create a new issue on GitHub so that everyone can see it. We also pay closer attention to feedback that arrives through issues, and someone periodically sweeps through the unhandled ones. Of course, it is a good habit to search the existing issues before creating a new one.

In the issue, describe the problem in as much detail as possible and give steps that reliably reproduce it, including the version of the binary used, the deployment method, the client and server logs, and the operating system logs (such as the output of dmesg). If the problem cannot be reproduced, please provide logs that are as detailed as possible. All of this is very useful to developers tracking down bugs.

Make a suggestion
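
To make that checklist concrete, here is a minimal sketch of a helper that assembles an issue body. It is not part of TiDB or any PingCAP tooling: build_issue_report and its parameters are hypothetical names, the sample values are made up, and it relies only on the Python standard library to record the operating system details automatically.

```python
import platform


def build_issue_report(summary, version, deployment, steps, logs=""):
    """Assemble a plain-text issue body (hypothetical helper, not a TiDB tool)."""
    # Number the reproduction steps so a maintainer can follow them in order.
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    os_info = f"{platform.system()} {platform.release()} ({platform.machine()})"
    sections = [
        ("Summary", summary),
        ("Binary version", version),
        ("Deployment method", deployment),
        ("Operating system", os_info),
        ("Steps to reproduce", numbered or "Could not reproduce; see logs."),
        ("Logs", logs or "(attach client, server, and dmesg output here)"),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    print(build_issue_report(
        summary="Query returns no rows after restart",
        version="(paste the version reported by the binary)",
        deployment="3 nodes, deployed by hand",
        steps=["Start the cluster", "Insert one row", "Restart", "Select the row"],
    ))
```

The output is plain text that can be pasted into a new issue; when no steps are given, the report says so explicitly and points to the logs instead.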

If you have a suggestion for the project, you can also raise it by creating a new issue. We will generally tell you whether we plan to support it and, if so, when.

PR

When you run into a problem with TiDB or need a new feature, and you feel able to fix it yourself or the maintainers currently lack the bandwidth to do so, you can try modifying the code yourself to solve it.

At present, the TiDB project has more than 140 contributors, spread across more than a dozen countries. Many of them are heavy users of the project.

If it is a small feature or a simple bug fix, leave a comment on the relevant issue to let everyone know you are working on it, so that nobody duplicates the effort. If you hit any problems along the way, you can also discuss them with the maintainers in that issue.

If you want to build a relatively large feature, it is best to have a round of discussion with the maintainers first and then write a design document that is as detailed as possible. Once the design has been agreed on, start development.

Something fun

Open source projects always attract a few bizarre issues, such as this one. Seeing it really was a shock.
