(Reposted) The right way to use open source projects


There is a popular principle in software development: DRY, "Don't repeat yourself". A more vivid way to put it: don't reinvent the wheel. The main purpose of open source projects is sharing, which is really about sparing people from reinventing the wheel, especially in a fast-moving field like the Internet, where speed is life. Adopting open source projects can save a lot of manpower and time and greatly speed up business development. Why not do it?


However, reality is often less rosy. Although open source projects save a lot of manpower and time, they also bring many problems. I believe most of us have been burned by open source software at some point. A small incident might mean half an hour of downtime; a big one could mean losing hundreds of thousands of records, or even a catastrophic accident in which all data is lost.


In addition, although the DRY principle exists, open source projects are in fact the worst offenders against it. There are a lot of duplicate wheels, especially from developers abroad: when someone is unhappy with an existing open source solution, they start a similar one of their own. You have MySQL, I have PostgreSQL; you have MongoDB, I have Cassandra; you have Memcached, I have Redis; you have Gson, I have Jackson; you have Angular, I have React. In short, look around and you will find many similar wheels, and with so many similar wheels, choosing among them is a headache.


So what should we do? It is almost impossible to avoid open source projects entirely, so we need to be smarter about choosing and using them. To put it vividly: don't reinvent the wheel, but do find the right wheel! If you drive a Porsche, don't go looking for tractor wheels.


Next, I will summarize some experience and lessons on "how to use open source projects correctly", based on my 5 years of working with open source projects since joining UC. Some of the cases I experienced first-hand and some I only observed, so some details may not be completely accurate. Feel free to discuss them in light of your own experience.


The following content is organized into three parts: "select", "use" and "modify".


Select: How to choose an open source project?


Focus on whether it meets the business needs




When we choose open source projects, one headache is that there are many similar solutions, and the newer ones always claim to be more powerful than the older ones. We end up at a loss, always worrying that by choosing plan A we miss out on plan B, or vice versa. Our advice is to focus on whether the solution meets the business needs, and not to pay too much attention to how impressive it is.


Case: While trying out a social product, we discovered TT (Tokyo Tyrant), an open source solution. We felt it could replace Memcached as a cache and, since it also offered persistent storage, replace MySQL as well, so we used it widely in the business. But using it turned out to be very painful, mainly for the following reasons:


1. It could not completely replace MySQL, so we had two storage systems, and every design required discussing and deciding which one to use.

2. The features looked very advanced, but there were many bugs to match, and some were fatal, such as all data becoming unreadable. Later I studied the source code and wrote a tool to recover part of the data.

3. The features really were powerful, but it took a long time to become familiar with all the details.


Later, we reflected and concluded that Memcached + MySQL fully met the needs of the business at that time, and everyone was familiar with both. There was no need to introduce TT at all.


Simply put: if your business requires 1,000 TPS, there is no practical difference between a solution that delivers 20,000 TPS and one that delivers 50,000 TPS. Some people may worry: what if my TPS keeps rising? Don't worry. The architecture will keep evolving, and when we really need that level of performance, we will rework the architecture. Remember: don't optimize prematurely. "Premature optimization is the root of all evil" ("UNIX Programming Philosophy").


Focus on maturity


Many new open source projects claim to be more powerful than their predecessors: higher performance, more features, newer concepts. It all looks alluring, but, intentionally or not, they hide one drawback: they are less mature! No matter how talented the programmers are, their projects will have bugs. Don't assume any author is bug-free. The developers of Windows, Linux, and MySQL are all top-tier, and those systems still have plenty of bugs.


Running an immature open source project in production is extremely risky. At best you get downtime; in worse cases the system cannot recover after a restart; worst of all, data is lost for good. Take the TT mentioned above: we actually hit a failure in which files were corrupted after an unexpected power loss and a restart could not bring the system back. Fortunately we took backups every day, but that meant we could only restore the data from one day earlier, and everything written that day was lost. Later we spent a lot of time and manpower reading the source code and wrote tools to recover part of the data. Luckily this was not financial data, so losing some of it was not a big problem; otherwise we would have been in serious trouble.


Therefore, when choosing an open source project, try to choose a mature open source project to reduce risks.


Maturity can be checked from the following aspects:


1) Version number: unless there are special circumstances, avoid 0.x versions and choose at least a 1.x version. In general, the higher the version number, the better.

2) Number of companies using it: open source projects usually list the companies that have adopted them on their homepage. The bigger the companies and the more of them, the better.

3) Community activity: check whether the community is active, including the number of posts, the number of replies, and how quickly issues are handled.
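
The three checks above can be turned into a simple screening function. This is only a sketch: the Candidate fields, the thresholds (3 adopters, 7 days), and the function name are all assumptions made up for illustration, and real numbers would have to be gathered by hand from the project's homepage and issue tracker.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Facts about a project, collected by hand (all fields are hypothetical)."""
    name: str
    version: str                        # e.g. "0.9.3" or "3.0.1"
    known_adopters: int                 # companies listed as users on the homepage
    median_issue_response_days: float   # rough proxy for community activity


def maturity_warnings(c: Candidate,
                      min_adopters: int = 3,
                      max_response_days: float = 7.0) -> List[str]:
    """Return a list of maturity concerns; an empty list means none were found."""
    warnings = []
    major = int(c.version.lstrip("v").split(".")[0])
    if major < 1:
        warnings.append("version is still 0.x")
    if c.known_adopters < min_adopters:
        warnings.append("few known adopters")
    if c.median_issue_response_days > max_response_days:
        warnings.append("issues are handled slowly")
    return warnings
```

A project that returns an empty list is not automatically safe; the function only makes the checklist explicit so that candidates are compared on the same criteria.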


Focus on operation and maintenance capabilities


When we choose open source projects, we mostly focus on technical criteria such as performance, reliability, and features, and hardly pay attention to operations and maintenance capabilities. However, if you want to run the solution in production, operability is indispensable. Otherwise, once something goes wrong, the operations, R&D, and testing teams can only stare blankly and pray for the Buddha's blessing!


Operations and maintenance capabilities can be examined from the following aspects:


1) Whether the solution's logs are complete: some open source solutions log only a few lines at startup and shutdown, which leaves no way to troubleshoot problems.

2) Whether the solution has maintenance tools such as a command line or a management console that show the state of the system while it is running.

3) Whether the solution supports fault detection and recovery, such as alerting and failover.


Use: How to use open source solutions?


Deep research, careful testing




Many people use open source projects in a pure "take it and use it" fashion: they look at a few demos, get the program running, and deploy it to production. That is like reading a driving manual, learning that the steering wheel steers, the accelerator speeds up, and the brake slows down, and then heading straight onto the road. It is actually very dangerous.


Case: One of our teams used Elasticsearch essentially as-is. They were not clear on what an inverted index was, left the configuration at its default values, and went live as soon as it ran. Then one abnormal node became too slow and caused access to the entire site to hang.


Case 2: When many teams first used MySQL, they did not do much research, and business departments often complained that MySQL was too slow. After investigation, it turned out that the most critical parameters (such as innodb_buffer_pool_size, sync_binlog, and innodb_log_file_size) were either not configured or misconfigured, so of course performance was poor.
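
A mistake like the one in Case 2 can be caught with a small pre-deployment check. The sketch below only reports which of the three parameters named above are not set explicitly in a my.cnf-style text; it does not judge whether the values are sensible. The function name and the sample configuration are made up for illustration, and the parser is Python's generic configparser, so MySQL-specific directives such as !include are not handled.

```python
import configparser

# The three parameters named in the case above.
KEY_PARAMS = ("innodb_buffer_pool_size", "sync_binlog", "innodb_log_file_size")


def missing_key_params(my_cnf_text, section="mysqld"):
    """Return the key parameters that are not explicitly set in the given section."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare flags such as skip-name-resolve
        strict=False,          # tolerate repeated options
        interpolation=None,    # treat '%' in values literally
    )
    parser.read_string(my_cnf_text)
    if not parser.has_section(section):
        return list(KEY_PARAMS)
    # MySQL accepts both '-' and '_' in option names, so normalize before comparing.
    configured = {name.replace("-", "_") for name in parser.options(section)}
    return [p for p in KEY_PARAMS if p not in configured]


sample = """
[mysqld]
innodb-buffer-pool-size = 4G
skip-name-resolve
"""
print(missing_key_params(sample))  # ['sync_binlog', 'innodb_log_file_size']
```

A check like this fits naturally into a deployment script, so that a server running on untouched defaults is flagged before the business starts complaining.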


Research and testing can be carried out from the following aspects:


1) Read through the project's design documents or white papers to understand its design principles.

2) Check the role and impact of each configuration item and identify the key ones.

3) Run performance tests under a variety of scenarios.

4) Run a stress test for several days in a row and observe how indicators such as CPU, memory, and disk I/O fluctuate.

5) Run fault tests: kill the process, cut the power, unplug the network cable, restart more than 100 times, trigger failover, and so on.
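
The kill-and-restart part of the fault test in item 5 is easy to automate. Below is a minimal sketch, assuming the system under test can be started with a single command. The stand-in command (a Python process that just sleeps) and the health check (is the process still running?) are placeholders; a real test would start the actual service and verify that its data is still readable after each abrupt kill.

```python
import subprocess
import sys
import time

# Stand-in for the real service: a child process that just sleeps.
STAND_IN_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]


def process_is_running(proc):
    """Placeholder health check: the process has not exited."""
    return proc.poll() is None


def kill_restart_cycles(cmd, cycles, is_healthy, settle_seconds=0.1):
    """Start cmd, check its health, kill it abruptly, and repeat.

    Returns how many cycles passed the health check. From the second cycle on,
    a passing check means the service came back after the previous hard kill.
    """
    passed = 0
    for _ in range(cycles):
        proc = subprocess.Popen(cmd)
        try:
            time.sleep(settle_seconds)   # give the process time to start
            if is_healthy(proc):
                passed += 1
        finally:
            proc.kill()                  # abrupt termination, no clean shutdown
            proc.wait()                  # reap the process before restarting
    return passed
```

Calling kill_restart_cycles(STAND_IN_CMD, 100, process_is_running) mirrors the "restart more than 100 times" advice; power loss and unplugged cables still have to be tested on real hardware.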

 

Apply with care, release in grayscale


Suppose we have done the "deep research and careful testing" above and found no problems. Can we safely go live? Don't celebrate too early. No matter how deep the research and how careful the testing, they only reduce risk; they cannot cover every scenario that occurs in production.


Case: Take TT again. Before adopting it, we specifically assigned an expert to read the source code and run tests, which took about a month, yet we still ran into all kinds of problems after going live. Tests simply cannot cover the complexity of a production environment, so you must be careful.


Therefore, no matter how deep the research, how careful the testing, and how confident you feel, always stay in awe of production; as the saying goes, caution keeps the ship sailing for ten thousand years. Our practice is to use a new project in non-core businesses first, and then expand gradually once we have gained experience.
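
Besides starting with non-core businesses, one common way to do a grayscale release is to send only a fixed percentage of users through the new component and raise that percentage step by step. This is a generic technique, not necessarily what our team did. A minimal sketch, in which the function name and the choice of 100 buckets are assumptions:

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of 100 buckets.

    The bucket is derived from a hash of the user ID, so the same user always
    gets the same answer, and raising `percent` only ever adds users.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent
```

A request handler would call in_rollout(user_id, 5) to send roughly 5% of users to the new open source component and everyone else to the existing one, then raise the number as confidence grows. Because the assignment is stable per user, a problem shows up for a bounded group, and lowering the percentage rolls it back.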


Be prepared, just in case




Even if all the previous work has been thorough, we cannot assume that everything is fine, especially when we have just started using an open source project. If we are unlucky, we may hit a bug that no other user in the world has encountered, and the business may be unable to recover. This is especially true for storage: once something goes wrong there, a failure to recover can be a fatal blow.


Case (this one is secondhand): A certain business used MongoDB and lost part of its data after an outage. The data could not be restored, there was no other backup, and nothing could be recovered manually. The team could only handle each case as users complained, one by one. As a result, the DBAs and operations staff have since objected to any use of MongoDB, even on a trial basis.


Although rejecting every further attempt because of one failure is a bit of an overreaction, the failure does remind us that for important business or data, it is best to keep another mature solution as a backup when using an open source project, especially for data storage. For example, if you want to use MongoDB or Redis, you can use MySQL as backup storage. This is more complicated and more costly, but it can save your life at a critical moment!
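
The backup idea can be sketched as a thin wrapper that writes every record to both stores and falls back to the mature one on reads. In this sketch the two stores are plain dictionaries standing in for, say, Redis and MySQL; the class name and the dict-like interface are assumptions for illustration, and a real version would also have to handle connection errors and partial writes.

```python
class BackedUpStore:
    """Writes go to both stores; reads fall back to the backup when needed."""

    def __init__(self, primary, backup):
        self.primary = primary   # the newer store (stand-in: a dict)
        self.backup = backup     # the mature store (stand-in: a dict)

    def put(self, key, value):
        # Write to the mature store first, so the record survives
        # even if the primary fails right afterwards.
        self.backup[key] = value
        self.primary[key] = value

    def get(self, key):
        value = self.primary.get(key)
        if value is None:
            # Primary lost the record (or never had it): use the backup copy.
            value = self.backup.get(key)
        return value


primary, backup = {}, {}
store = BackedUpStore(primary, backup)
store.put("order:1001", "paid")
primary.clear()                      # simulate data loss in the primary store
print(store.get("order:1001"))       # paid
```

The cost mentioned above is visible even in this toy version: every write happens twice, and the two stores can disagree if one of the writes fails.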


Modify: How to do secondary development based on open source projects?


Keep it pure, wrap it


When we find that some part of an open source project does not meet our needs, we naturally feel the urge to change it, but how to change it takes real know-how. One approach is to assign a few people to rework it from the inside out until it is exactly what our business needs. But this approach has several serious problems:


1) The investment is too large. Generally speaking, for an open source project on the scale of Redis, changing it yourself takes at least 2 people for more than 1 month.

2) You lose the ability to follow the upstream project's evolution: if you change too much, then even as the original project continues to evolve, you cannot merge its changes because the two have diverged too far.


So our suggestion is not to change the original system, but to build auxiliary systems around it: monitoring, alerting, load balancing, management, and so on. Take Redis as an example: if we want to add clustering, we should not change Redis itself but add a proxy layer that implements it. This is what Twitter's Twemproxy does. Once Redis reached 3.0 and provided clustering itself, a setup built this way could simply switch to Redis 3.0. For details, see http://www.cnblogs.com/gomysql/p/4413922.html
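
To make the proxy-layer idea concrete, here is a minimal sketch of a wrapper that spreads keys across several unmodified backends. It is not how Twemproxy is implemented: the class name is made up, the backends are plain dictionaries standing in for Redis instances, and simple modulo hashing is used for brevity.

```python
import zlib


class ShardingProxy:
    """Routes each key to one of several backends without modifying the backends."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)

    def _backend_for(self, key: str):
        # A stable hash of the key decides which backend owns it.
        index = zlib.crc32(key.encode("utf-8")) % len(self.backends)
        return self.backends[index]

    def set(self, key, value):
        self._backend_for(key)[key] = value

    def get(self, key):
        return self._backend_for(key).get(key)
```

The point of the design is that all clustering logic lives in the wrapper. When the upstream project later ships the feature itself, the wrapper can be removed without touching a fork of the original code.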


What if we really do need to change the original system? Our suggestion is to file the feature request or bug with the open source project directly. The drawback is that the response can be slow, so it depends on how urgent the business need is. If it is too urgent to wait, you can only change it yourself; if it is not that urgent, it is better to rely on backups or emergency measures in the meantime.

  

Invent the wheel you want




This point probably surprises many people. After all that talk, how did we end up back at "invent the wheel you want"?


In fact, whether or not to choose an open source project is, at its core, a question of cost and benefit. Choosing an open source project is not necessarily the best option. The main problem is that no wheel fits you perfectly!


The biggest difference between software and hardware is that software has no absolute industrial standards. Everyone is free to build whatever they like. In hardware, if you build a wheel of a non-standard size, other cars cannot use it, and no matter how fine the craftsmanship or how good the quality, it is all in vain. In software, many similar wheels can be built, and they can generally be used anywhere. For example, switching a cache from Memcached to Redis does not cause too many problems.


In addition, to be applicable at large scale, open source projects aim for general-purpose solutions, but businesses differ a great deal, and a general solution may not fit a specific business perfectly. For example, Memcached achieves clustering through consistent hashing, but in some of our businesses, if one cache node goes down, the entire business may slow down. That requires cache backup, which Memcached does not provide, and Redis had no cluster support at the time. So we put 2 to 4 people on it for about 2 months and built a cache framework, based on the principles of LevelDB, that supports storage, backup, and clustering. We later added cross-data-center synchronization on top of this framework, which greatly improved service availability. If we had relied entirely on an open source solution and waited for it to implement these features, it could not have happened so quickly, and the open source project might never have supported our needs at all.
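
To show what "consistent hashing plus cache backup" means, here is a minimal sketch of a hash ring that returns a primary node and a backup node for each key. It is not the LevelDB-based framework described above, whose design is not given here; the class name, the node names, and the use of 100 virtual nodes per server are assumptions for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring many times to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def nodes_for(self, key: str, count: int = 2):
        """Return up to `count` distinct nodes: the primary first, then backups."""
        start = bisect.bisect(self._points, _hash(key))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == count:
                    break
        return found
```

Writing each cache entry to both nodes returned by nodes_for means that when the primary goes down, the next lookup on a ring without that node lands on the former backup, which already holds the data, so the business does not slow down while the cache is rebuilt.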


So, if you have the money and the time, investing manpower to build a wheel that fits your business perfectly is also a good choice! After all, many deep-pocketed companies (BAT, Facebook, Google, etc.) do exactly this; otherwise we would not have so many useful open source projects :)



