Simple System Optimization Thinking in Practice

Author: Yan Xiaonan, JD Logistics

1. Problems

In the middle of last year, a good friend of mine (call him Brother Hua) started a business with great enthusiasm, despite how serious the epidemic was at the time. He spent a large sum to lease several stalls in the cafeteria of his former company and overnight became the boss. He did not slack off in the slightest: he did thorough market research and also innovated based on the pain points he had felt as a diner. Stir-fried dishes there are sold by the portion, about a dozen yuan for a meat dish and nearly ten yuan for a vegetarian one, which creates a problem. Most guys spend a few dozen yuan and still get only 2-3 dishes, so the meal is not very varied nutritionally. And if you pick badly and a dish you had high hopes for turns out not to taste good, the experience is even worse.

Therefore, Brother Hua borrowed the pick-and-weigh model used by Malatang shops and launched a self-service, pay-by-weight format. Many ready-made dishes, both meat and vegetable, are laid out on the dining table, and everyone takes what they like. Rice and steamed buns are free, porridge and soup are also free, and some other staples such as sweet potatoes and corn are available for a fee. Diners pick their dishes, then take staple food, then choose soup and porridge, and finally check out by swiping their card, as shown in the picture below:

Brother Hua lives up to his reputation as a former star product manager at an Internet giant. He keenly grasped users' pain points and delivered a fitting solution. The self-service weighing format has been warmly welcomed by colleagues since its launch. There is a long queue at every meal; even though lunch starts at 11:30, many colleagues are already lining up before 11:20. To give you an intuitive picture of how long the queue is, the first example I thought of was the line outside the jujube cake shop in Wudaokou. On reflection, the line for the crock-pot dishes in the restaurant on the 4th floor of Block B, No. 2 Jingdong Building seems more apt.

Brother Hua was very happy at first, but a month later, when he tallied the revenue, he found that something was wrong. The queue looked long and the stall looked crowded, yet actual revenue fell short of expectations. He first ruled out a low average order value: in self-service mode people see many dishes they want, and after a few passes the plate is full, so an order basically starts at 20 yuan. That left only one possibility: too few orders. Closer analysis showed why. Because there are so many dishes to choose from, each person takes a long time to select, and then still has to get soup and rice, so the whole pickup process is slow. Colleagues further back can pick dishes in sequence behind those in front, but preferences differ, so each person lingers for a different length of time at different dishes. As a result, whenever someone in front takes a little longer at one dish, everyone behind is left waiting.

There are also extreme cases. Some colleagues linger in front of a favorite dish, picking and choosing. Some stir the free soup frantically, alternating clockwise and counterclockwise, trying to fish out the scattered vegetable leaves and bits of egg. Brother Hua saw with his own eyes a manager he used to report to digging through the tray of chili chicken again and again, hunting for the small pieces of chicken buried among the peppers. Every time he found one, a satisfied smile spread across his face. While the manager is picking out chicken, the whole line is at a complete standstill. Seen as a whole, the line moves through the meal pickup process very slowly, so the number of people who actually pay is lower than expected.
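The stall described above can be reproduced with a toy model. The sketch below is only an illustration under assumptions that are not from the original story: everyone is already queued at time 0, nobody overtakes, only one diner fits at a dish, and waiting between dishes does not block the dish behind (which, if anything, understates the real stall). The name `leave_times` and the dwell times are made up for the example.

```python
from typing import List


def leave_times(dwell: List[List[float]]) -> List[float]:
    """Return the time (seconds) at which each diner leaves the last dish.

    dwell[i][s] is how long diner i spends at dish s.
    Assumed model: all diners are queued at time 0 and cannot overtake.
    """
    if not dwell:
        return []
    freed_at = [0.0] * len(dwell[0])  # when the previous diner left each dish
    result = []
    for row in dwell:
        t = 0.0
        for s, seconds in enumerate(row):
            start = max(t, freed_at[s])  # wait until the diner ahead moves on
            t = start + seconds
            freed_at[s] = t
        result.append(t)
    return result


# Four diners, three dishes, 5 seconds per dish.
smooth = [[5.0, 5.0, 5.0] for _ in range(4)]

# Same line, but the second diner spends 30 seconds at the second dish.
picky = [row[:] for row in smooth]
picky[1][1] = 30.0

print(leave_times(smooth))  # [15.0, 20.0, 25.0, 30.0]
print(leave_times(picky))   # [15.0, 45.0, 50.0, 55.0]
```

One slow diner adds 25 seconds to everyone behind them, even though the people behind behaved exactly as before.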

2. The Solution

Later, at a get-together with friends, Brother Hua brought this up and said to me: "Aren't you tech people always bragging about system optimization, cutting costs and raising efficiency? Help me figure something out." Hearing this slightly provocative request, I suddenly felt a heavy sense of responsibility: I had to uphold the dignity of technical people. I thought for a while and realized that this business problem can be treated as a technical one. The dining table can be seen as a system, and one person picking up a meal as one call to that system. Because each call performs poorly, the throughput of the system as a whole is too low. We were already three rounds of drinks in and my mind was not at its clearest, but I still did my best to come up with several approaches for Brother Hua.

2.1 System expansion

The first solution that comes to mind is expansion. In engineering, when system performance falls short, capacity expansion is usually the first remedy considered. It generally comes in two forms: vertical expansion (scaling up) raises the processing capacity of a single instance by improving its hardware, and horizontal expansion (scaling out) raises the processing capacity of the whole system by adding instance nodes.
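The difference between the two forms can be put in numbers. The following is a minimal sketch that assumes perfectly linear scaling, which real systems rarely achieve; the function name and the figures are illustrative only.

```python
def system_throughput(instances: int, requests_per_second_each: float) -> float:
    """Total requests per second, assuming instances scale linearly."""
    if instances < 1 or requests_per_second_each <= 0:
        raise ValueError("need at least one instance with positive capacity")
    return instances * requests_per_second_each


baseline = system_throughput(2, 100.0)    # 2 instances, 100 req/s each
scaled_up = system_throughput(2, 150.0)   # vertical: stronger instances
scaled_out = system_throughput(4, 100.0)  # horizontal: more instances

print(baseline, scaled_up, scaled_out)  # 200.0 300.0 400.0
```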

Let's apply these two ideas to the throughput of the dining table. Vertical expansion does not seem to offer much: you can't very well upgrade the serving spoon to a gold-plated one imported from Germany. Horizontal expansion, however, looks promising. Just add a few more sets of tables, so that the 2 lines running in parallel become 4 or even 8. That is multi-threaded concurrency made real, and the overall throughput of the system multiplies at once. The effect is not only real but immediate. So I drew a schematic diagram of horizontal expansion, as shown in the following figure:

However, Brother Hua quickly rejected the horizontal expansion plan. In engineering, with the maturity of cloud-native technology, scaling applications out and in is a well-established way to raise processing capacity. Here, though, setting up another dining table is almost impossible for Brother Hua, not least because the stall he leased does not have room for a second one.

As the saying goes, a problem that can be solved with money is not really a problem. The problem now is that Brother Hua has no money.

2.2 Single Execution Optimization

If raising system concurrency is not an option, the other way to raise throughput is to shorten the processing time of a single request, so that the system handles more requests per unit time. At the dining table, that means shortening the time one person takes to pick up a meal, especially when someone like Brother Hua's former manager spends a long time in front of a single dish. How can this be optimized?

Let's break each call down. Picking from one dish can be thought of as executing one piece of logic, so the whole pickup process decomposes into small code blocks, and the total call time is the sum of their execution times. From an engineering standpoint, the goal is to make sure each piece of logic finishes within a predictable time, which can be done by guarding each one with a timeout check.

Take Baidu search as an example. To diversify the results it returns, Baidu launched the Aladdin architecture. Each query is analyzed by the star map model and then dispatched to different vertical applications, each of which returns cards from its own business domain; Aladdin's root application then aggregates the results from the verticals and returns them to the user. Some verticals are slower than others. For example, when a user searches for a drug, the health vertical finds nearby O2O pharmacies based on the searcher's longitude and latitude and calculates the drug's promotional price in each store, which often takes a long time. So the root application applies a 380ms timeout, the same for every vertical: if a vertical's response takes longer than that, its result is discarded. The point of this example is that adding a timeout to each step keeps the overall process within a controllable time, which keeps the user experience consistent.
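The fan-out-with-timeout pattern described above can be sketched with Python's standard library. This is not Baidu's implementation: the `aggregate` function, the two stand-in verticals and the thread-based fan-out are all assumptions made for illustration. Only the 380ms budget comes from the text.

```python
import concurrent.futures as cf
import time


def aggregate(verticals, query, timeout_s=0.38):
    """Call every vertical in parallel and keep only answers that arrive in time.

    `verticals` maps a name to a callable taking the query (hypothetical API).
    Late or failed verticals are simply left out of the result.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(verticals)))
    futures = {pool.submit(fn, query): name for name, fn in verticals.items()}
    done, _late = cf.wait(futures, timeout=timeout_s)
    results = {}
    for future in done:
        if future.exception() is None:
            results[futures[future]] = future.result()
    pool.shutdown(wait=False)  # do not block the caller on slow verticals
    return results


def fast_vertical(query):
    return "web results for " + query


def slow_vertical(query):
    time.sleep(0.5)  # stands in for an expensive nearby-pharmacy lookup
    return "pharmacy prices for " + query


cards = aggregate(
    {"web": fast_vertical, "health": slow_vertical}, "aspirin", timeout_s=0.1
)
print(cards)  # {'web': 'web results for aspirin'}
```

The slow vertical still runs to completion in the background; the root simply stops waiting for it, which is what keeps the response time bounded.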

Adding a timeout to a program is easy, because a program has no feelings. Picking up a meal is different. You cannot station a waiter behind each dish to count "1, 2, 3" and push the diner along after 5 seconds. Choosing dishes is a subjective, voluntary act: a diner can stand in front of a dish for as long as they like. After thinking this over, I had an idea: take away the diner's right to linger at will and impose a uniform dwell time. So I designed a timeout device for Brother Hua: a set of automatic conveyor belts along both sides of the dining table, like the moving walkways that carry you toward the terminal after airport security. Diners no longer need to walk, and everyone spends the same amount of time in front of each dish. Nobody can linger too long at one dish, and the line never stalls because of someone in front, which raises the throughput of the dining table. The belt has a further advantage: it can run very slowly or even stop when there are few people, and speed up appropriately during peak periods, controlling how long each person takes and safeguarding the throughput of the whole table.
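With a uniform dwell time the line behaves like a fixed-cadence pipeline, and its timing becomes easy to predict. The sketch below is a simplified model, assuming one diner per dish position and a belt that advances one position every `dwell_s` seconds; the function names and numbers are illustrative.

```python
def belt_makespan(diners: int, dishes: int, dwell_s: float) -> float:
    """Seconds until the last diner clears the last dish on the belt."""
    if diners < 1 or dishes < 1 or dwell_s <= 0:
        raise ValueError("diners, dishes and dwell_s must be positive")
    # The first diner needs `dishes` steps; each later diner adds one step.
    return (diners + dishes - 1) * dwell_s


def dwell_for_window(diners: int, dishes: int, window_s: float) -> float:
    """Longest dwell per dish that still serves everyone within the window."""
    if diners < 1 or dishes < 1 or window_s <= 0:
        raise ValueError("diners, dishes and window_s must be positive")
    return window_s / (diners + dishes - 1)


print(belt_makespan(4, 3, 5.0))          # 30.0
print(dwell_for_window(91, 10, 600.0))   # 6.0
```

Under this model, speeding the belt up at peak time amounts to lowering `dwell_s` until the expected crowd fits into the lunch window.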

Brother Hua was stunned for a long while after hearing my stroke of genius. After weighing it up, he decided the idea really was feasible: he would only need to wait for the National Day or May 1st holiday to start construction and add conveyor belts on both sides. Having finally heard a workable plan, Brother Hua grew a little excited, and his cheeks flushed slightly.

2.3 Elimination of non-core processes

Seeing Brother Hua accept my plan, I felt greatly encouraged and kept thinking about how to optimize the process. In engineering, another thing you can do when a process is under heavy traffic is to simplify it: keep only the core steps, often called the golden process, and move the non-core business nodes out of the main process. The streamlined main process executes faster, and with less logic in it the probability of errors also drops. Take the order settlement process of JD Retail as an example. The following things need to be done when the retail side settles:

In fact, settlement involves many other non-core nodes as well, such as deleting the settled items from the shopping cart and reserving self-pickup cabinets. These are non-core steps and can be moved out of the main process.
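One common way to take such steps off the main path is to hand them to a queue that a background worker drains. The sketch below is not JD's settlement code: the `settle` function, the task names and the single in-process worker are assumptions for illustration, and the core steps are reduced to a one-line stand-in.

```python
import queue
import threading


def settle(order_id: int, side_tasks: "queue.Queue") -> str:
    """Main process: finish the core step, then enqueue non-core work."""
    receipt = f"order-{order_id}-confirmed"  # stand-in for the core steps
    side_tasks.put(("clear_cart_items", order_id))       # non-core
    side_tasks.put(("reserve_pickup_cabinet", order_id))  # non-core
    return receipt  # the caller gets its answer without waiting for side work


def run_demo():
    side_tasks = queue.Queue()
    handled = []

    def worker():
        while True:
            task = side_tasks.get()
            if task is None:  # sentinel: stop the worker
                side_tasks.task_done()
                return
            handled.append(task)  # a real worker would call a downstream service
            side_tasks.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    receipt = settle(42, side_tasks)
    side_tasks.join()      # demo only: wait so the result can be inspected
    side_tasks.put(None)
    thread.join()
    return receipt, handled


receipt, handled = run_demo()
print(receipt)   # order-42-confirmed
print(handled)   # [('clear_cart_items', 42), ('reserve_pickup_cabinet', 42)]
```

In production this role is usually played by a message queue rather than an in-process thread, so that side work survives a restart; the shape of the main process stays the same.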

Back at the dining table, which steps form the golden process? The ones directly tied to revenue. Items that generate no revenue, such as rice, steamed buns, soup and porridge, can be treated as non-core and moved out of the main process. This keeps simplifying the main pickup process for everyone and also frees up space in the restaurant. The freed space can be used in two ways. One is to put out more paid staple foods or add more dishes, which increases income. The other is to add some self-service checkout machines. The two existing cash registers were enough when pickup was slow, but now that the process is simpler and everyone moves faster, two cash registers will become the new bottleneck, especially when a colleague pays by scanning a code. Adding cash registers raises the concurrent processing capacity of the checkout step, keeps the whole pickup process smooth, and avoids a new performance bottleneck. Perfect!

Brother Hua was very pleased with this suggestion. He had felt all along that the dining table was too small to hold enough dishes. Now the space would be put to better use on food that brings in profit, which was exactly what he wanted. He happily raised a glass to me.

2.4 Distributed cache

Another important tool the Internet industry uses to raise system throughput and shorten single-call execution time is distributed caching. For many calls that hit a system bottleneck, a distributed cache can greatly shorten data access time and therefore processing time. Can distributed caching also be applied at the dining table?

I thought about it. The free staple food, soup and porridge that were moved out of the main process can follow the principle of caching, specifically CDN caching: distribute them near the tables where colleagues sit and eat, so that diners can help themselves from the nearest point. On the surface this adds no revenue, but it has two effects. First, the dining experience gets better because everything is close at hand. Second, eating becomes more convenient, so each meal takes less time and the seats turn over faster. The next colleague who gets a meal can find a seat quickly, which improves their experience too. I won't draw a picture here; I believe everyone can understand.
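The caching idea itself fits in a few lines. Below is a minimal read-through cache sketch, assuming a hypothetical `fetch_from_origin` callable that stands in for the slow trip to the origin; it leaves out expiry and eviction, which any real CDN or distributed cache needs.

```python
class EdgeCache:
    """Serve repeated requests locally; go to the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # slow call, e.g. a walk back to the counter
        self._store = {}
        self.origin_trips = 0

    def get(self, key):
        if key not in self._store:
            self.origin_trips += 1
            self._store[key] = self._fetch(key)
        return self._store[key]


cache = EdgeCache(lambda item: "a bowl of " + item)
for item in ("rice", "soup", "rice", "rice"):
    cache.get(item)

print(cache.origin_trips)  # 2
```

Four requests cost only two trips to the origin; the other two are served from the nearby copy.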

3. Postscript

After hearing my whole optimization plan, Brother Hua lowered his head and fell deep into thought. After a long while he raised his head and looked at me, his eyes a little unfocused. I was a little nervous, thinking he was about to give a systematic critique of my plan. Instead he asked me what time it was, and I realized that Brother Hua had simply drunk too much and dozed off with his head down.


Origin my.oschina.net/u/4090830/blog/8586941