Evolution of the real-time vs. offline data comparison solution for the racing leaderboard | JD Cloud technical team

1. Background

The racing leaderboard is a ranking built from JD.com's real-time sales data and provided to the various purchasing and sales groups during major promotions. It is designed for the traffic peaks of a big promotion and is used to encourage brands to increase their resource investment on JD.com. The leaderboard is calculated from real-time data according to user-configured rules, and rankings change in real time throughout the promotion. Because the ranking data is widely shared on Weibo and WeChat Moments, the accuracy of the calculations and rankings is critical.

Each leaderboard has its own configuration rules. To ensure that the leaderboard data is calculated correctly, the real-time ranking data must be verified before the promotion starts. The main verification approach is to take the previous day's real-time ranking data, compute the corresponding offline data according to the leaderboard's rule configuration, compare the real-time and offline results, and check that they are consistent.

A single leaderboard rule has 20+ configuration items, each independent of the others, and data verification must be performed separately for every rule.

2. Evolution of the data comparison solution

2.1. Fully manual - high cost and incomplete coverage

In the initial stage, comparison was entirely manual: the real-time and offline data for the racing leaderboard were obtained separately and compared by hand.

1) Real-time data: the leaderboard data interface is read at 23:59 every day and the corresponding leaderboard data is recorded (a minimal sketch of such a scheduled fetch follows this list)

2) Offline data: offline SQL scripts are written by hand according to the leaderboard rules and executed on the data query platform to obtain the leaderboard ranking data
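As an illustration of step 1), here is a minimal sketch of such a daily scheduled fetch. The endpoint, parameters, and field names are hypothetical; the article does not describe the actual leaderboard interface.

# Minimal sketch, assuming a hypothetical leaderboard query endpoint; scheduled
# externally, e.g. by cron at 23:59 every day ("59 23 * * * python fetch_rank.py").
import json
import urllib.request
from datetime import date

LIST_API = "http://example.internal/rank/list"  # hypothetical endpoint

def fetch_and_record(list_id: int, out_path: str) -> None:
    """Read the leaderboard data interface once and append the day's ranking to a file."""
    with urllib.request.urlopen(f"{LIST_API}?listId={list_id}") as resp:
        ranking = json.load(resp)  # e.g. [{"brand": "...", "gmv": 123.0, "rank": 1}, ...]
    with open(out_path, "a", encoding="utf-8") as f:
        record = {"date": str(date.today()), "listId": list_id, "ranking": ranking}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    fetch_and_record(list_id=1001, out_path="realtime_rank.jsonl")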

The whole process takes a long time: writing the SQL for one rule takes about 1 hour, and executing a single SQL takes about 0.5 hours. Covering all rules means completing SQL writing and data verification for 100+ rule configurations in one pass, so even if the rules do not change, a full test is estimated to take about 20 man-days (100+ rules × 1.5 hours ≈ 150+ hours). In addition, writing the scripts requires a deep understanding of the business rules and a high level of SQL skill from the testers.

2.2. Semi-automated - still consumes manpower continuously

The racing leaderboard is mainly used during big promotions. Besides covering the rules with functional tests, the rules configured by the business side must also be verified before the promotion to ensure that user-configured rules are calculated correctly. Taking the 618 promotion in 2023 as an example, there were 5,000+ leaderboard rules; with the fully manual verification approach this would take 900+ man-days, which is completely infeasible. A semi-automated comparison solution was therefore implemented. Compared with the manual approach, it automates offline SQL generation and real-time data collection.

The specific approach is as follows:

1. Real-time data acquisition: based on the leaderboard snapshot feature, daily snapshot data of each leaderboard is recorded automatically and written to the database.

2. Offline SQL generation and data calculation:

2.1. Rule configuration storage: the leaderboard rules are exported to Excel with the system's built-in rule export feature and then imported into a Hive table; other configuration data that the rules depend on is imported into Hive in the same way

2.2. Periodic SQL generation: according to the rule configuration, CASE WHEN expressions are used to generate SQL fragments for the different cases, and the fragments are then combined manually into the final SQL (a sketch of this generation step follows this list)

2.3. Merging SQL and running the computation: the SQL generated for the various combinations is merged into one job, an offline scheduling task is configured, and the offline data of the different leaderboards is calculated by running the task

2.4. Pushing data to the comparison MySQL database: the generated offline leaderboard data is pushed to the MySQL database where the real-time data is stored

3. Real-time vs. offline data comparison: after all real-time and offline data has been written to the database, the database is queried directly, the two data sets are compared, and rows whose difference exceeds a threshold are highlighted (see the sketch after this list).
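As a concrete illustration of step 2.2, the following is a minimal sketch of turning rule configuration rows into CASE WHEN fragments and combining them into one offline query. The rule fields (list_id, category_id, min_price) and the order_detail table are assumptions for illustration; the real rule schema has 20+ configuration items.

# Minimal sketch of CASE WHEN based SQL generation from rule configuration.
# All field and table names here are hypothetical.
def rule_to_case_when(rule: dict) -> str:
    """Build one CASE WHEN fragment that aggregates orders matching a single rule."""
    conditions = []
    if rule.get("category_id") is not None:
        conditions.append(f"category_id = {rule['category_id']}")
    if rule.get("min_price") is not None:
        conditions.append(f"order_price >= {rule['min_price']}")
    cond = " AND ".join(conditions) or "1 = 1"
    return f"SUM(CASE WHEN {cond} THEN order_gmv ELSE 0 END) AS gmv_list_{rule['list_id']}"

def build_offline_sql(rules: list) -> str:
    """Combine the fragments of all rules into one aggregation query over the fact table."""
    fragments = ",\n    ".join(rule_to_case_when(r) for r in rules)
    return ("SELECT brand_id,\n    " + fragments +
            "\nFROM order_detail\nWHERE dt = '${yesterday}'\nGROUP BY brand_id;")

if __name__ == "__main__":
    rules = [{"list_id": 1, "category_id": 9987, "min_price": 100},
             {"list_id": 2, "category_id": 1320}]
    print(build_offline_sql(rules))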
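For step 3, a minimal sketch of the comparison itself is shown below, assuming both data sets have already been pushed into MySQL tables. The table names (realtime_rank, offline_rank), the gmv column, and the 0.1% threshold are assumptions, not values stated in the article.

# Minimal sketch of the real-time vs. offline comparison with a difference threshold.
# The join query is illustrative; it would be run against the comparison MySQL database.
COMPARE_SQL = """
SELECT r.list_id,
       r.brand_id,
       r.gmv AS realtime_gmv,
       o.gmv AS offline_gmv,
       ABS(r.gmv - o.gmv) / GREATEST(o.gmv, 1) AS diff_ratio
FROM realtime_rank r
JOIN offline_rank o
  ON r.list_id = o.list_id AND r.brand_id = o.brand_id AND r.dt = o.dt
"""

THRESHOLD = 0.001  # flag rows whose relative gap exceeds 0.1% (assumed value)

def flag_inconsistent(rows, threshold=THRESHOLD):
    """Return the rows whose relative difference exceeds the threshold, for highlighting."""
    return [row for row in rows if row["diff_ratio"] > threshold]

if __name__ == "__main__":
    sample = [{"list_id": 1, "brand_id": 7, "diff_ratio": 0.0002},
              {"list_id": 1, "brand_id": 8, "diff_ratio": 0.03}]
    print(flag_inconsistent(sample))  # only the second row is flagged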

With the above approach, semi-automated real-time vs. offline comparison was achieved, removing the most labor-intensive part of the manual approach: writing SQL by hand. However, this solution still has the following problems:

1. SQL still requires manual intervention: generating the SQL still involves many manual steps, and the generated SQL has to be adjusted by hand along the way
2. Rule changes force SQL adjustments: before the big promotion, users keep adjusting the rules, so previously generated SQL falls out of sync with the user rules and comparison fails for the affected leaderboards; the corresponding SQL then has to be regenerated, the scheduling tasks reconfigured, and the comparison redone

During the 618 and Double 11 promotions in 2022, this solution was mainly operated by R&D engineers, who adjusted the relevant SQL and verified the data; it required 3 developers for 3 weeks, about 45 man-days in total.

2.3. Fully automated - freeing up manpower

To further reduce manpower consumption and upgrade the comparison process from semi-automated to fully automated, the following is required:

1. SQL is generated and executed automatically, without manual intervention
2. The SQL to be executed is adjusted automatically every day according to rule changes, so that it stays continuously up to date

The fully automated comparison solution is shown in the figure below:



Details of the optimizations:

1. Automatically update and store the SQL every day: instead of exporting the rules manually from the page, the rule data is extracted to Hive automatically every day; the target SQL is then regenerated daily and stored in a Hive table

2. Automatically obtain and execute the target SQL: the target SQL to be executed is fetched from Hive and then executed (using a special capability of the hive command to obtain the SQL first and then run it); see the snippet below and the sketch that follows it

# A run_shell_cmd_out function was added to HiveTask that returns only the content of
# the standard output stream; run the following Python script on the standard client.
from HiveTask import HiveTask

ht = HiveTask()
# The stdout of the hive command (here, the query result) is returned as a string.
result = ht.run_shell_cmd_out(shellcmd='hive -e "select * from table;"')
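Because run_shell_cmd_out returns the standard output of the hive command as a string, the SQL text stored in Hive can first be fetched and then passed to a second hive -e call for execution. Below is a minimal sketch of that two-step fetch-and-execute, assuming a hypothetical storage table rank_sql_store with an sql_text column; only run_shell_cmd_out itself comes from the snippet above, and shell quoting is simplified.

from HiveTask import HiveTask

ht = HiveTask()
# Step 1: read the automatically generated target SQL out of the Hive storage table;
# run_shell_cmd_out returns only the standard-output stream, i.e. the SQL text itself.
target_sql = ht.run_shell_cmd_out(
    shellcmd='hive -e "SELECT sql_text FROM rank_sql_store WHERE dt = current_date();"')
# Step 2: execute the fetched SQL to produce the offline leaderboard data for the day.
ht.run_shell_cmd_out(shellcmd=f'hive -e "{target_sql}"')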

The solution was put into use during the 618 promotion in 2023, which coincided with a handover of the R&D team: the new team had no data comparison experience and was handling other business in parallel, so it could not devote full manpower to this work. With the fully automated comparison, R&D manpower was freed up and the efficiency of preparing for the big promotion improved significantly. The remaining manpower is mainly spent by test engineers on maintaining the scheduling tasks of the whole pipeline.

Authors: Wang Henglei, Qi Qi (JD Retail)

Source: JD Cloud Developer Community
