Practice and optimization of Apache Kylin

 

Since 2016, the Meituan to-store catering technical team has used Apache Kylin as its OLAP engine. As the business grew rapidly, efficiency problems appeared at both the build and query levels. The team therefore started by interpreting Kylin's principles, dismantled the build process layer by layer, and worked out an implementation route that moves from a single pilot to the whole system. This article summarizes that experience, in the hope of helping more technical teams in the industry improve the efficiency of their data output.

 

Background

The sales business is characterized by large scale, many verticals, and dense demand. The Meituan to-store catering sales system DynaSky ("Qingtian"), as the main carrier of sales data support, not only covers a wide range of business, but also faces very complex technical scenarios: data display and permission control across multiple organizational levels, more than 1/3 of the indicators requiring precise deduplication, and peak query volume reaching the tens-of-thousands level. Against this business background, building a stable and efficient OLAP engine to help analysts make quick decisions has become the core goal of DynaSky.

Apache Kylin is an open-source OLAP engine built on the Hadoop big data platform. It uses multi-dimensional cube pre-computation, trading space for time, to bring query latency down to the sub-second level, which greatly improves the efficiency of data analysis and provides convenient and flexible querying. Based on this fit between the technology and the business, DynaSky adopted Kylin as its OLAP engine in 2016, and in the following years the system effectively supported our data analysis needs.

In 2020, Meituan's to-store catering business developed rapidly and its data indicators grew just as fast. The Kylin-based system ran into serious efficiency problems in both building and querying, which affected data analysis and decision-making and became a major obstacle to improving the user experience. Over about half a year, the technical team carried out a series of optimization iterations on Kylin, covering dimension pruning, model design, resource adaptation, and more, raising the SLA attainment of sales performance data from 90% to 99.99%. Based on this practice, we have distilled a set of technical solutions covering "principle interpretation", "process dismantling", and "implementation route". We hope these experiences and summaries can help more technical teams in the industry improve the efficiency of data output and business decision-making.

Problems and goals

Sales, as the bridge connecting the platform and merchants, includes two business models: on-site store visits and telephone visits. It is managed level by level along two organizational structures, sales regions and personnel organizations, and all analyses need to be viewed along both. Under the requirements of consistent indicators and timely data output, we designed the data architecture around Kylin's pre-computation idea. As shown below:

Kylin's formula for the number of dimension combinations is 2^N (where N is the number of dimensions). The official dimension-pruning mechanisms can reduce the number of combinations, but due to the particularities of the catering business, a single task still has 1000+ combinations that cannot be pruned. When requirements iterate or the personnel and sales-region organizations change, all historical data has to be rebuilt, which consumes a lot of resources and takes an extremely long time. The architecture designed around business division keeps data output decoupled and indicator calibers consistent, but it puts a lot of pressure on Kylin builds, which in turn leads to heavy resource usage and long build times. Based on this business status, we summarized the problems in Kylin's MOLAP mode as follows:

  • Efficiency problems are hard to pinpoint (implementation principle): the build process has many steps that are strongly coupled, so it is difficult to find the root cause of a problem from its symptoms alone, and thus difficult to solve it effectively.

  • The build engine was never upgraded (build process): historical tasks still use MapReduce as the build engine and have not been switched to Spark, which builds more efficiently.

  • Unreasonable resource usage (build process): the platform's default dynamic resource adaptation lets small tasks apply for large amounts of resources, and unreasonable data partitioning produces a large number of small files, resulting in resource waste and a large amount of task waiting.

  • The core task takes too long (implementation route): the source table for DynaSky's sales transaction performance indicators has a large data volume, many dimension combinations, and a high expansion rate, so the daily build takes more than 2 hours.

  • SLA quality is not up to standard (implementation route): the overall SLA attainment rate has not reached the expected goal.

After carefully analyzing the problems and setting the overall goal of improving efficiency, we classified Kylin's build process and dismantled the core links where efficiency could be improved, using "principle interpretation", "layer-by-layer dismantling", and "from point to surface" to achieve a two-way reduction in both build time and resource usage. The specific quantitative targets are shown in the figure below:

Optimization premise: principle interpretation

To solve the problem that efficiency improvements were hard to locate and attribute, we interpreted Kylin's build principles, including the pre-computation idea and the by-layer algorithm.

Pre-computation

Kylin combines all dimensions in every possible way, pre-computes the indicators that may be used in multi-dimensional analysis, and saves the results as a Cube. Suppose we have 4 dimensions: each node of the Cube (called a Cuboid) is a different combination of those 4 dimensions, each combination defines one set of dimensions to analyze (i.e. one group by), and the aggregated indicator results are saved on each Cuboid. At query time, Kylin finds the Cuboid corresponding to the SQL, reads the indicator values, and returns them. As shown below:
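
To make the idea concrete, here is a minimal sketch (not Kylin code; the toy fact table, dimension names, and measure are made up for illustration) that enumerates all 2^N cuboids for N = 4 dimensions and pre-aggregates one measure for each:

```python
from itertools import combinations

# Toy fact table: 4 dimensions (A, B, C, D) and one measure.
rows = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1", "amount": 10},
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2", "amount": 20},
    {"A": "a2", "B": "b1", "C": "c2", "D": "d1", "amount": 30},
]
dims = ["A", "B", "C", "D"]

# Enumerate every cuboid: all subsets of the dimension set, 2^4 = 16 in total.
cube = {}
for r in range(len(dims) + 1):
    for cuboid in combinations(dims, r):
        agg = {}
        for row in rows:
            key = tuple(row[d] for d in cuboid)   # the "group by" key of this cuboid
            agg[key] = agg.get(key, 0) + row["amount"]
        cube[cuboid] = agg

# A query like "SELECT A, SUM(amount) ... GROUP BY A" hits the ("A",) cuboid directly.
print(len(cube))        # 16 cuboids
print(cube[("A",)])     # {('a1',): 30, ('a2',): 30}
```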

By-layer algorithm

An N-dimensional Cube is composed of 1 N-dimensional sub-cube, N (N-1)-dimensional sub-cubes, N*(N-1)/2 (N-2)-dimensional sub-cubes, ..., N 1-dimensional sub-cubes, and 1 0-dimensional sub-cube, for a total of 2^N sub-cubes. For example, with N = 4 this is 1 + 4 + 6 + 4 + 1 = 16 = 2^4. In the by-layer algorithm, the number of dimensions decreases layer by layer, and each layer (except the first, which is aggregated from the raw data) is computed from the results of the layer above it.

For example, the result of group by [A,B] can be aggregated from the result of group by [A,B,C] by removing C, which avoids repeated computation from the raw data. When the 0-dimensional Cuboid has been calculated, the whole Cube is complete. As shown below:
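
Continuing the toy example above, a minimal sketch of one by-layer step (again illustrative, not Kylin's implementation): the (A, B) cuboid is derived from the already computed (A, B, C) cuboid instead of rescanning the fact table:

```python
# Parent cuboid already computed in the previous layer: group by (A, B, C).
parent = {
    ("a1", "b1", "c1"): 10,
    ("a1", "b2", "c1"): 20,
    ("a2", "b1", "c2"): 30,
}

# Child cuboid group by (A, B): drop the last dimension (C) and re-aggregate.
child = {}
for (a, b, _c), amount in parent.items():
    child[(a, b)] = child.get((a, b), 0) + amount

print(child)  # {('a1', 'b1'): 10, ('a1', 'b2'): 20, ('a2', 'b1'): 30}
```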

Process analysis: layer-by-layer dismantling

After understanding Kylin's underlying principles, we focused the optimization on five links: engine selection, data reading, dictionary building, layered building, and file conversion. We then refined the problems, ideas, and goals of each stage, and in the end reduced both build time and computing resources. The details are shown in the table below:

Build engine selection

We have now gradually switched the build engine to Spark. DynaSky adopted Kylin as its OLAP engine as early as 2016; historical tasks were never switched and had only been tuned at the MapReduce parameter level. In fact, Kylin has officially supported Spark as a build engine since 2017 (see the official documentation on enabling the Spark build engine), with build efficiency 1 to 3 times higher than MapReduce. The engine can be switched in the Cube design options, as shown in the following figure:

Read source data

Kylin reads the source data in Hive as an external table; the table's data files (stored on HDFS) serve as the input of the next subtask, and small-file problems may arise in this process. At present the number of files in Kylin's upstream wide table is reasonable, so there is no need to configure merging upstream; forcing a merge there would only increase the processing time of the upstream source table.

When a project requirement needs historical data to be refreshed or a dimension combination to be added, all data must be rebuilt, and history is usually refreshed month by month. The loaded partitions contain too many small files, which makes this process slow. By overriding the configuration at the Kylin level, we merge small files, reduce the number of Maps, and effectively improve reading efficiency.

Merge small files in the source table: merge the small files of the Hive source table to control the number of parallel tasks per job. The adjusted parameters are shown in the following table:

Override parameters at the Kylin level: set the file size read by each Map. The adjusted parameters are shown in the following table:
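
The original parameter tables are images and are not reproduced here. As a rough sketch of the kind of knobs involved (these are standard Hive and MapReduce settings, not the article's exact table; the values are placeholders and the Kylin-side override prefix should be verified against your own Kylin version):

```python
# Illustrative only: typical Hive-side small-file merge settings and MapReduce
# split-size settings. Verify names and defaults against your Hive/Hadoop versions.
hive_merge_settings = {
    "hive.merge.mapfiles": "true",           # merge small files after map-only jobs
    "hive.merge.mapredfiles": "true",        # merge small files after MapReduce jobs
    "hive.merge.size.per.task": 256 * 1024 * 1024,       # target merged file size (bytes)
    "hive.merge.smallfiles.avgsize": 128 * 1024 * 1024,  # merge when avg file size is below this
}

mapreduce_split_settings = {
    # Larger split sizes mean fewer Map tasks when reading many small files.
    "mapreduce.input.fileinputformat.split.minsize": 256 * 1024 * 1024,
    "mapreduce.input.fileinputformat.split.maxsize": 512 * 1024 * 1024,
}
```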

Build a dictionary

Kylin computes the dimension values that appear in the Hive table, builds a dimension dictionary that maps each dimension value to a code, and saves the statistics, which reduces HBase storage. Each combination of dimensions is called a Cuboid; in theory, an N-dimensional Cube has 2^N dimension combinations.
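
A minimal sketch of the idea behind dictionary encoding (illustrative only, not Kylin's dictionary implementation): each distinct dimension value is mapped to a compact integer code, and the codes rather than the raw strings are what get stored:

```python
# Distinct values of one dimension column, e.g. city names from the Hive table.
values = ["beijing", "shanghai", "beijing", "shenzhen", "shanghai"]

# Build the dictionary: sort the distinct values and assign each an integer code.
dictionary = {v: code for code, v in enumerate(sorted(set(values)))}
print(dictionary)   # {'beijing': 0, 'shanghai': 1, 'shenzhen': 2}

# Encode a column's values before storing them; decode on the way back out.
encoded = [dictionary[v] for v in values]
print(encoded)      # [0, 1, 0, 2, 1]
```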

Viewing the number of combinations

After dimension pruning, the actual number of dimension combinations being computed is difficult to work out by hand. You can see the exact number in the execution log (the screenshot shows the log of the last Reduce of the "extract fact table distinct columns" step). As shown below:

Global dictionary dependency

DynaSky has many business scenarios that require precise deduplication. When there are multiple global dictionary columns, column dependencies can be configured between them. For example, when the indicators "number of stores" and "number of online stores" exist at the same time, setting a column dependency reduces the computation on these ultra-high-cardinality dimensions. As shown below:

Computing resource allocation

When the indicators include multiple precise-deduplication measures, computing resources can be increased appropriately to improve the build efficiency of high-cardinality dimensions. The parameter settings are shown in the following table:

Layered construction

This process is the core of a Kylin build. After switching to the Spark engine, only the by-layer algorithm is used by default; the automatic choice between the by-layer algorithm and the fast algorithm is no longer made. When executing the by-layer algorithm, Spark computes layer by layer from the bottom Cuboid up until the topmost Cuboid (equivalent to executing a query without group by). Each layer's result data is cached in memory, so each subsequent layer skips reading data from storage and relies directly on the cached data of the layer above, which greatly improves execution efficiency. The details of the Spark execution process are as follows.
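
A rough PySpark sketch of this pattern (illustrative and greatly simplified compared with Kylin's actual Spark cubing code): each layer's RDD is persisted, and the next layer is derived from the cached parent rather than from the source data:

```python
from pyspark import SparkContext, StorageLevel

sc = SparkContext(appName="by-layer-sketch")

# Layer 0: the base cuboid, keyed by all dimensions (A, B, C) -> measure.
base = sc.parallelize([
    (("a1", "b1", "c1"), 10),
    (("a1", "b2", "c1"), 20),
    (("a2", "b1", "c2"), 30),
])

layer = base.persist(StorageLevel.MEMORY_AND_DISK_SER)  # cache this layer's result

# Next layer: drop the last dimension and re-aggregate from the cached parent,
# not from the original source data.
next_layer = (layer
              .map(lambda kv: (kv[0][:-1], kv[1]))      # remove dimension C from the key
              .reduceByKey(lambda x, y: x + y)
              .persist(StorageLevel.MEMORY_AND_DISK_SER))

print(next_layer.collect())   # [(('a1', 'b1'), 10), (('a1', 'b2'), 20), (('a2', 'b1'), 30)] (order may vary)
```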

Job stage

The number of jobs equals the number of layers in the by-layer algorithm tree; Spark outputs the result data of each layer as one job. As shown below:

Stage

Each job corresponds to two stages: reading the cached data of the layer above, and caching the result data after this layer's computation. As shown below:

Task parallelism setting

Kylin calculates the task parallelism from its estimate of the Cuboid data size of each layer (dimension pruning can reduce the number of combinations and hence the Cuboid data size and improve build efficiency, but that is not covered in detail here) and from the data-split parameters. The calculation is as follows, and a small sketch of the formula follows the list:

  • Formula for the number of tasks: Max(Min(MapSize / cut-mb, MaxPartition), MinPartition), i.e. MapSize / cut-mb clamped between MinPartition and MaxPartition, where:

    • MapSize: the size of the Cuboid combination data built at each layer, i.e. Kylin's estimate of the size of that layer's dimension combinations.

    • cut-mb: the size of each data split, which controls the number of parallel tasks; it can be set with the kylin.engine.spark.rdd-partition-cut-mb parameter.

    • MaxPartition: the maximum number of partitions, set with the kylin.engine.spark.max-partition parameter.

    • MinPartition: the minimum number of partitions, set with the kylin.engine.spark.min-partition parameter.

  • Number of output files: after each task finishes, it compresses its result data and writes it to HDFS as the input of the file-conversion process; the total number of files is the sum of the tasks' output files.
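
A minimal sketch of the parallelism calculation (parameter names follow the list above; the example values are placeholders, not Kylin defaults):

```python
def spark_cubing_partitions(map_size_mb: float,
                            cut_mb: float,
                            min_partition: int,
                            max_partition: int) -> int:
    """Number of parallel tasks for one layer: MapSize / cut-mb,
    clamped between MinPartition and MaxPartition."""
    return int(max(min(map_size_mb / cut_mb, max_partition), min_partition))

# Example: a layer estimated at 20 GB with 10 MB splits wants 2048 tasks,
# unless capped by max-partition.
print(spark_cubing_partitions(map_size_mb=20 * 1024, cut_mb=10,
                              min_partition=1, max_partition=5000))   # 2048
print(spark_cubing_partitions(map_size_mb=20 * 1024, cut_mb=10,
                              min_partition=1, max_partition=1000))   # 1000
```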

Resource application calculation

The platform applies for computing resources dynamically by default. The capacity of a single Executor is: 1 logical CPU (hereinafter "CPU"), 6 GB of on-heap memory, and 1 GB of off-heap memory. The calculation is as follows:

  • CPU = kylin.engine.spark-conf.spark.executor.cores * the number of Executors actually applied for.

  • Memory = (kylin.engine.spark-conf.spark.executor.memory + spark.yarn.executor.memoryOverhead) * the number of Executors actually applied for.

  • Execution capacity of a single Executor = kylin.engine.spark-conf.spark.executor.memory / kylin.engine.spark-conf.spark.executor.cores, i.e. the memory available to 1 CPU during execution.

  • Maximum number of Executors = kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors; the platform applies dynamically by default, and this parameter caps the number that can be requested.

When resources are sufficient, if a single stage applies for 1000 parallel tasks, the resources required are 1000 CPUs and 7000 GB of memory, i.e. CPU: 1 * 1000 = 1000; memory: (6 + 1) * 1000 = 7000 GB.
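
A small sketch of this arithmetic (the 6 GB on-heap and 1 GB off-heap figures are the platform defaults stated above):

```python
def requested_resources(parallel_tasks: int,
                        cores_per_executor: int = 1,
                        executor_memory_gb: float = 6.0,
                        memory_overhead_gb: float = 1.0):
    """Total CPU and memory requested when one task runs per core."""
    executors = parallel_tasks // cores_per_executor
    total_cpu = cores_per_executor * executors
    total_memory_gb = (executor_memory_gb + memory_overhead_gb) * executors
    return total_cpu, total_memory_gb

print(requested_resources(1000))   # (1000, 7000.0)
```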

Resource rationalization and adaptation

Because of the characteristics of the by-layer algorithm and Spark's compression during actual execution, the partition data actually loaded by each task is much smaller than the configured value. This leads to excessively high task parallelism, heavy resource usage, and a large number of small files that affect the downstream file-conversion process. Splitting the data reasonably therefore becomes the key point of the optimization. The Kylin build log shows the estimated Cuboid data size and the number of partitions (equal to the number of tasks actually generated in the stage) for each layer. As shown below:

Combined with the Spark UI, you can check the actual execution status, adjust the memory request, and provision just the resources needed for execution, reducing waste.

1. The minimum overall resource request should be larger than the sum of the cached data of the two largest (Top 1 and Top 2) stage layers, to ensure all cached data fits in memory. As shown below:

Calculation formula: sum of the cached data of the Top 1 and Top 2 stage layers < kylin.engine.spark-conf.spark.executor.memory * kylin.engine.spark-conf.spark.memory.fraction * spark.memory.storageFraction * maximum number of Executors

2. The actual memory and CPU required by a single task (each task uses 1 CPU) should be less than the execution capacity of a single Executor. As shown below:

Calculation formula: actual memory required by a single task < kylin.engine.spark-conf.spark.executor.memory * kylin.engine.spark-conf.spark.memory.fraction * spark.memory.storageFraction / kylin.engine.spark-conf.spark.executor.cores. The parameter description is shown in the following table:
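
A rough sketch of these two sanity checks (the values below are placeholders; spark.memory.fraction and spark.memory.storageFraction default to 0.6 and 0.5 in Spark 2.x, but use your own cluster's settings):

```python
def storage_memory_gb(executor_memory_gb: float,
                      memory_fraction: float = 0.6,
                      storage_fraction: float = 0.5) -> float:
    """Approximate storage memory available in one Executor."""
    return executor_memory_gb * memory_fraction * storage_fraction

def check_cluster_can_cache(top1_plus_top2_cache_gb: float,
                            executor_memory_gb: float,
                            max_executors: int) -> bool:
    # Check 1: total storage memory across Executors must cover the two largest cached layers.
    return top1_plus_top2_cache_gb < storage_memory_gb(executor_memory_gb) * max_executors

def check_task_fits_in_core(task_memory_gb: float,
                            executor_memory_gb: float,
                            executor_cores: int) -> bool:
    # Check 2: one task's memory need must fit in a single core's share of storage memory.
    return task_memory_gb < storage_memory_gb(executor_memory_gb) / executor_cores

print(check_cluster_can_cache(900, executor_memory_gb=6, max_executors=600))  # True  (6*0.6*0.5*600 = 1080 GB)
print(check_task_fits_in_core(2.0, executor_memory_gb=6, executor_cores=1))   # False (6*0.6*0.5 = 1.8 GB)
```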

File conversion

Kylin converts the built Cuboid files into HFiles in HTable format and associates them with the HTable via BulkLoad, which greatly reduces the load on HBase. This process is completed by a single MapReduce job, and its number of Maps equals the number of output files from the layered-build phase. The log looks like this:

At this stage, computing resources can be requested according to the actual size of the input data files (visible in the MapReduce logs) to avoid waste.

Calculation formula: Map-phase resource request = kylin.job.mr.config.override.mapreduce.map.memory.mb * the number of output files from the layered-build phase. The specific parameters are shown in the following table:
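
A very small sketch of that estimate (placeholder values; mapreduce.map.memory.mb is in MB):

```python
def file_conversion_memory_gb(map_memory_mb: int, layered_build_output_files: int) -> float:
    """Total memory requested by the file-conversion MapReduce job:
    one Map per output file of the layered-build phase."""
    return map_memory_mb * layered_build_output_files / 1024

# e.g. 4096 MB per Map and 500 output files -> 2000 GB requested in total.
print(file_conversion_memory_gb(4096, 500))   # 2000.0
```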

Implementation route: from point to surface

Transaction pilot practice

Based on the interpretation of Kylin's principles and the layer-by-layer dismantling of the build process, we selected the core sales-transaction tasks for a pilot. As shown below:

Comparison of practical results

We applied the optimizations to the core sales-transaction tasks and compared the actual resource usage and execution time before and after adjustment, finally achieving the goal of reducing both. As shown below:

Overall results

Overall resource usage

DynaSky currently has 20+ Kylin tasks. After half a year of continuous optimization and iteration, comparing the monthly average CU usage of the Kylin resource queue and the CU usage of Pending tasks, resource consumption has dropped significantly for the same set of tasks. As shown below:

SLA overall achievement rate

After the overall point-to-surface optimization, DynaSky reached a 100% SLA attainment rate in June 2020. As shown below:

Outlook

Apache Kylin officially became a top-level project of the Apache Software Foundation in November 2015. It took only 13 months to go from open source to top-level Apache project, and it was the first top-level Apache project contributed entirely by a Chinese team.

Meituan currently uses a relatively stable V2.0 version. After nearly 4 years of use and accumulation, the to-store catering technical team has built up a lot of experience in optimizing query performance and build efficiency; this article mainly explains the resource-adaptation methods for the Spark build process. It is worth mentioning that Kylin officially released V3.1 in July 2020, introducing Flink as a build engine and unifying the core build process on Flink, covering the data-reading, dictionary-building, layered-building, and file-conversion phases, which together account for more than 95% of the overall build time. This version upgrade also greatly improves Kylin's build efficiency. For details, see: Flink Cube Build Engine.

Looking back at the evolution of Kylin's build engine, from MapReduce to Spark and now Flink, the iteration of build tools has always moved toward better mainstream engines, and the Kylin community has many active and outstanding code contributors who keep expanding Kylin's ecosystem and adding new features, which is well worth learning from. Finally, the Meituan to-store catering technical team once again expresses its gratitude to the Apache Kylin project team.

About the Author

Yue Qing joined Meituan in 2019 and is an engineer in the to-store catering R&D center.

----------  END  ----------

Job Offers

The accommodation and ticketing data intelligence group of Meituan's to-store business group is recruiting. The team strengthens business competitiveness across supply, control, selection, and sales, handling 100,000-level QPS and billion-level data analysis with a complete business closed loop. There is currently a large amount of open headcount; if you are interested, please send an email to [email protected] and we will contact you as soon as possible.

Maybe you also want to read

Meituan Distribution Data Governance Practice

Meituan OCTO trillion-level data center computing engine technology analysis

General Order: Practice of Data Security Platform Construction
