Enhancement and Practice of Flink SQL in the Meituan Real-time Data Warehouse

Abstract: This article is compiled from a talk by Dong Jianhui and Zhang Bin, Data System R&D Engineers at Meituan, in the platform construction session of Flink Forward Asia 2022. The content is divided into five parts:

  1. Flink SQL at Meituan
  2. Fine-grained configuration of SQL jobs
  3. Supporting state recovery after SQL job changes
  4. Building SQL correctness troubleshooting capabilities
  5. Future outlook


1. Flink SQL at Meituan

At present, Flink SQL is used by more than 100 business teams at Meituan, and the number of SQL jobs has exceeded 5,000, accounting for 35% of all Flink jobs, with a year-on-year growth rate of 115%.

The rapid growth of SQL jobs has brought us many new problems and challenges, mainly including the following:

  • SQL jobs cannot adjust state TTL, parallelism, and other configurations at a fine-grained level, which wastes resources.
  • After its logic is modified, a SQL job cannot be restored from its previous state.
  • Data correctness issues in SQL jobs are difficult to troubleshoot.

These problems and how to solve them are described below.

2. Fine-grained configuration of SQL jobs

Currently, Flink SQL does not support fine-grained settings for TTL, the partitioning relationship between operators, or parallelism. TTL is the most pressing case: in a DataStream job, users can customize the state retention time (TTL) of each operator as needed, but Flink SQL only supports a job-level TTL, which leads to a certain degree of resource waste. Let's look at two concrete business examples.
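
For reference, the only knob that vanilla Flink SQL exposes today is the job-level option table.exec.state.ttl; a minimal sketch of setting it (assuming a recent Flink release with the Table API) looks like this:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class JobLevelTtlExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Job-level state TTL: every stateful operator in this job (deduplication,
        // join, aggregation, ...) keeps its state for one day.
        tEnv.getConfig().getConfiguration().setString("table.exec.state.ttl", "1 d");

        // All statements executed from this environment share the single TTL above;
        // vanilla Flink SQL has no per-operator knob.
        // tEnv.executeSql("INSERT INTO dws_sink SELECT ... FROM dedup_view GROUP BY ...");
    }
}
```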

In the first scenario, different operators need different state retention times. For example, the logic of this job is deduplication followed by a join and an aggregation. The deduplication operator only needs a TTL of 1 hour, while the aggregation operator requires a full day of data. With the current job-level setting, the TTL can only be set to 1 day, resulting in resource waste.

In the second scenario, the business cycles of the left and right streams of a Join operator are inconsistent, and some non-public dimension tables need to use a Regular Join so that rollbacks of the dimension table can be perceived. For such scenarios, the current workaround is to split the data into hot and cold parts, join the real-time stream with the dimension table separately for each part, and then deduplicate the results, which greatly increases development and O&M costs.

The difficulty is that the state used by a SQL job is a black box to users, so our goal is to let users perceive and modify TTL with a low barrier to entry.

The solution we finally adopted is to provide an external service called graph-service, also known as the editable execution plan. Before a job goes online, this service statically analyzes the job topology, collects and displays the TTL information, and lets users edit it. When a user modifies the TTL of an operator or a stream, the new TTL configuration is passed to the Flink engine as an engine parameter to enhance the execution plan. The two core processes here are collecting TTL and enhancing the execution plan.

First of all, let's look at collecting TTL. Two issues need to be considered.

  • The first issue is at which stage to collect TTL information. Because Flink's TTL is bound to state, it is only known when the concrete state descriptor is created, and the Transformation layer has no knowledge of the job's state. We therefore decided to collect TTL information during the translation from ExecNode to Transformation.
  • The second issue is how to identify a TTL. We add an identifier to ExecNode analogous to the Transformation ID, and introduce a work stack to track the ExecNode being translated. Before calling the translateToPlanInternal method of each ExecNode, we obtain an auto-incremented ExecNode ID and push it onto the work stack. When the translation of the ExecNode completes, it is popped from the top of the stack, and the mapping from Transformation to ExecNode and then to TTL is established (a sketch of this bookkeeping follows this list).
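
The following Java sketch illustrates the work-stack bookkeeping described above; the class and method names are hypothetical and only meant to show the idea, not the actual planner code:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: each ExecNode gets an auto-incremented ID, and the node
// currently being translated is kept on a stack so that state-creation code deep
// inside translateToPlanInternal can find out which ExecNode it is working for.
public final class ExecNodeTranslationContext {

    private static final Deque<Integer> WORK_STACK = new ArrayDeque<>();
    // execNodeId -> user-edited TTL in milliseconds (collected/edited via graph-service)
    private static final Map<Integer, Long> TTL_OVERRIDES = new HashMap<>();
    private static int nextExecNodeId = 0;

    /** Called right before ExecNode#translateToPlanInternal. */
    public static int push() {
        int id = nextExecNodeId++;
        WORK_STACK.push(id);
        return id;
    }

    /** Called right after translateToPlanInternal returns. */
    public static void pop() {
        WORK_STACK.pop();
    }

    /** Which ExecNode is currently being translated? */
    public static Integer currentExecNodeId() {
        return WORK_STACK.peek();
    }

    public static void registerTtlOverride(int execNodeId, long ttlMillis) {
        TTL_OVERRIDES.put(execNodeId, ttlMillis);
    }

    /** Returns the user-edited TTL for the current node, or the job-level default. */
    public static long resolveTtl(long defaultTtlMillis) {
        Integer id = currentExecNodeId();
        return id == null ? defaultTtlMillis : TTL_OVERRIDES.getOrDefault(id, defaultTtlMillis);
    }
}
```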

Next, let's look at how to enhance the execution plan. The figure above is a simulated job topology. Through analysis, we collect the TTL information of the Join operator and the aggregation operator, as well as their Transformation IDs and ExecNode IDs, and open them to users for editing. Suppose the user changes the TTL of the Join operator from 1 day to 1 hour.

Then we pass the new TTL configuration into the engine's TableConfig. When the Join operator's ExecNode calls translateToPlanInternal and creates its state, it reads the top of the work stack to obtain the ExecNode currently being translated and therefore its ExecNode ID, and then reads the corresponding TTL from TableConfig, overriding the default configuration.
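
Building on the previous sketch, the user-edited TTL could be applied where the operator creates its state descriptor, roughly as follows; the hook itself is hypothetical, but StateTtlConfig and ValueStateDescriptor are Flink's public APIs:

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.api.common.typeinfo.Types;

public class TtlOverrideExample {

    // Hypothetical hook: called where the Join operator creates its state descriptor.
    public static ValueStateDescriptor<Long> createJoinStateDescriptor(long jobLevelTtlMillis) {
        // Prefer the user-edited TTL of the ExecNode on top of the work stack
        // over the job-level default (see the previous sketch).
        long effectiveTtl = ExecNodeTranslationContext.resolveTtl(jobLevelTtlMillis);

        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("join-left-state", Types.LONG);
        if (effectiveTtl > 0) {
            descriptor.enableTimeToLive(
                    StateTtlConfig.newBuilder(Time.milliseconds(effectiveTtl)).build());
        }
        return descriptor;
    }
}
```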

The figure above shows the test results for the job in the first scenario.

First, without the fine-grained capability, the job-level TTL defaults to 1 day. The average hourly Container CPU usage rises to 107% during peak hours and is still 67.3% during off-peak hours, and the operators experience backpressure several times.

Then, with the fine-grained TTL capability, we set the deduplication operator's TTL to 1 hour while keeping 1 day for the other stateful operators. The average hourly Container CPU usage is only 14.8% during peak hours and 7.54% during off-peak hours, with no backpressure at all. In addition, the checkpoint size drops from 8.54 GB to 1.8 GB.

Next come additional optimizations provided by the editable execution plan. The first is the partition relationship between operators. Too many connections between upstream and downstream operators occupy a large amount of network buffer memory, which affects the normal start and stop of the job. Based on the editable execution plan, we can manually change a Rebalance edge to Rescale.

For example, in the case above, the upstream operator has a parallelism of 2,000 while the downstream sink operator has a parallelism of only 1,000. In this scenario, Flink SQL generates a Rebalance connection by default, which requires 2,000 × 1,000 = 2,000,000 logical connections.

After manually changing Rebalance to Rescale through the editable execution plan, only 2,000 connections are needed, which greatly reduces the network buffer memory requirement.
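
Flink SQL itself does not expose this choice, which is exactly why the editable execution plan is needed; for intuition only, the equivalent switch in the DataStream API would look like the sketch below (the parallelism values simply mirror the example above and are not meant for a local run):

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RescaleVsRebalance {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // 2000 upstream subtasks feeding 1000 downstream subtasks:
        // rebalance(): every upstream subtask connects to every downstream subtask
        //              -> 2000 * 1000 = 2,000,000 logical channels.
        // rescale():   each upstream subtask only connects to a local subset
        //              -> 2 upstream subtasks per downstream subtask = 2,000 channels.
        env.fromSequence(0, 1_000_000)
                .map(x -> x)
                .returns(Types.LONG)
                .setParallelism(2000)
                .rescale()                 // instead of .rebalance()
                .print()
                .setParallelism(1000);

        env.execute("rescale-vs-rebalance");
    }
}
```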

In addition, we also provide the following three capabilities based on the editable execution plan:

  • Supports modifying operator concurrency individually and recovering from state.
  • Supports modifying the slotSharingGroup of operators individually.
  • Supports modifying the ChainStrategy of operators and recovering from state.

3. Supporting state recovery after SQL job changes

At present, the state recovery mechanism of Flink SQL is quite strict: in many scenarios a changed job cannot be restored from its original state, resulting in a lot of wasted resources and O&M cost. To address this, we carried out a detailed scenario analysis of the state-migration problem domain.

Flink SQL job changes can be divided into two types of scenarios: version upgrades (Upgrade) and job changes within the same version (Migration). We focus on Migration scenarios in the real-time data warehouse, which can be roughly divided into three types: Graph Migration, Operator Migration, and Savepoint Migration.

  • Graph Migration: the change only occurs at the topology level of the pipeline, that is, the attributes of nodes and edges change. This scenario is handled by the editable execution plan introduced in the previous part.
  • Operator Migration: the change only occurs at the operator-state level and the DAG does not change. Typical cases are adding aggregation metrics or associating new attributes.
  • Savepoint Migration: the job changes at both the DAG level and the operator-state level. A typical scenario is initializing state for a new job with offline data.

Savepoint Migration is essentially a composite of Graph Migration and Operator Migration, so this sharing focuses on the Operator Migration scenario. By combining our capabilities for Graph Migration and Operator Migration, we plan to improve Savepoint Migration in subsequent work.

Combined with Meituan's current business needs, the most urgent scenarios are adding associated attributes to fact tables in wide-table processing at the DWD layer, and adding aggregation metrics at the DWS layer, i.e., the Operator Migration scenario discussed above.

First, we define a data structure named KeyedStateMetadata to describe the metadata of each keyed state. When an operator is created, we inject its state metadata (KeyedStateMetadata) into a static context and analyze it together with the user's job execution graph obtained earlier. In the subsequent state-migration process, we combine the state metadata with the State Processor API to create a new savepoint and complete the migration.

The figure above is an example of migrating an aggregation operator. First, we collect the KeyedStateMetadata of the aggregation operator, then read the old savepoint and use the State Processor API to convert the state in it. Finally, we dump the new state into a new savepoint and let the job resume from it.
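
The sketch below shows what such a migration can look like with Flink's (DataSet-based) State Processor API, using a simplified tuple-style accumulator instead of Flink SQL's internal RowData state; the operator uid, paths, and state names are placeholders:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

public class AggStateMigration {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // 1. Read the old aggregation state (key, sumA) from the old savepoint.
        ExistingSavepoint oldSavepoint =
                Savepoint.load(env, "hdfs:///savepoints/old", new HashMapStateBackend());
        DataSet<Tuple2<String, Long>> oldAcc =
                oldSavepoint.readKeyedState("agg-operator-uid", new AccReader());

        // 2. Widen the accumulator: the new job adds a second metric, initialized to 0.
        DataSet<Tuple3<String, Long, Long>> newAcc =
                oldAcc.map(t -> Tuple3.of(t.f0, t.f1, 0L))
                      .returns(Types.TUPLE(Types.STRING, Types.LONG, Types.LONG));

        // 3. Bootstrap the widened state and write it out as a brand-new savepoint.
        BootstrapTransformation<Tuple3<String, Long, Long>> bootstrap =
                OperatorTransformation.bootstrapWith(newAcc)
                        .keyBy(new KeySelector<Tuple3<String, Long, Long>, String>() {
                            @Override
                            public String getKey(Tuple3<String, Long, Long> t) {
                                return t.f0;
                            }
                        })
                        .transform(new AccWriter());
        oldSavepoint
                .removeOperator("agg-operator-uid")
                .withOperator("agg-operator-uid", bootstrap)
                .write("hdfs:///savepoints/new");

        env.execute("migrate-agg-state");
    }

    /** Reads the old single-metric accumulator. */
    static class AccReader extends KeyedStateReaderFunction<String, Tuple2<String, Long>> {
        ValueState<Long> sumA;

        @Override
        public void open(Configuration parameters) {
            sumA = getRuntimeContext().getState(new ValueStateDescriptor<>("sumA", Types.LONG));
        }

        @Override
        public void readKey(String key, Context ctx, Collector<Tuple2<String, Long>> out)
                throws Exception {
            out.collect(Tuple2.of(key, sumA.value()));
        }
    }

    /** Writes the widened two-metric accumulator into the new savepoint. */
    static class AccWriter
            extends KeyedStateBootstrapFunction<String, Tuple3<String, Long, Long>> {
        ValueState<Long> sumA;
        ValueState<Long> sumB;

        @Override
        public void open(Configuration parameters) {
            sumA = getRuntimeContext().getState(new ValueStateDescriptor<>("sumA", Types.LONG));
            sumB = getRuntimeContext().getState(new ValueStateDescriptor<>("sumB", Types.LONG));
        }

        @Override
        public void processElement(Tuple3<String, Long, Long> value, Context ctx)
                throws Exception {
            sumA.update(value.f1);
            sumB.update(value.f2);
        }
    }
}
```

In the real system, the reader and writer are driven by the collected KeyedStateMetadata rather than hand-written per job.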

Let's introduce the structure of KeyedStateMetadata. An operator may have one or more keyed states, and each keyed state corresponds to one KeyedStateMetadata. For each KeyedStateMetadata, the stored metadata includes the state name, the state data types, the TTL configuration, and a custom StateContext interface. StateContext is an interface used to detect whether two state schemas are compatible.

A concrete example is shown on the right side of the figure above. For AppendOnlyTopNFunction, we collect its state name data-state-with-append, its two data types, and its TTL of 1 day, and finally we create a context structure called RankStateContext for it, which is used in the state compatibility check to decide whether its state schema is compatible with another state.
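
To make this more concrete, here is an illustrative sketch of such a metadata structure; the field and method names are our own for exposition and are not Meituan's actual implementation:

```java
import java.time.Duration;
import java.util.List;

// Illustrative sketch of the KeyedStateMetadata structure described above.
public final class KeyedStateMetadata {

    /** Hook that checks whether two state schemas are compatible, e.g. RankStateContext. */
    public interface StateContext {
        boolean isCompatibleWith(StateContext other);
    }

    private final String stateName;            // e.g. "data-state-with-append"
    private final List<Class<?>> stateTypes;   // data types held by the state
    private final Duration ttl;                // e.g. Duration.ofDays(1)
    private final StateContext stateContext;   // custom compatibility hook

    public KeyedStateMetadata(String stateName, List<Class<?>> stateTypes,
                              Duration ttl, StateContext stateContext) {
        this.stateName = stateName;
        this.stateTypes = stateTypes;
        this.ttl = ttl;
        this.stateContext = stateContext;
    }

    public String getStateName() { return stateName; }
    public List<Class<?>> getStateTypes() { return stateTypes; }
    public Duration getTtl() { return ttl; }
    public StateContext getStateContext() { return stateContext; }
}
```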

With state metadata and the State Processor API we solved the technical difficulties of state migration. However, to define a clear boundary for this capability and prevent users from applying it in unsupported scenarios, which would waste resources and O&M effort, we also provide a pre-analysis capability.

The pre-analysis capability can be divided into the following three levels of verification:

  • A business-logic compatibility check at the SQL layer based on the AST.
  • A topology compatibility check based on the editable execution plan.
  • A state-schema compatibility check.

Let me explain why the business-logic compatibility check is needed. The state-schema compatibility check looks at the problem purely from the perspective of the underlying technology and cannot recognize business semantics. For example, the state data types of sum and max are identical, but the two are completely incompatible in business semantics, so we add the business-logic check to ensure that users apply the state migration capability correctly.

Based on these three layers of verification, there are four possible analysis results, with the following technical and business meanings:

  • COMPATIBLE_AS_IS means the job can be restored directly from the old state; the corresponding meaning is that there is no change between the old and new jobs.
  • COMPATIBLE_AFTER_RENAME means the job can be restored from the old state after adjusting Operator IDs. It corresponds to business scenarios such as modifying operator parallelism or adjusting chaining logic.
  • COMPATIBLE_AFTER_MIGRATION means the job cannot be restored directly from the old state and must be restored from a new state produced through state migration. The corresponding scenario is adding metrics or fields to operators such as aggregation, deduplication, and Join, which is also the focus of this sharing.
  • INCOMPATIBLE means the old and new states are completely incompatible and no new state can be produced through migration. The corresponding scenarios are other SQL logic changes, such as swapping the order of metrics or adding/removing operators, as well as scenarios for which we do not yet support state migration; extending this coverage is one of our future directions.

Next, let's walk through the pre-analysis checks in detail.

First, we parse the SQL of the old and new jobs to get their ASTs and verify that the metric business semantics of the new job are backward compatible with the old job; swapping the metric order, for example, is caught at this layer. If an incompatibility is found, the INCOMPATIBLE result is returned directly.

Then we use graph-service to translate both jobs into MTJsonPlan and check whether the number of operators differs and whether the job topology has changed. If either check fails, INCOMPATIBLE is returned. If both pass, we compute the mapping between old and new operators, and for each mapped pair check whether the state or TTL differs and whether the state can be recovered through the migration capability. If any condition is not met, INCOMPATIBLE is returned.

When all the above conditions are met, we check whether the old and new operator states need migration. If migration is required, COMPATIBLE_AFTER_MIGRATION is returned. If the state can be recovered without migration, we further check whether the Operator ID has changed: if it has, COMPATIBLE_AFTER_RENAME is returned; if not, the result is COMPATIBLE_AS_IS, i.e., there is no change between the new and old jobs.
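
Putting the flow together, the decision logic can be sketched as follows, with placeholder types standing in for the AST, MTJsonPlan, and KeyedStateMetadata comparisons:

```java
public class PreAnalyzer {

    /** The four possible pre-analysis results described above. */
    enum Result {
        COMPATIBLE_AS_IS,
        COMPATIBLE_AFTER_RENAME,
        COMPATIBLE_AFTER_MIGRATION,
        INCOMPATIBLE
    }

    /** Placeholder for a parsed job (AST + MTJsonPlan + operator state metadata). */
    static class JobPlan { /* ... */ }

    Result analyze(JobPlan oldJob, JobPlan newJob) {
        // Layer 1: AST-based business-logic compatibility (e.g. metric order swapped).
        if (!businessLogicCompatible(oldJob, newJob)) {
            return Result.INCOMPATIBLE;
        }
        // Layer 2: topology compatibility based on the editable execution plan.
        if (!sameOperatorCountAndTopology(oldJob, newJob)) {
            return Result.INCOMPATIBLE;
        }
        // Layer 3: per-operator state / TTL / schema recoverability.
        if (!allMappedStatesRecoverable(oldJob, newJob)) {
            return Result.INCOMPATIBLE;
        }
        if (anyStateNeedsMigration(oldJob, newJob)) {
            return Result.COMPATIBLE_AFTER_MIGRATION;
        }
        if (anyOperatorIdChanged(oldJob, newJob)) {
            return Result.COMPATIBLE_AFTER_RENAME;
        }
        return Result.COMPATIBLE_AS_IS;
    }

    // The checks below stand in for AST diffing, MTJsonPlan comparison,
    // and KeyedStateMetadata-based schema checks.
    boolean businessLogicCompatible(JobPlan o, JobPlan n) { return true; }
    boolean sameOperatorCountAndTopology(JobPlan o, JobPlan n) { return true; }
    boolean allMappedStatesRecoverable(JobPlan o, JobPlan n) { return true; }
    boolean anyStateNeedsMigration(JobPlan o, JobPlan n) { return false; }
    boolean anyOperatorIdChanged(JobPlan o, JobPlan n) { return false; }
}
```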

The figure above is an example from our product. When the user chooses to create a new savepoint for migration, we run the pre-analysis first. The screenshot shows a failed verification, because the order of two metrics was swapped between the old and new jobs.

Finally, let's summarize the work we have done in the state-migration problem domain. First, what problem did we encounter?

The problem is that the native state recovery capability of Flink SQL is weak and cannot support job changes. In the Meituan real-time data warehouse scenario, when a SQL job needs to add aggregation metrics or associated fields after deduplication, it cannot be restored from its original state, which makes job iteration difficult for users.

In response, we first analyzed the state-migration problem domain in detail and subdivided the scenarios, focusing on the Migration scenario based on Meituan's current situation, and added state-migration capability for scenarios such as adding aggregation metrics, fact-table association, deduplication, and adding sort fields.

On this basis, we provide the pre-analysis capability for the production environment to ensure users apply the state migration capability correctly and to avoid meaningless waste of resources and O&M costs.

4. Building SQL correctness troubleshooting capabilities

Meituan is vigorously promoting the migration of Flink jobs to SQL. In operating Flink SQL jobs we have encountered three types of correctness problems: data loss, out-of-order data, and problems caused by improper use of Flink SQL. Due to the lack of auxiliary tools, such problems could not be located quickly, which affected normal online data production and hindered the SQL-ization of Flink jobs.

How do business users verify the correctness of a Flink SQL job? There are three ways:

  • Compare Flink SQL job results with those of an existing self-developed system.
  • Compare real-time job results with offline job results.
  • Compare the results of dual-running the Flink SQL job on the active and standby links.

When a correctness problem is found through any of these methods, the business faces the following three pain points.

  • The troubleshooting threshold is very high. Business users do not understand the internals of Flink SQL, and platform users do not understand the business, so once a correctness problem occurs there is no obvious place to start.
  • The troubleshooting cycle is long. With no tools available, locating a problem takes days or even longer.
  • It seriously affects normal online data production; users have to migrate SQL jobs back to their original jobs, which greatly hinders the SQL-ization of Flink jobs.

To address these pain points, we need an auxiliary system. A Flink SQL job only produces final results; the intermediate process is not recorded. If we could record how each record flows through each operator, it would be very helpful for troubleshooting, much like playing back surveillance video to locate an incident. Based on this idea, we developed an auxiliary system.

Before developing this system, we did a brief survey of related products in the industry and found that mature Trace systems are used for troubleshooting in distributed scenarios, which is quite similar to locating Flink SQL correctness problems.

Next, let's briefly review how a Trace system works. As shown in the upper-left figure, a complete RPC call passes through five microservices A, B, C, D, and E, and all 16 records of dependent service calls are saved. The Trace system marks each complete call chain with a unique Trace ID, so the whole chain can be associated by that Trace ID.

The lower-left corner of the figure above shows the three essential parts of a complete Trace system: instrumented data reporting, data collection and analysis, and result presentation. These are also the three parts a Flink SQL troubleshooting tool needs.

After a brief understanding of the Trace system, let's compare the capabilities required by the Trace system with the Flink SQL troubleshooting tool.

  • An RPC call in a Trace system has global associativity, but in Flink SQL only local associativity within the same Task can be achieved.
  • Data reporting in a Trace system requires manually instrumenting key methods, but for Flink SQL manual instrumentation is extremely expensive; we want to stay decoupled from the Flink engine to simplify future version upgrades and maintenance.
  • A Trace system handles a huge data volume and can tolerate losing part of the data, whereas the Flink SQL troubleshooting tool must not lose data, and should support printing the input and output of only selected operators.
  • Data in a Trace system is globally correlated and problems can be attributed automatically; in Flink the data has no global correlation and can only be analyzed manually, not attributed automatically.

The comparison shows that a Trace system is not directly suitable for Flink SQL correctness troubleshooting, and customized development based on the above requirements is needed.

Before explaining the Flink SQL troubleshooting system, let's briefly review the knowledge points related to Flink.

The first is Flink SQL operators. Flink SQL involves more than 30 operators; due to space limits, only some of them are listed here. There are many operators, and some are implemented through codegen code generation. Clearly, instrumenting each Flink SQL operator individually would have a very high development cost.

We noticed that these operators share a common feature: they all inherit from AbstractStreamOperator, and this class contains the key methods for record and watermark processing, such as setKeyContextElement1/2 and processWatermark1/2.

This part shows how data flows to an operator after a Task starts. The MailboxProcessor loop fetches data and finally passes it to the OperatorChain, which hands the data to the first operator, i.e. the mainOperator. This is where the setKeyContextElement and processWatermark1 methods introduced above start to be called. Next, let's see how these methods are invoked inside the OperatorChain.

From the flow chart above we find that before a record is processed, it must go through setKeyContextElement1/2, and when the record is passed to the next OperatorChain, pushToRecordWriter must be called; watermark processing is similar. Therefore, by hooking these key methods, the inputs and outputs of operators can be captured.

Key methods alone are not enough; we also need to solve data parsing. After SQL is converted into a StreamGraph, each StreamNode in the StreamGraph records the input and output type information of its operator. For Flink SQL, the data passed between operators is serialized RowData. With the type information, can the data transmitted between operators be parsed correctly? The answer is yes; I'll leave this question here and come back to it when introducing the overall architecture.

Through the above analysis, we became more determined to use bytecode enhancement to separate the data-parsing process from the Flink engine, so that subsequent Flink version upgrades and maintenance are easier. Next, we introduce the overall architecture and implementation details.

Next, let's look at the overall architecture, which consists of five parts:

  • At the platform entrance, the user enables the Flink SQL debug function for the job, selects the operator IDs whose data should be output, and submits the job.
  • When the TM starts, the monitoring program hooks the key Flink SQL methods and parses the operators' input and output data. The small gear in the figure represents the javaagent program that parses the data.
  • The parsed data is sent synchronously to Kafka.
  • The data in Kafka is synchronized into an OLAP engine through tooling.
  • Finally, the data in the OLAP engine is queried and analyzed to troubleshoot and locate the problem.

The first part is the platform entrance. As shown on the left of the figure above, you need to turn on the operator-granularity detail switch, choose the operators whose input and output data should be printed, and then submit the job.

After the job is submitted to YARN, how is operator data captured? We use the Byte Buddy bytecode enhancement framework to parse Flink operator data, hooking the key methods mentioned above so that we stay decoupled from the Flink engine. The figure on the right shows the output of the data-parsing program; Value and input_order are introduced in detail below.
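
A minimal sketch of such a javaagent is shown below, assuming Byte Buddy's Advice API; the matched class and method names come from the analysis above, while the advice body that ships data to Kafka is simplified to a print statement:

```java
import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.asm.Advice;
import net.bytebuddy.description.type.TypeDescription;
import net.bytebuddy.matcher.ElementMatcher;
import net.bytebuddy.matcher.ElementMatchers;

import java.lang.instrument.Instrumentation;

// Packaged as a javaagent (Premain-Class in the manifest) and attached to each TM.
public class FlinkSqlDebugAgent {

    private static final String BASE_OPERATOR =
            "org.apache.flink.streaming.api.operators.AbstractStreamOperator";

    public static void premain(String agentArgs, Instrumentation inst) {
        // Match AbstractStreamOperator itself and every operator derived from it.
        ElementMatcher.Junction<TypeDescription> operatorTypes =
                ElementMatchers.<TypeDescription>named(BASE_OPERATOR)
                        .or(ElementMatchers.hasSuperType(ElementMatchers.named(BASE_OPERATOR)));

        new AgentBuilder.Default()
                .type(operatorTypes)
                .transform(new AgentBuilder.Transformer.ForAdvice()
                        .include(FlinkSqlDebugAgent.class.getClassLoader())
                        .advice(ElementMatchers.named("setKeyContextElement1")
                                        .or(ElementMatchers.named("setKeyContextElement2")),
                                RecordAdvice.class.getName()))
                .installOn(inst);
    }

    /** Runs before the intercepted method; argument 0 is the StreamRecord being processed. */
    public static class RecordAdvice {
        @Advice.OnMethodEnter
        public static void onEnter(@Advice.Argument(0) Object record) {
            // In the real tool the RowData inside the record is parsed (see below)
            // and shipped to Kafka over a socket; a print is used here as a stand-in.
            System.out.println("operator input: " + record);
        }
    }
}
```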

In the earlier introduction we left a question: the input and output type information is stored in the StreamNode, so how is the data parsed? For Flink SQL, serialized RowData is transmitted between operators, and each field can be parsed by a getter created from the field type and field index.
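
For example, with Flink's public RowData API, a field getter can be created from the field's LogicalType and index; the types below are hard-coded for the demo, whereas the real program reads them from the StreamNode:

```java
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.flink.table.types.logical.BigIntType;
import org.apache.flink.table.types.logical.LogicalType;
import org.apache.flink.table.types.logical.VarCharType;

public class RowDataParsing {
    public static void main(String[] args) {
        // In the real agent the LogicalTypes come from the StreamNode's
        // input/output type information; here they are hard-coded.
        LogicalType[] fieldTypes = {new BigIntType(), new VarCharType(VarCharType.MAX_LENGTH)};
        RowData row = GenericRowData.of(42L, StringData.fromString("meituan"));

        for (int pos = 0; pos < fieldTypes.length; pos++) {
            // createFieldGetter knows how to read the field from any RowData
            // implementation (GenericRowData, BinaryRowData, ...).
            RowData.FieldGetter getter = RowData.createFieldGetter(fieldTypes[pos], pos);
            System.out.println("field " + pos + " = " + getter.getFieldOrNull(row));
        }
    }
}
```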

At this point we only have parsed values without field information. Why do we need field information? Because the parsed data may contain multiple identical values; to retrieve results accurately, fields must be mapped to values. How do we get the field information and associate it with the data? During the conversion from SQL to Transformation, the ExecNode class has an important translateToPlan method; we enhance it to parse the operator's input and output fields and save them in StreamConfig, and the parsing program then associates the fields with the data.

There is one difficulty here. Ordinary Flink SQL is fine, but in data-lake scenarios the data passed between Flink SQL operators is not only serialized RowData but also Kryo-serialized types such as HoodieRecord, and parsing a HoodieRecord requires Hudi schema information, which is not available inside the parsing program. A simple and clever workaround is to call HoodieRecord's toString method and let Hudi parse it by itself, which solves data parsing in the data-lake scenario efficiently and flexibly.

After data parsing, the next topic is data associativity within a Task. We record the input sequence number of each record at sub_task granularity in the input_order field, which marks the associativity of data within a subtask.

This field is designed because operators chained together may have different fields. For example, suppose five operators are chained together: the first two have an ID field and the last three do not. If you query by ID, the last three operators return no data. To show the complete path of a record through the subtask, specifying the same subtask and the same input_order filters out the complete data chain. The case analysis later also uses input_order.

There are three cases for data associativity:

  • For synchronous operators, the next record can only be processed after the current record has been processed by all operators in the subtask. When a record enters the first operator of the subtask, input_order is incremented by 1, and the following operators reuse the first operator's input_order; their input_order is either the same or partially empty.
  • For operators with mini-batch enabled, the operator collects a batch before processing, and the records in that batch are associated with each other.
  • For LookupJoin, the synchronous case is the same as the first case; in the asynchronous case the operators share common fields, which can be used to associate them.

The above covers how the data is parsed; next is how the data is output to Kafka. To avoid redundant output, by default we only print the input data of all operators except Source, and the output data of the tail operator of every Task except Sink (so that when a downstream operator sees out-of-order data, it can be traced back to the upstream operator's output); users can also choose to output only some operators' data.

With the operators' input and output defined, the next question is how the data is collected into Kafka. The first option is to write the data to log files and collect the logs into Kafka. Testing showed that the write speed can reach 600-800 MB/s at peak, while TM logs are kept with rolling retention, so if the write speed is much higher than the collection speed there is a risk of data loss. Log collection therefore does not meet the requirement.

Therefore we adopted the second option: output operator data to Kafka synchronously through sockets. If data is produced faster than it can be collected, backpressure is applied to the upstream operator, limiting the rate at which it sends data. This ensures the business SQL keeps running smoothly while the operator output is not lost and complete data is collected into Kafka.

The following introduces three types of cases in which the Flink SQL troubleshooting tool helped businesses solve problems.

Case 1: a correctness problem caused by a bug in Flink SQL itself. As shown in the figure above, the user's SQL is simply a deduplication job.

Phenomenon: some records are missing; the IDs of the missing records are not fixed and show no pattern, but the user can provide the IDs of the missing records.

Conclusion: the investigation found that the cause is actually a Flink bug, a precision bug in the localtimestamp function. When using to_date(cast(localtimestamp as varchar), 'yyyy-MM-dd HH:mm:ss.SSS'), if the time falls exactly on a whole second (e.g., 2022-05-01 12:10:10.000), to_date fails to parse it, the filter condition is not met, and the record is dropped, while the Flink SQL job keeps running normally.

With the Flink SQL troubleshooting tool, we specified the ID and entered the query SQL below, and found that the Calc operator had input data for that ID but no output, so the value was being filtered out inside the Calc operator. Comparing the Calc codegen logic with the Flink code revealed a bug at whole-second timestamps that caused the data loss. Fixing the bug is simple: implement the whole-second precision handling logic correctly in Flink.

In the query above, Calc happens to have an ID field. When the ID field does not exist, input_order must be used to associate a data chain within the subtask. As shown in the SQL in the lower-left corner of the figure above, specifying the subtask ID, operator ID, and input_order queries the complete data chain of that record.

Case 2: a correctness problem (out-of-order data) caused by a design limitation of Flink SQL. The difference from a bug is that a bug means Flink SQL is meant to handle a class of problems but does so incorrectly, whereas a design limitation means Flink SQL cannot handle that class of problems at all.

Phenomenon: the results of the user's SQL job are out of order, leading to incorrect results.

Conclusion: the left and right streams of the Flink SQL Join have a one-to-many relationship, and the right stream falls into the NoUniqueKey case, which uses MapState; MapState cannot guarantee data order, so the results of such queries can be out of order. Besides this, if a Flink SQL job contains multiple keyBy operations with inconsistent key fields, out-of-order problems can also occur.

Case 3: correctness problems caused by improper use of Flink SQL. This type of case is also very common.

Phenomenon: records are missing from the user's SQL job, leading to incorrect results.

Conclusion: with the tool we found that the user had set the state TTL to 2 hours, but data actually arrived more than 2 hours apart; the state expired, the data could not be joined, and records were lost, producing incorrect results. Besides wrong state TTL settings, there are also usage problems in business logic and SQL expressions.

After adopting this tool, we can sometimes prove that the Flink SQL job itself is correct and the comparison job is at fault. With this tool, the troubleshooting time has been reduced from days to hours or even minutes, which has won users' recognition and trust and safeguards the SQL-ization of Flink jobs.

5. Future Outlook

The future outlook is mainly divided into the following three parts:

  • Flink SQL fine-grained configuration

    • Fine-grained resource management currently only supports configuration through the API, so we also need to support it in SQL scenarios through the flexible configuration capability of Flink SQL.
    • Combine the flexible configuration of Flink SQL with Flink's autopilot mechanism, so that SQL jobs can be adjusted to an ideal state automatically.
  • Flink SQL State

    • We hope Flink SQL state becomes queryable.
    • Explore lazy migration so that SQL changes can recover from state.
  • Flink SQL troubleshooting tool

    • Based on accumulated experience, support risk warnings for Flink SQL before a job goes online.
    • Address the known out-of-order and performance issues that have been found.

