Zhejiang University research that automatically finds 100+ bugs in four major databases in one day wins a SIGMOD 2023 best paper award

Heart of the Machine Column

Heart of the Machine Editorial Department

In this paper, researchers from Zhejiang University propose a method called Transformed Query Synthesis (TQS). In a 24-hour run, TQS found 115 bugs: 31 in MySQL, 30 in MariaDB, 31 in TiDB, and 23 in PolarDB.

The 2023 ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD 2023) will be held in Seattle, USA, on June 18-23, local time. The conference recently announced its best papers: Microsoft Research's "Predicate Pushdown for Data Science Pipelines" and Zhejiang University's "Detecting Logic Bugs of Join Optimizations in DBMS". This is the first time since the conference began in 1975 that a research team from mainland China has won its best paper award. The Zhejiang University work proposes a novel method that automatically finds logic bugs in database management systems such as MySQL, MariaDB, TiDB, and PolarDB.


Over the past few decades, modern database management systems (DBMSs) have evolved to support many new architectures, such as cloud platforms and HTAP, which require increasingly sophisticated optimizations for query evaluation. The query optimizer is considered one of the most complex and important components of a DBMS. It parses the input SQL query and then generates an efficient execution plan with the help of a built-in cost model. Errors in the query optimizer implementation can lead to bugs, including crash bugs and logic bugs. Crash bugs are easy to detect because a crash halts the system immediately. Logic bugs, however, are easy to overlook because they make the DBMS return wrong result sets without any visible failure. This paper focuses on detecting these silent bugs.

An emerging approach to detecting logic bugs in DBMSs is Pivoted Query Synthesis (PQS). Its core idea is to randomly select a pivot row from a table and then generate queries whose result must contain that row. If any synthesized query fails to return the row, a logic bug has been detected. PQS mainly supports selection queries on a single table, and 90% of the bugs it has reported involve only single-table queries. A large research gap remains for multi-table queries, which use different join algorithms and join structures and are more error-prone than single-table queries.
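
The pivot-row idea can be sketched in a few lines. This is only a minimal illustration using Python's built-in sqlite3 module, not the PQS tool itself: the table `t0`, its columns, and the helper `pqs_check` are hypothetical, and the predicate is a plain NULL-safe equality on every column, whereas a real generator would synthesize far richer expressions.

```python
import random
import sqlite3

def pqs_check(conn, table, columns, rng):
    """Pick a random pivot row and check that a query built to match it returns it."""
    col_list = ", ".join(columns)
    rows = conn.execute(f"SELECT {col_list} FROM {table}").fetchall()
    pivot = rng.choice(rows)
    # Build a predicate that is true for the pivot row by construction.
    # `IS` is SQLite's NULL-safe comparison, so NULL columns also match.
    predicate = " AND ".join(f"{col} IS ?" for col in columns)
    query = f"SELECT {col_list} FROM {table} WHERE {predicate}"
    result = conn.execute(query, pivot).fetchall()
    return pivot in result  # False would signal a logic bug

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t0 (c0 INTEGER, c1 TEXT)")
conn.executemany("INSERT INTO t0 VALUES (?, ?)", [(1, "a"), (2, None), (3, "c")])
pivot_found = pqs_check(conn, "t0", ["c0", "c1"], random.Random(0))
```

On a correct engine `pivot_found` is always true. The check says nothing about joins, which is the gap described above.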

The figure below shows two logic bugs in join queries in MySQL. Both can be detected with the new tool proposed in the paper.

Figure 1: Examples of logic bugs in join optimization in a DBMS

Figure 1 (a) shows a logic bug in the hash join of MySQL 8.0.18. The first query returns the correct result set because it is executed with a block nested loop join. The second query, executed with an inner hash join, incorrectly returns an empty result set. The cause is that the underlying hash join algorithm wrongly treats 0 as not equal to −0.
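
One way such a mismatch can arise is when a hash join keys on the raw byte representation of a value while the nested loop join compares numerically. The sketch below illustrates that class of bug only. It is not MySQL's code, and both join functions are hypothetical.

```python
import struct

def nested_loop_join(left, right):
    """Reference join: compares values with numeric equality."""
    return [(l, r) for l in left for r in right if l == r]

def bytewise_hash_join(left, right):
    """Buggy hash join: keys on the raw IEEE 754 bytes instead of the numeric value."""
    buckets = {}
    for r in right:
        buckets.setdefault(struct.pack("<d", r), []).append(r)
    return [(l, r) for l in left for r in buckets.get(struct.pack("<d", l), [])]

left, right = [0.0], [-0.0]
expected = nested_loop_join(left, right)   # one matching pair, since 0.0 == -0.0
actual = bytewise_hash_join(left, right)   # empty: the sign bit makes the keys differ
```

Both joins agree on ordinary values, so a bug of this kind stays silent until a zero with the opposite sign shows up in the data.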

The logic bug in Figure 1 (b) stems from semi-join processing in MySQL 8.0.28. In the first query, the nested loop inner join converts the varchar value to bigint and produces the correct result set. When the second query is executed with a hash semi-join, however, the varchar value is converted to double, which loses precision and makes the equality comparison go wrong.
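
The precision loss is easy to reproduce outside a database. In the sketch below, the column values are hypothetical (the data of Figure 1 (b) is not reproduced here), and Python's `int` and `float` stand in for the bigint and double conversions.

```python
def join_as_bigint(varchar_values, bigint_values):
    """Compare after converting the string to an exact integer."""
    return [(s, n) for s in varchar_values for n in bigint_values if int(s) == n]

def join_as_double(varchar_values, bigint_values):
    """Compare after converting both sides to a 64-bit float."""
    return [(s, n) for s in varchar_values for n in bigint_values
            if float(s) == float(n)]

varchar_col = ["9007199254740993"]   # 2**53 + 1, not representable as a double
bigint_col = [9007199254740992]      # 2**53
exact = join_as_bigint(varchar_col, bigint_col)   # no match
lossy = join_as_double(varchar_col, bigint_col)   # spurious match after rounding
```

The two conversions disagree only for integers beyond 2**53, which is why randomly generated small values rarely expose this kind of bug.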

Applying query synthesis to logic bug detection is far harder for multi-table join queries than for single-table queries. There are two challenges:

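  • Result verification: To verify the correctness of query results, previous methods used differential testing. The idea is to process a query with different physical plans (that is, the different ways in which the database system can actually execute the query). If the plans return inconsistent result sets, a logic bug may have been detected. Differential testing has two drawbacks, however. First, some logic bugs affect multiple physical plans and make all of them produce the same wrong result. Second, when inconsistent result sets are observed, someone has to check manually which plan produced the correct result, which is costly. One possible solution is to build a ground-truth result for any test query, but existing tools do not support this;

  • Search space: For a given database schema, the number of join queries that can be generated grows exponentially with the number of tables and columns. Since enumerating all possible queries for verification is impossible, an effective query space exploration mechanism is needed to detect logic bugs as efficiently as possible.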
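
The first drawback of differential testing can be made concrete with a small sketch. The two checker functions and the sample result sets below are hypothetical. They only contrast comparing plans with each other against comparing each plan with a known-correct answer.

```python
from collections import Counter

def differential_check(result_sets):
    """Flag a possible bug when the plans disagree with each other."""
    bags = [Counter(rs) for rs in result_sets]
    return any(bag != bags[0] for bag in bags[1:])

def ground_truth_check(result_sets, ground_truth):
    """Return the indexes of plans whose result differs from the known-correct answer."""
    truth = Counter(ground_truth)
    return [i for i, rs in enumerate(result_sets) if Counter(rs) != truth]

truth = [(1, "a"), (2, "b")]
# Hypothetical case: both plans share the same bug and drop the same row.
plans = [[(1, "a")], [(1, "a")]]
missed_by_differential = not differential_check(plans)
wrong_plans = ground_truth_check(plans, truth)
```

Here the differential check reports nothing because the plans agree, while the ground-truth check flags both plans and needs no manual inspection to decide which one is right.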

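To address these challenges, the Zhejiang University researchers proposed Transformed Query Synthesis (TQS), a new, general, and cost-effective tool for detecting logic bugs in join optimization in DBMSs.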

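For the first challenge, the researchers proposed DSG (Data-guided Schema and query Generation). Given a dataset represented as a wide table, DSG splits it into multiple tables based on the detected normal forms. To find bugs faster, DSG also injects some artificial noise data into the generated database. The database schema is first converted into a graph whose nodes are tables and columns and whose edges are the relationships between nodes. DSG uses random walks on the schema graph to select tables for a query and then uses those tables to generate joins. For a given join query over multiple tables, the ground-truth result can easily be found from the wide table. In this way, DSG can efficiently generate (query, result) pairs for database verification.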
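
The wide-table idea can be sketched with Python's built-in sqlite3 module. The table and column names and the functional dependency `cust_id -> cust_name` are assumptions made for this example. Noise injection is left out, and the mechanism the paper uses to recover ground truth is not reproduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wide (order_id INT, cust_id INT, cust_name TEXT)")
conn.executemany("INSERT INTO wide VALUES (?, ?, ?)",
                 [(1, 10, "Ann"), (2, 10, "Ann"), (3, 20, "Bob")])

# Split the wide table along the assumed dependency cust_id -> cust_name.
conn.execute("CREATE TABLE orders AS SELECT order_id, cust_id FROM wide")
conn.execute("CREATE TABLE customers AS SELECT DISTINCT cust_id, cust_name FROM wide")

# The join query under test, executed by the DBMS.
join_result = conn.execute(
    "SELECT o.order_id, c.cust_name FROM orders o "
    "JOIN customers c ON o.cust_id = c.cust_id ORDER BY o.order_id").fetchall()

# Ground truth: the same columns read directly from the wide table, with no join.
ground_truth = conn.execute(
    "SELECT order_id, cust_name FROM wide ORDER BY order_id").fetchall()
```

Because the split tables were derived from the wide table, any difference between `join_result` and `ground_truth` would point at the join execution rather than at the data.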

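For the second challenge, the researchers designed KQE (Knowledge-guided Query space Exploration). It first extends the schema graph into a plan-iterative graph, which represents the entire query generation space. Each join query is then represented as a subgraph. To score generated query graphs, KQE uses an embedding-based graph index that can search the already explored space for structurally similar query graphs. The coverage score guides the random-walk query generator to explore as much of the unknown query space as possible.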

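To demonstrate the generality and effectiveness of the method, the researchers evaluated TQS on four widely used DBMSs: MySQL, MariaDB, TiDB, and PolarDB. After running for 24 hours, TQS found 115 bugs: 31 in MySQL, 30 in MariaDB, 31 in TiDB, and 23 in PolarDB. Root-cause analysis grouped these bugs into types: 7 in MySQL, 5 in MariaDB, 5 in TiDB, and 3 in PolarDB. The researchers have submitted the bugs to the respective communities and received positive feedback.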

The following sections describe the problem and the solution proposed by Zhejiang University in more formal terms.

Problem definition

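Database bugs come in two kinds: crash bugs and logic bugs. Crash bugs arise from the execution of the operating system and the DBMS. They force the DBMS to terminate, for reasons such as running out of memory or other resources, or accessing an invalid memory address. Crash bugs are therefore easy to discover. Logic bugs are harder to find, because the database keeps running normally and returns seemingly correct results after processing a query. In most cases the results really are correct, but in a few cases the wrong result set is returned. These silent bugs are like hidden bombs and are more dangerous, because they are hard to detect and can affect the correctness of applications.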

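The paper targets logic bugs that the query optimizer introduces in multi-table join queries. The researchers call them join optimization bugs. Using the notation listed in Table 1, the join optimization bug detection problem can be formally defined as follows:

Definition: For each query in the query workload, let the query optimizer execute the query's join with multiple physical plans, and verify each resulting result set against the query's ground truth. If a result set differs from the ground truth, a join optimization bug has been found.

Table 1: Notation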

Solution overview

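Figure 2 gives an overview of the TQS architecture. Given a benchmark dataset and a target DBMS, TQS searches for possible logic bugs in the DBMS by generating queries based on the dataset. TQS has two key components: Data-guided Schema and query Generation (DSG) and Knowledge-guided Query space Exploration (KQE).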

Figure 2: Overview of TQS

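DSG treats the input dataset as a wide table and, in addition to the original tuples, deliberately synthesizes tuples with error-prone values (such as null values or very long strings). For join queries, DSG creates a new schema for the wide table by splitting it into multiple tables and ensuring that these tables satisfy normal forms based on functional dependencies. DSG models the database schema as a graph and generates logical (conceptual) queries through random walks on the schema graph. DSG then materializes each logical query into physical plans and transforms the query with different hints, so that the DBMS executes multiple different physical plans in the search for bugs. For a join query, the ground-truth result is obtained by mapping the join graph back to the wide table.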
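
A random walk over a schema graph can be sketched as follows. The graph, the table names, and the function `random_walk_join` are hypothetical. For brevity the nodes are tables only and each edge carries its join column, while the schema graph described above also has column nodes.

```python
import random

# Hypothetical schema graph: tables are nodes, edges carry the shared join column.
SCHEMA_GRAPH = {
    "orders": {"customers": "cust_id", "items": "order_id"},
    "customers": {"orders": "cust_id"},
    "items": {"orders": "order_id", "products": "prod_id"},
    "products": {"items": "prod_id"},
}

def random_walk_join(graph, start, steps, rng):
    """Walk the schema graph and emit an equi-join over the visited tables."""
    visited, joins, current = [start], [], start
    for _ in range(steps):
        candidates = [t for t in graph[current] if t not in visited]
        if not candidates:
            break  # dead end: every neighbour is already part of the query
        nxt = rng.choice(candidates)
        col = graph[current][nxt]
        joins.append(f"JOIN {nxt} ON {current}.{col} = {nxt}.{col}")
        visited.append(nxt)
        current = nxt
    return visited, f"SELECT * FROM {start} " + " ".join(joins)

tables, sql = random_walk_join(SCHEMA_GRAPH, "orders", 2, random.Random(1))
```

Each walk yields a chain of tables and the matching join text, and the set of visited edges is exactly the subgraph that represents the query.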

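After schema setup and data splitting are complete, KQE extends the schema graph into a plan-iterative graph. Each query is represented as a subgraph. KQE builds an embedding-based graph index over the embeddings of the query graphs in the history (that is, in the already explored query space). Intuitively, KQE tries to keep each newly generated query graph as far as possible from its nearest neighbors in the history, so that new query graphs are explored rather than existing ones repeated. To this end, KQE scores each generated query graph by its structural similarity to the query graphs in the history and uses an adaptive random walk to generate queries.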
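
The nearest-neighbor intuition can be sketched with a deliberately simple stand-in. Here a query graph is a set of schema edges, the "embedding" is a 0/1 vector over a fixed edge vocabulary, and the index is a brute-force scan. All of these are assumptions for illustration and replace whatever embedding and index KQE actually uses.

```python
import math

def embed(query_edges, edge_vocab):
    """Toy embedding: a 0/1 vector marking which schema edges the query graph uses."""
    return [1.0 if e in query_edges else 0.0 for e in edge_vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def novelty(candidate, history, edge_vocab):
    """1 minus the similarity to the nearest neighbour among explored query graphs."""
    if not history:
        return 1.0
    cand = embed(candidate, edge_vocab)
    return 1.0 - max(cosine(cand, embed(h, edge_vocab)) for h in history)

VOCAB = [("orders", "customers"), ("orders", "items"), ("items", "products")]
history = [{("orders", "customers")}]
seen_score = novelty({("orders", "customers")}, history, VOCAB)   # repeats history
new_score = novelty({("items", "products")}, history, VOCAB)      # unexplored
```

A query graph that repeats the history scores near 0 and one that shares no edges with it scores near 1, so a generator that prefers high scores drifts toward unexplored structures.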

Algorithm 1 summarizes the core idea of TQS. Lines 2, 10, and 12 belong to DSG, and lines 4, 8, and 9 belong to KQE.


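Given a dataset and a wide table sampled from it, DSG splits the single wide table into multiple tables that form a database schema in 3NF (line 2). The schema can be viewed as a graph whose vertices are tables and columns and whose edges represent the relationships between vertices. DSG uses random walks on the schema graph to generate the join expressions of queries (line 10). A join query can in fact be projected as a subgraph of the schema graph. By mapping the subgraph back to the wide table, DSG can easily retrieve the ground-truth result of the query (line 12).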

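KQE extends the schema graph into a plan-iterative graph (line 4). To avoid testing similar paths, KQE builds an embedding-based graph index over the embeddings of existing query graphs (line 9). KQE updates the edge weights π of the plan-iterative graph G according to the structural similarity between the current query graph and the existing query graphs (line 8). KQE scores the next possible path, and this score steers the random-walk generator toward unexplored query space.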
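
A weighted random walk of this kind can be sketched as below. The multiplicative `decay` update, the edge names, and both helper functions are assumptions for illustration. The paper's actual rule for updating π is not reproduced here.

```python
import random

def update_weights(weights, used_edges, decay=0.5):
    """Lower the weight of edges that the latest query already covered."""
    return {e: (w * decay if e in used_edges else w) for e, w in weights.items()}

def pick_edge(weights, rng):
    """Sample the next edge with probability proportional to its weight."""
    edges = list(weights)
    return rng.choices(edges, weights=[weights[e] for e in edges], k=1)[0]

weights = {"orders-customers": 1.0, "orders-items": 1.0}
for _ in range(5):  # pretend five past queries all used the same edge
    weights = update_weights(weights, {"orders-customers"})

rng = random.Random(0)
picks = [pick_edge(weights, rng) for _ in range(1000)]
share_unexplored = picks.count("orders-items") / len(picks)
```

After five decays the covered edge keeps only 1/32 of its weight, so the walk picks the unexplored edge in the large majority of draws while still revisiting the old one occasionally.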

For each query, TQS transforms the query through hint sets so that the DBMS executes several different physical plans (line 11). Finally, each result set of the query is compared with the query's ground truth (line 14). If they do not match, a join optimization bug has been detected (line 15).
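
The hint-and-compare step can be sketched as follows. The hint strings follow MySQL's optimizer hint comment syntax but are chosen only for illustration, since the hint sets TQS actually uses are not listed in this article. The function `fake_execute` is a stand-in for a real DBMS connection and simulates one faulty plan.

```python
from collections import Counter

# Illustrative MySQL-style optimizer hints (an assumption, not the paper's hint sets).
HINT_SETS = ["", "/*+ BNL(t1, t2) */", "/*+ NO_BNL(t1, t2) */", "/*+ JOIN_ORDER(t2, t1) */"]

def apply_hint(query, hint):
    """Insert an optimizer hint right after the leading SELECT keyword."""
    head, rest = query.split(" ", 1)
    return f"{head} {hint} {rest}" if hint else query

def find_join_bugs(query, ground_truth, execute):
    """Run every hinted variant and report the hints whose result differs from the truth."""
    truth = Counter(ground_truth)
    return [h for h in HINT_SETS if Counter(execute(apply_hint(query, h))) != truth]

def fake_execute(sql):
    # Stand-in for a DBMS: simulates one plan that silently drops a row.
    return [(1,)] if "NO_BNL" in sql else [(1,), (2,)]

query = "SELECT t1.a FROM t1 JOIN t2 ON t1.a = t2.a"
bad_hints = find_join_bugs(query, [(1,), (2,)], fake_execute)
```

Because every variant is checked against the same ground truth, the report names the faulty plan directly instead of only saying that two plans disagree.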

For a more detailed description of DSG and KQE please read the original paper.

Experimental results

TQS found logic bugs in MySQL, MariaDB, TiDB, and PolarDB. The bugs fall into 20 types: 7 in MySQL, 5 in MariaDB, 5 in TiDB, and 3 in PolarDB, as shown in the table below.

Table: Types of logic bugs found by TQS in each DBMS

Compared with other methods, TQS also performs well overall, with significantly better results on many metrics. Ablation experiments further confirm the effectiveness of each component.


The researchers note that TQS currently focuses on equi-join queries. The ideas behind DSG and KQE can nevertheless be extended to non-equi-joins. The main challenge is how to generate and manage the ground-truth results, whose size would grow exponentially for non-equi-joins. This remains a topic for future work.
