如何提高MySql数据库子查询的效率

做django的项目,用python语言写查询数据库的语句,使用OuterRef, Subquery,Exists,F创建复杂的查询语句:

数据库模型:

from django.db import models


# 经纬度唯一表
class LngLat(models.Model):
    lng = models.DecimalField(max_digits=10, decimal_places=7, default=0.0)
    lat = models.DecimalField(max_digits=9, decimal_places=7, default=0.0)

    class Meta:
        unique_together = ("lng", "lat")

    def __str__(self):
        return str(self.lng) + "," + str(self.lat)


# 浙江新建站小区对象
class ZJenb(models.Model):
    enbname = models.CharField(max_length=100, default=None)
    enbid = models.IntegerField()


# 浙江新建站小区对象
class ZJcell(models.Model):
    # ECI (28 Bits) = eNB ID(20 Bits) + Cell ID(8 Bits)(逻辑小区id,范围0~255)
    eci = models.IntegerField(primary_key=True, default=None)
    cgi = models.CharField(max_length=100, default=None, unique=True)
    cellid = models.IntegerField(default=None)
    cellname = models.CharField(max_length=100, default=None)
    # 频段,如F1,F2,D1,D2,D3
    freq = models.CharField(max_length=100, default=None)
    pci = models.IntegerField()
    lnglat = models.ForeignKey(LngLat, on_delete=models.CASCADE, default=None)
    # 天线方位角
    azimuth = models.IntegerField()
    # 模3值
    m3 = models.IntegerField(null=True)
    # 邻区关系,相当于单向邻区,虽然可以反向查找,但不是双向邻区。
    # symmetrical=True表示对称,加了正向,反向自动添加
    adj = models.ManyToManyField("self", symmetrical=False, through='ZJadjacent', through_fields=('cellfrom', 'cellto'))
    enb = models.ForeignKey(ZJenb, on_delete=models.CASCADE, default=None, null=True)

    def __str__(self):
        return self.cellname


# 邻区多对多关系表
class ZJadjacent(models.Model):
    cellfrom = models.ForeignKey(ZJcell, on_delete=models.CASCADE, default=None, related_name='adj_from_cell')
    # 因为有两个ZJcell外键,必须对这两个外键命名为不同的名字,否则出现报错fields.E304
    cellto = models.ForeignKey(ZJcell, on_delete=models.CASCADE, default=None, related_name='adj_to_cell')
    # 两个邻区间的距离
    distance = models.IntegerField(default=None)
    # 站在源小区,看目标小区的方位角
    azi1 = models.IntegerField(default=None)
    # 站在目标小区,看源小区的方位角
    azi2 = models.IntegerField(default=None)
    # 是否存在反向邻区default=None, null=True
    # Like BooleanField with null=True.
    # Use that instead of this field as it’s likely to be deprecated in a future version of Django.
    has_inverse_adj = models.NullBooleanField()
    # 唯一标识,源小区的cgi&目的小区的cgi,如 460-00-325632-129&460-00-325632-130,避免数据库中出现重复数据
    # , null=False 这个参数加不加对数据库没有更改,因为False是默认值
    # (venv) C:\\Users\Administrator\PycharmProjects\cellsmap>py manage.py makemigrations
    # No changes detected
    source_target_cgi = models.BigIntegerField(primary_key=True, default=None)
    # models.CharField(max_length=100, default=None, unique=True)

    def __str__(self):
        return str(self.source_target_cgi)

 解释一下:

一共4张表,小区表ZJcell是主表,有两个外键(经纬度表LngLat,基站表ZJenb),还有一个多对多关系(邻区ZJadjacent),其中ZJcell有31965条数据,ZJadjacent有838118条数据。

需要提取ZJadjacent邻区关系中,只有单向邻区的数据或有双向邻区的数据,将其数据中预设为空的has_inverse_adj项分别设置为False或True。

根据django官网子查询的例子,写了一个子查询逻辑,用以判断每条邻区数据是否反向邻区:

from django.db.models import OuterRef, Subquery
from dbbackend.models import ZJcell, LngLat, ZJenb, ZJadjacent
from django.db.models import F
from django.db.models import Exists
inverseadj = ZJadjacent.objects.annotate(iadj=2**32*F('cellto_id') + F('cellfrom_id')).filter(iadj=OuterRef('source_target_cgi'))

symmetricaladj = ZJadjacent.objects.annotate(equaladj=Subquery(inverseadj.values('source_target_cgi')))

解释一下:

ZJadjacent原表中的source_target_cgi字段(列)表示(源小区eci乘以2的32次方+目标小区eci)的和,想要与此表中其他条目的(目标eci乘以2的32次方+源小区eci)的和进行比对,如果能找出相等的项,说明存在双向邻区,否则只存在单向邻区。

由于(目标eci乘以2的32次方+源小区eci)在原表中不存在,需要在比对前添加这一列,于是使用语句:

annotate(iadj=2**32*F('cellto_id') + F('cellfrom_id'))

等效sql语言:

SELECT 
    `dbbackend_zjadjacent`.`cellfrom_id`, 
    `dbbackend_zjadjacent`.`cellto_id`, 
    `dbbackend_zjadjacent`.`distance`, 
    `dbbackend_zjadjacent`.`azi1`, 
    `dbbackend_zjadjacent`.`azi2`, 
    `dbbackend_zjadjacent`.`has_inverse_adj`, 
    `dbbackend_zjadjacent`.`source_target_cgi`, 
    (SELECT U0.`cellfrom_id`, 
            U0.`cellto_id`, 
            U0.`distance`, 
            U0.`azi1`, U0.`azi2`, 
            U0.`has_inverse_adj`, 
            U0.`source_target_cgi`, 
            ((4294967296 * U0.`cellto_id`) + U0.`cellfrom_id`) AS `iadj` 
        FROM `dbbackend_zjadjacent` U0 
        WHERE 
            ((4294967296 * U0.`cellto_id`) + U0.`cellfrom_id`) = 
            (`dbbackend_zjadjacent`.`source_target_cgi`)
    ) AS `equaladj` 
FROM `dbbackend_zjadjacent`

实际查询时,发现效率很低,只是计数symmetricaladj.count()都很慢。

第一次改进,使用Exists()函数:

from django.db.models import OuterRef, Subquery
from dbbackend.models import ZJcell, LngLat, ZJenb, ZJadjacent
from django.db.models import F
from django.db.models import Exists
inverseadj = ZJadjacent.objects.annotate(iadj=2**32*F('cellto_id') + F('cellfrom_id')).filter(iadj=OuterRef('source_target_cgi'))


symmetricaladjsource = ZJadjacent.objects.annotate(equaladj=Exists(inverseadj))

等效sql:

SELECT 
    `dbbackend_zjadjacent`.`cellfrom_id`, 
    `dbbackend_zjadjacent`.`cellto_id`, 
    `dbbackend_zjadjacent`.`distance`, 
    `dbbackend_zjadjacent`.`azi1`, 
    `dbbackend_zjadjacent`.`azi2`, 
    `dbbackend_zjadjacent`.`has_inverse_adj`, 
    `dbbackend_zjadjacent`.`source_target_cgi`, 
    EXISTS(SELECT U0.`cellfrom_id`, 
            U0.`cellto_id`, 
            U0.`distance`, 
            U0.`azi1`, U0.`azi2`, 
            U0.`has_inverse_adj`, 
            U0.`source_target_cgi`, 
            ((4294967296 * U0.`cellto_id`) + U0.`cellfrom_id`) AS `iadj` 
        FROM `dbbackend_zjadjacent` U0 
        WHERE 
            ((4294967296 * U0.`cellto_id`) + U0.`cellfrom_id`) = 
            (`dbbackend_zjadjacent`.`source_target_cgi`)
    ) AS `equaladj` 
FROM `dbbackend_zjadjacent`

测试结果依然很慢,仔细分析sql语句,发现sql语句类似于循环,父查询时外部循环,子查询是内部循环,父查询每执行一次都会执行一遍子查询,整体相当于执行了 838118次子查询,所以子查询的语句需要尽量精简,否则严重影响效率。

第二次修改,将annotate(iadj=2**32*F('cellto_id') + F('cellfrom_id'))从子查询移出,放到父查询中:

inverseadj = ZJadjacent.objects.filter(source_target_cgi=OuterRef('iadj'))

symmetricaladjsource = ZJadjacent.objects.annotate(iadj=2**32*F('cellto_id') + F('cellfrom_id')).annotate(equaladj=Exists(inverseadj))

等效sql:

SELECT 
    `dbbackend_zjadjacent`.`cellfrom_id`, 
    `dbbackend_zjadjacent`.`cellto_id`, 
    `dbbackend_zjadjacent`.`distance`, 
    `dbbackend_zjadjacent`.`azi1`, 
    `dbbackend_zjadjacent`.`azi2`, 
    `dbbackend_zjadjacent`.`has_inverse_adj`, 
    `dbbackend_zjadjacent`.`source_target_cgi`, 
    ((4294967296 * `dbbackend_zjadjacent`.`cellto_id`) + 
    `dbbackend_zjadjacent`.`cellfrom_id`) AS `iadj`, 
    EXISTS(SELECT U0.`cellfrom_id`, 
            U0.`cellto_id`, 
            U0.`distance`, 
            U0.`azi1`, 
            U0.`azi2`, 
            U0.`has_inverse_adj`, 
            U0.`source_target_cgi` 
        FROM `dbbackend_zjadjacent` U0 
        WHERE 
            U0.`source_target_cgi` = 
            (((4294967296 * `dbbackend_zjadjacent`.`cellto_id`) + 
            `dbbackend_zjadjacent`.`cellfrom_id`))
    ) AS `equaladj` 
FROM `dbbackend_zjadjacent`

再次后,效率明显提升,执行count()函数只需等待n秒左右。

symmetricaladjsource
<QuerySet [<ZJadjacent: 358036724510097538>, <ZJadjacent: 358036724510097539>, <ZJadjacent: 358036724510099847>, <ZJadjacent: 358036724510129025>, <ZJadjacent: 358036724510131843>, <ZJadjacent: 358036724510413443>, <ZJadjacent: 358036724510439042>, <ZJadjacent: 358036724510448771>, <ZJadjacent: 358036724510570881>, <ZJadjacent: 358036724532486657>, <ZJadjacent: 358036724532486659>, <ZJadjacent: 358036724532487426>, <ZJadjacent: 358036724532487427>, <ZJadjacent: 358036724532488449>, <ZJadjacent: 358036724532488450>, <ZJadjacent: 358036724532488451>, <ZJadjacent: 358036724532489217>, <ZJadjacent: 358036724532489218>, <ZJadjacent: 358036724532494337>, <ZJadjacent: 358036724532494342>, '...(remaining elements truncated)...']>
symmetricaladjsource[:100]
<QuerySet [<ZJadjacent: 358036724510097538>, <ZJadjacent: 358036724510097539>, <ZJadjacent: 358036724510099847>, <ZJadjacent: 358036724510129025>, <ZJadjacent: 358036724510131843>, <ZJadjacent: 358036724510413443>, <ZJadjacent: 358036724510439042>, <ZJadjacent: 358036724510448771>, <ZJadjacent: 358036724510570881>, <ZJadjacent: 358036724532486657>, <ZJadjacent: 358036724532486659>, <ZJadjacent: 358036724532487426>, <ZJadjacent: 358036724532487427>, <ZJadjacent: 358036724532488449>, <ZJadjacent: 358036724532488450>, <ZJadjacent: 358036724532488451>, <ZJadjacent: 358036724532489217>, <ZJadjacent: 358036724532489218>, <ZJadjacent: 358036724532494337>, <ZJadjacent: 358036724532494342>, '...(remaining elements truncated)...']>
symmetricaladjsource[:1000]
<QuerySet [<ZJadjacent: 358036724510097538>, <ZJadjacent: 358036724510097539>, <ZJadjacent: 358036724510099847>, <ZJadjacent: 358036724510129025>, <ZJadjacent: 358036724510131843>, <ZJadjacent: 358036724510413443>, <ZJadjacent: 358036724510439042>, <ZJadjacent: 358036724510448771>, <ZJadjacent: 358036724510570881>, <ZJadjacent: 358036724532486657>, <ZJadjacent: 358036724532486659>, <ZJadjacent: 358036724532487426>, <ZJadjacent: 358036724532487427>, <ZJadjacent: 358036724532488449>, <ZJadjacent: 358036724532488450>, <ZJadjacent: 358036724532488451>, <ZJadjacent: 358036724532489217>, <ZJadjacent: 358036724532489218>, <ZJadjacent: 358036724532494337>, <ZJadjacent: 358036724532494342>, '...(remaining elements truncated)...']>
symmetricaladjsource.count()
838118
symmetricaladjsource.filter(equaladj=True)
<QuerySet [<ZJadjacent: 358036724510097538>, <ZJadjacent: 358036724510097539>, <ZJadjacent: 358036724510099847>, <ZJadjacent: 358036724510131843>, <ZJadjacent: 358036724510413443>, <ZJadjacent: 358036724510448771>, <ZJadjacent: 358036724510570881>, <ZJadjacent: 358036724532486657>, <ZJadjacent: 358036724532486659>, <ZJadjacent: 358036724532487426>, <ZJadjacent: 358036724532487427>, <ZJadjacent: 358036724532488449>, <ZJadjacent: 358036724532488450>, <ZJadjacent: 358036724532488451>, <ZJadjacent: 358036724532489217>, <ZJadjacent: 358036724532489218>, <ZJadjacent: 358036724532494337>, <ZJadjacent: 358036724532494342>, <ZJadjacent: 358036724532498690>, <ZJadjacent: 358036724532498691>, '...(remaining elements truncated)...']>
symmetricaladjsource.filter(equaladj=True).count()
711252
symmetricaladjsource.filter(equaladj=False).count()
126866

相关django官方手册链接:

Subquery() expressions

https://docs.djangoproject.com/en/2.1/ref/models/expressions/#subquery-expressions

Exists() subqueries

https://docs.djangoproject.com/en/2.1/ref/models/expressions/#exists-subqueries 

等效sql在pycharm中的位置:

猜你喜欢

转载自blog.csdn.net/qq_27361945/article/details/83503506