Cyclically Deleting Historical Data in MySQL

Business Scenario

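At work you may need to compute daily statistics over data that is collected in real time. Once the statistics have been computed, the raw data is no longer very important. If this historical data is kept indefinitely, the table grows rapidly and the statistics queries slow down, so the usual practice is to purge data periodically and keep only the most recent period.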

Solution

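Use a MySQL partitioned table to keep only the most recent week of data and cyclically delete anything older than a week. The table has one partition per day of the week, and each day's partition is emptied before that day's new data is written. The details are as follows:
1) Table design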

DROP TABLE IF EXISTS `test_region`;
CREATE TABLE `test_region` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `create_date` date DEFAULT NULL,
  KEY `id` (`id`)
) 
PARTITION BY LIST (dayofweek(create_date))
(
 PARTITION P1 VALUES IN (1),
 PARTITION P2 VALUES IN (2),
 PARTITION P3 VALUES IN (3),
 PARTITION P4 VALUES IN (4),
 PARTITION P5 VALUES IN (5),
 PARTITION P6 VALUES IN (6),
 PARTITION P7 VALUES IN (7)
 );
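To make the mapping from a date to its partition concrete, here is a minimal Python sketch. The helper name `partition_for` is hypothetical and not part of the original post; it only mirrors MySQL's DAYOFWEEK(), which returns 1 for Sunday through 7 for Saturday, so that the result matches the partition names P1..P7 defined above.

```python
import datetime


def partition_for(day):
    """Return the partition name (P1..P7) that test_region uses for `day`.

    Hypothetical helper that mirrors MySQL DAYOFWEEK():
    1 = Sunday, 2 = Monday, ..., 7 = Saturday.
    """
    # isoweekday() gives Monday = 1 ... Sunday = 7.
    # Taking it modulo 7 and adding 1 shifts Sunday to 1 and Saturday to 7.
    return "P{0}".format(day.isoweekday() % 7 + 1)


if __name__ == "__main__":
    for offset in range(7):
        d = datetime.date(2019, 6, 3) + datetime.timedelta(days=offset)
        print(d.isoformat(), partition_for(d))
```

Because every date falls into exactly one of seven partitions, two dates that are exactly one week apart always share a partition. That is what allows a single partition truncate to remove the data from one week earlier.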

2) Implementation

#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import datetime

from pymysql import connect

# Connection parameters are passed by keyword
conn = connect(host="10.10.2.110", user="root", password="root", database="test")
cursor = conn.cursor()

currdate = datetime.datetime.now()  # current date
for i in range(100):  # run 100 times, one iteration per simulated day
    day = currdate.strftime("%Y-%m-%d")
    print(day)
    # %w is 0 (Sunday) to 6 (Saturday); adding 1 matches MySQL DAYOFWEEK() and the partitions P1..P7
    pos = int(currdate.strftime("%w")) + 1
    # Empty the partition for this day of the week, which removes the rows written one week earlier
    cursor.execute("alter table test_region truncate partition P{0}".format(pos))
    # Insert the row for the current day
    cursor.execute("insert into test_region(create_date) values (%s)", (day,))
    conn.commit()
    currdate = currdate + datetime.timedelta(days=1)  # advance the date by one day
conn.close()

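3) Result analysis
Date of the run: 2019/03/02. The 100 iterations therefore cover 2019-03-02 through 2019-06-09, and only the last seven days remain in the table.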

select id, create_date from test_region order by id;
id create_date
94 2019-06-03
95 2019-06-04
96 2019-06-05
97 2019-06-06
98 2019-06-07
99 2019-06-08
100 2019-06-09
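The result above can be checked without a database. The following is a rough in-memory sketch of the truncate-then-insert loop, under stated assumptions: the function name `simulate` and the dictionary standing in for the partitions are illustrative only, and the id counter is assumed to keep increasing across partition truncates, as the ids 94 to 100 in the output above suggest.

```python
import datetime


def simulate(start, days):
    """Replay the truncate-then-insert loop in memory and return the surviving rows."""
    partitions = {n: [] for n in range(1, 8)}  # stand-ins for P1..P7
    next_id = 1
    day = start
    for _ in range(days):
        pos = day.isoweekday() % 7 + 1          # same value as MySQL DAYOFWEEK()
        partitions[pos] = []                    # models TRUNCATE PARTITION P{pos}
        partitions[pos].append((next_id, day))  # models the INSERT for this day
        next_id += 1                            # assumed: ids are not reset by the truncate
        day += datetime.timedelta(days=1)
    # Flatten all partitions and sort by id, like "order by id"
    return sorted(row for rows in partitions.values() for row in rows)


rows = simulate(datetime.date(2019, 3, 2), 100)
for row_id, day in rows:
    print(row_id, day.isoformat())
```

Each partition holds at most one day of data at any time, so the table never contains more than seven days, however long the loop runs.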

Query execution plan (only partition P5 is scanned, because 2019-06-06 is a Thursday):

explain partitions select id, create_date from test_region where create_date = '2019-06-06'
id select_type table partitions type rows Extra
1 SIMPLE test_region P5 ALL 2 Using where

Partition information:

alter table test_region analyze partition all; # refresh the partition statistics
select 
  partition_name part,  
  partition_expression expr,  
  partition_description descr,  
  table_rows  
from information_schema.partitions  where 
  table_schema = schema()  
  and table_name='test_region'; 
part expr descr table_rows
P1 dayofweek(create_date) 1 1
P2 dayofweek(create_date) 2 1
P3 dayofweek(create_date) 3 1
P4 dayofweek(create_date) 4 1
P5 dayofweek(create_date) 5 1
P6 dayofweek(create_date) 6 1
P7 dayofweek(create_date) 7 1


Reposted from blog.csdn.net/m0_37261091/article/details/88074635