Application scenarios
In day-to-day MySQL operations there are always a few thorny tasks, and archiving historical data is certainly one of them. For historical reasons, some business tables were designed as single, unpartitioned tables. After running for a while, the business becomes slower and slower; investigation shows that these single tables have accumulated too much data, making queries inefficient. At that point, historical data the business no longer uses should be archived to reduce the amount of data in the tables and restore query performance.
However, archiving this historical data smoothly is not an easy task. Note that it must be seamless: the business cannot be stopped, and online traffic cannot be affected.
That is the requirement behind historical data archiving. To meet it, the percona-toolkit suite provides a tool, pt-archiver, which solves exactly this problem.
pt-archiver function introduction
pt-archiver provides the following functions:
1. Export online data to an archive file according to filter conditions;
2. Purge expired online historical data according to filter conditions;
3. Purge expired data and archive it to a local archive table, or to a history table on a remote archive server.
pt-archiver usage restrictions
The pt-archiver tool has only one restriction: the table to be archived must have a primary key.
Introduction to common parameters of pt-archiver
--where 'id<1000'   Set the filter condition for the operation
--limit 10000       Fetch 10000 rows at a time from the source for pt-archiver to process
--txn-size 1000     Commit a transaction every 1000 rows
--progress 5000     Print a progress line every 5000 rows processed
--charset=UTF8      Specify the character set as UTF8
--no-delete         Do not delete the original data. Note: without this option, the rows in the source table are deleted once processing completes
--bulk-delete       Delete old data on the source in bulk
--bulk-insert       Insert data into the dest host in bulk (the dest's general log shows that it loads the data with LOAD DATA LOCAL INFILE)
--purge             Delete the matching records from the source database
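To make the interplay between --limit and --txn-size concrete, here is a minimal sketch of the chunked fetch/delete/commit loop, using Python's sqlite3 as a stand-in for MySQL. The constants LIMIT_ROWS and TXN_SIZE and the loop structure are illustrative only, not pt-archiver's actual implementation.

```python
import sqlite3

LIMIT_ROWS = 100   # rows fetched per SELECT, like --limit
TXN_SIZE = 50      # rows per transaction commit, like --txn-size

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sbtest1 (id INTEGER PRIMARY KEY, pad TEXT)")
conn.executemany("INSERT INTO sbtest1 (pad) VALUES (?)",
                 [("x",) for _ in range(250)])
conn.commit()

deleted = 0
commits = 0
since_commit = 0
while True:
    # Fetch the next chunk of matching rows, walking the primary key.
    rows = conn.execute(
        "SELECT id FROM sbtest1 WHERE id <= 250 ORDER BY id LIMIT ?",
        (LIMIT_ROWS,)).fetchall()
    if not rows:
        break
    for (row_id,) in rows:
        conn.execute("DELETE FROM sbtest1 WHERE id = ?", (row_id,))
        deleted += 1
        since_commit += 1
        if since_commit >= TXN_SIZE:
            conn.commit()          # commit every TXN_SIZE rows
            commits += 1
            since_commit = 0
if since_commit:
    conn.commit()                  # commit any leftover rows
    commits += 1

print(deleted)   # 250
print(commits)   # 5
```

Small transactions are the point: committing every TXN_SIZE rows keeps locks short-lived so the archive job does not block online traffic.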
pt-archiver usage scenario simulation
Simulating the online business table
mysql> show create table sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`k` int(10) unsigned NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8 MAX_ROWS=1000000
1 row in set (0.02 sec)
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 100000 |
+----------+
1 row in set (0.29 sec)
Simulating the history table
mysql> show create table arch_sbtest1\G
*************************** 1. row ***************************
Table: arch_sbtest1
Create Table: CREATE TABLE `arch_sbtest1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`k` int(10) unsigned NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.02 sec)
Export historical data to file
Export the historical data of sbtest1 to a file without deleting the source rows. Note: without --no-delete, the source rows are deleted by default.
[mysql@localhost backup]$ pt-archiver --source A=utf8,u=tony,p=tony,h=192.168.17.128,P=3308,D=sbtest,t=sbtest1 --file=/home/mysql/backup/%Y-%m-%d-%D.%t --where="id <10000" --no-delete --progress=100 --limit 100 --statistics
TIME ELAPSED COUNT
2020-09-01T05:28:10 0 0
2020-09-01T05:28:10 0 100
2020-09-01T05:28:10 0 200
2020-09-01T05:28:10 0 300
2020-09-01T05:28:10 0 400
2020-09-01T05:28:10 0 500
2020-09-01T05:28:10 0 600
.......
2020-09-01T05:28:13 2 9800
2020-09-01T05:28:13 2 9900
2020-09-01T05:28:13 2 9999
Started at 2020-09-01T05:28:10, ended at 2020-09-01T05:28:13
Source: A=utf8,D=sbtest,P=3308,h=192.168.17.128,p=...,t=sbtest1,u=tony
SELECT 9999
INSERT 0
DELETE 0
Action Count Time Pct
commit 10000 1.1150 38.55
select 101 0.6442 22.27
print_file 9999 0.1509 5.22
other 0 0.9827 33.97
With --limit 100, the output above shows that 100 rows are fetched from the source database at a time.
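The export-without-delete mode above (--file with --no-delete) can be sketched as a chunked copy-to-file loop. This is an illustrative approximation using sqlite3 and a tab-separated output file, not pt-archiver's actual file format or code.

```python
import os
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sbtest1 (id INTEGER PRIMARY KEY, pad TEXT)")
conn.executemany("INSERT INTO sbtest1 (pad) VALUES (?)",
                 [("x",) for _ in range(300)])
conn.commit()

CHUNK = 100  # like --limit
out_path = os.path.join(tempfile.mkdtemp(), "sbtest1.archive")
written = 0
last_id = 0
with open(out_path, "w") as out:
    while True:
        # Resume each chunk after the last archived id (keyset pagination),
        # applying the filter condition (here: id < 250, like --where).
        rows = conn.execute(
            "SELECT id, pad FROM sbtest1 WHERE id > ? AND id < 250 "
            "ORDER BY id LIMIT ?", (last_id, CHUNK)).fetchall()
        if not rows:
            break
        for row in rows:
            out.write("\t".join(map(str, row)) + "\n")
            written += 1
        last_id = rows[-1][0]

print(written)  # 249 rows match id < 250; the source table is untouched
```

Because nothing is deleted, the job is repeatable and safe to test before committing to a purge.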
Delete specified condition records
Delete the expired historical data according to the filter condition; here --bulk-delete enables batched deletion.
[mysql@localhost backup]$ pt-archiver --source A=utf8mb4,u=tony,p=tony,h=192.168.17.128,P=3308,D=sbtest,t=sbtest1 --purge --where="id<=10000" --progress=500 --limit 1000 --txn-size 500 --bulk-delete --statistics
TIME ELAPSED COUNT
2020-09-01T05:33:24 0 0
2020-09-01T05:33:24 0 500
2020-09-01T05:33:24 0 1000
2020-09-01T05:33:24 0 1500
2020-09-01T05:33:24 0 2000
2020-09-01T05:33:24 0 2500
2020-09-01T05:33:24 0 3000
2020-09-01T05:33:24 0 3500
2020-09-01T05:33:24 0 4000
2020-09-01T05:33:24 0 4500
2020-09-01T05:33:24 0 5000
2020-09-01T05:33:24 0 5500
2020-09-01T05:33:24 0 6000
2020-09-01T05:33:24 0 6500
2020-09-01T05:33:25 0 7000
2020-09-01T05:33:25 0 7500
2020-09-01T05:33:25 1 8000
2020-09-01T05:33:25 1 8500
2020-09-01T05:33:25 1 9000
2020-09-01T05:33:25 1 9500
2020-09-01T05:33:25 1 10000
2020-09-01T05:33:25 1 10000
Started at 2020-09-01T05:33:24, ended at 2020-09-01T05:33:25
Source: A=utf8mb4,D=sbtest,P=3308,h=192.168.17.128,p=...,t=sbtest1,u=tony
SELECT 10000
INSERT 0
DELETE 10000
Action Count Time Pct
bulk_deleting 10 0.8056 60.49
commit 21 0.1089 8.18
select 11 0.0567 4.26
other 0 0.3606 27.08
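The statistics above show 10 bulk_deleting actions for 10000 rows, i.e. one DELETE per 1000-row chunk rather than one per row. A minimal sketch of that chunked range-delete, again with sqlite3 standing in for MySQL (the range-based DELETE is illustrative, not pt-archiver's exact SQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sbtest1 (id INTEGER PRIMARY KEY, pad TEXT)")
conn.executemany("INSERT INTO sbtest1 (pad) VALUES (?)",
                 [("x",) for _ in range(10000)])
conn.commit()

LIMIT_ROWS = 1000  # like --limit
statements = 0
while True:
    rows = conn.execute(
        "SELECT id FROM sbtest1 WHERE id <= 10000 ORDER BY id LIMIT ?",
        (LIMIT_ROWS,)).fetchall()
    if not rows:
        break
    # One DELETE covering the whole chunk's primary-key range,
    # instead of one DELETE per row.
    conn.execute(
        "DELETE FROM sbtest1 WHERE id BETWEEN ? AND ? AND id <= 10000",
        (rows[0][0], rows[-1][0]))
    conn.commit()
    statements += 1

print(statements)  # 10 bulk DELETEs for 10000 rows
```

Repeating the --where condition inside the range DELETE keeps the bulk statement from removing rows the chunk SELECT never matched.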
Migrate historical data to a remote database
To migrate historical data to a remote database, the archive table must first be created in the target database.
[mysql@localhost backup]$ pt-archiver --source A=utf8,u=tony,p=tony,h=192.168.17.128,P=3308,D=sbtest,t=sbtest1 --dest A=utf8,u=root,p=root,h=172.17.0.3,P=3306,D=testdb,t=arch_sbtest1 --where="id<20000" --progress=500 --limit 1000 --txn-size 500 --bulk-delete --bulk-insert --statistics
TIME ELAPSED COUNT
2020-09-01T05:37:51 0 0
2020-09-01T05:37:51 0 500
2020-09-01T05:37:51 0 1000
2020-09-01T05:37:51 0 1500
2020-09-01T05:37:51 0 2000
2020-09-01T05:37:52 0 2500
2020-09-01T05:37:52 0 3000
2020-09-01T05:37:52 0 3500
2020-09-01T05:37:52 0 4000
2020-09-01T05:37:52 0 4500
2020-09-01T05:37:52 0 5000
2020-09-01T05:37:52 1 5500
2020-09-01T05:37:52 1 6000
2020-09-01T05:37:52 1 6500
2020-09-01T05:37:52 1 7000
2020-09-01T05:37:53 1 7500
2020-09-01T05:37:53 1 8000
2020-09-01T05:37:53 1 8500
2020-09-01T05:37:53 1 9000
2020-09-01T05:37:53 1 9500
2020-09-01T05:37:53 1 9999
Started at 2020-09-01T05:37:51, ended at 2020-09-01T05:37:53
Source: A=utf8,D=sbtest,P=3308,h=192.168.17.128,p=...,t=sbtest1,u=tony
Dest: A=utf8,D=testdb,P=3306,h=172.17.0.3,p=...,t=arch_sbtest1,u=root
SELECT 9999
INSERT 9999
DELETE 9999
Action Count Time Pct
bulk_inserting 10 0.5509 28.50
bulk_deleting 10 0.2252 11.65
commit 40 0.1490 7.71
select 11 0.0957 4.95
print_bulkfile 9999 -0.0099 -0.51
other 0 0.9222 47.70
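The migrate run above combines both bulk modes: each chunk is inserted into the destination and then deleted from the source. A minimal sketch of that copy-then-delete loop, with two sqlite3 connections standing in for the source and dest MySQL servers (the ordering and range DELETE are illustrative assumptions, not pt-archiver internals):

```python
import sqlite3

src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for c in (src, dst):
    c.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, pad TEXT)")
src.executemany("INSERT INTO t (pad) VALUES (?)",
                [("x",) for _ in range(500)])
src.commit()

CHUNK = 100  # like --limit
moved = 0
while True:
    rows = src.execute(
        "SELECT id, pad FROM t WHERE id < 400 ORDER BY id LIMIT ?",
        (CHUNK,)).fetchall()
    if not rows:
        break
    # Insert the chunk into the destination first, then remove it from
    # the source, so a mid-run failure can duplicate rows but never lose them.
    dst.executemany("INSERT INTO t (id, pad) VALUES (?, ?)", rows)
    dst.commit()
    src.execute("DELETE FROM t WHERE id BETWEEN ? AND ? AND id < 400",
                (rows[0][0], rows[-1][0]))
    src.commit()
    moved += len(rows)

print(moved)  # 399 rows match id < 400 and end up in the destination
```

Inserting before deleting is the conservative ordering for an archiver: a crash between the two steps leaves the row in both places, which is recoverable, whereas the reverse ordering could lose data.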
Query the archive table in the history database
mysql> select count(*) from arch_sbtest1;
+----------+
| count(*) |
+----------+
| 9999 |
+----------+
1 row in set (0.01 sec)
As the count shows, all 9999 matching rows have been archived to the history table.