Can the size of SQL transactions exported by mysqldump be controlled?

MySQL's mysqldump is a commonly used data export tool that accepts many parameters to meet different needs. This article, from the technical community post "Technology Sharing | Controlling the Transaction Size of SQL Files Exported by mysqldump", can help you understand the transaction-related configuration.

Background

Someone asked whether the INSERT statements produced by mysqldump can be organized so that every 10 rows are grouped into one INSERT statement.

Thought 1: Parameter --extended-insert

Recalling what I have learned before, I only know of a pair of related parameters:

--extended-insert (default)

Uses long INSERT statements, merging multiple rows into one batch INSERT to improve import efficiency.

--skip-extended-insert

Uses short INSERT statements, one row per INSERT.

Neither of them meets the questioner's need: there is no way to organize the output as one INSERT statement for every 10 rows.
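
For reference, a minimal sketch comparing the two modes (the credentials, database test, and table t mirror the test setup used later and are only examples):

mysqldump -uroot -proot test t > extended.sql               # --extended-insert is the default
mysqldump -uroot -proot --skip-extended-insert test t > short.sql
grep -ci '^insert' extended.sql    # a few long, multi-row INSERT statements
grep -ci '^insert' short.sql       # one INSERT statement per row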

Thought 2: "Avoid large transactions"

I had never considered this issue before. I believe it was raised mainly to "avoid large transactions", i.e., to make sure every INSERT is a small transaction.

Next, let's discuss the following questions:

1. What is a large transaction?

2. Could the INSERT statements produced by mysqldump be large transactions?

What is a large transaction?

  • Definition: Transactions that run for a long time and operate on a large amount of data are called large transactions.

  • Risks of large transactions:

    ∘ Locking a lot of data causes heavy blocking and lock timeouts, and rollbacks take a long time.

    ∘ Long execution times easily cause master-slave replication delay.

    ∘ The undo log bloats.

  • Avoiding large transactions: based on the company's actual scenarios, the rule is that each operation/query should touch fewer than 5,000 rows and that the result set should be smaller than 2M (see the sketch below).
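
As an illustration of that rule, here is a rough sketch of splitting a big delete into batches of fewer than 5,000 rows; the table test.t_log, the date predicate, and the credentials are hypothetical:

# With autocommit on, each batch below is committed as its own small transaction.
while true; do
  rows=$(mysql -uroot -proot -N -e \
    "DELETE FROM test.t_log WHERE created_at < '2023-01-01' LIMIT 5000; SELECT ROW_COUNT();")
  [ "$rows" -eq 0 ] && break    # stop once no matching rows remain
done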

Are there any large transactions in the SQL files produced by mysqldump?

Premise: MySQL runs with autocommit enabled by default, so if a transaction is not explicitly opened, each SQL statement is its own transaction. In a mysqldump file, therefore, each SQL statement is one transaction.
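
This premise is easy to verify (a sketch, reusing the example credentials):

mysql -uroot -proot -e "SELECT @@autocommit;"    # 1 by default, so each statement commits on its own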

According to the custom "avoid large transactions" rule above, the answer is no.

It turns out that mysqldump automatically splits the INSERT statements according to the parameter --net-buffer-length, whose default value is 1M. By the standard defined earlier, the 2M large-transaction threshold is not reached.

--net-buffer-length can be set to at most 16777216 (16M); if a larger value is given, it is automatically adjusted down to 16777216. Setting it to 16M can improve export and import performance, but if you want to avoid large transactions, it is not recommended to adjust this parameter; just use the default value.

[root@192-168-199-198 ~]# mysqldump --net-buffer-length=104652800 -uroot -proot -P3306 -h192.168.199.198 test t >16M.sql
mysqldump: [Warning] option 'net_buffer_length': unsigned value 104652800 adjusted to 16777216
# When set above 16M, the parameter is automatically adjusted down to 16M

Note that this is a mysqldump parameter, not a mysqld parameter. The official documentation mentions: "If you increase this variable, ensure that the MySQL server net_buffer_length system variable has a value at least this large."

This means that if you increase the value for mysqldump, the mysqld value should also be increased accordingly. My tests, however, show this is not actually necessary, so I suspect the official documentation is wrong.
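
For completeness, this is how you would check the server-side value the documentation refers to (a sketch, reusing the example credentials):

mysql -uroot -proot -e "SHOW VARIABLES LIKE 'net_buffer_length';"    # the server variable, separate from the mysqldump option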

However, the import is affected by the server parameter max_allowed_packet, which controls the maximum packet size the server will accept. Its default value is 4194304 (4M), so you need to raise max_allowed_packet before importing the dump.

set global max_allowed_packet=16*1024*1024;
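
SET GLOBAL only affects connections opened after the change, so a fresh session can be used to confirm it (a sketch, reusing the example credentials):

mysql -uroot -proot -e "SHOW VARIABLES LIKE 'max_allowed_packet';"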

If you do not adjust it, the following error will appear:

[root@192-168-199-198 ~]# mysql -uroot -proot -P3306 -h192.168.199.198 test <16M.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2006 (HY000) at line 46: MySQL server has gone away

Related tests

Relevant test steps:

mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.00 sec)

Create about 1 million rows of data:

create database test;
use test;
CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `c` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into t values (1,1,'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyztuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz');
insert into t select * from t; # repeat this statement 20 times
# until the last execution reports Records: 524288  Duplicates: 0  Warnings: 0,
# which means the table now holds over 1 million rows.
mysql> select count(*) from t;
+----------+
| count(*) |
+----------+
|  1048576 |
+----------+
1 row in set (1.04 sec)

The data size is 287MB:

[root@192-168-199-198 test]# pwd
/data/mysql/mysql3306/data/test
[root@192-168-199-198 test]# du -sh t.ibd
287M    t.ibd

--net-buffer-length=1M

[root@192-168-199-198 ~]# mysqldump -uroot -proot -S /tmp/mysql3306.sock test t >1M.sql
[root@192-168-199-198 ~]# du -sh 1M.sql
225M    1M.sql
[root@192-168-199-198 ~]# cat 1M.sql |grep -i insert |wc -l
226

With the default --net-buffer-length=1M, there are 226 INSERT statements in the 225M SQL file, so each INSERT is about 1M on average.

--net-buffer-length=16M

[root@192-168-199-198 ~]# mysqldump --net-buffer-length=16M -uroot -proot -S /tmp/mysql3306.sock test t >16M.sql
[root@192-168-199-198 ~]# du -sh 16M.sql
225M    16M.sql
[root@192-168-199-198 ~]# cat 16M.sql |grep -i insert |wc -l
15

With --net-buffer-length=16M, there are 15 INSERT statements in the 225M SQL file, so each INSERT is about 16M on average.

So this confirms that --net-buffer-length can indeed be used to control the size into which mysqldump splits each INSERT statement in the backup file.
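
A rough cross-check (a sketch): since mysqldump writes each INSERT statement on a single line, the longest line in each dump file should be close to the --net-buffer-length value used to produce it.

awk '{ if (length($0) > max) max = length($0) } END { print max }' 1M.sql     # roughly 1M characters
awk '{ if (length($0) > max) max = length($0) } END { print max }' 16M.sql    # roughly 16M characters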

Performance Testing

The more INSERT statements, the more client-server round trips and the lower the import performance. However, since the number of INSERTs in the examples above differs by only a factor of 16, the performance difference is not huge (actual tests confirm this). So we instead compare --net-buffer-length=16K against --net-buffer-length=16M, where the number of INSERTs differs by a factor of 1024.
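
The 16K.sql file referenced here can presumably be produced the same way as the earlier dumps, just with a smaller buffer; a sketch assuming the same socket, credentials, and table:

mysqldump --net-buffer-length=16K -uroot -proot -S /tmp/mysql3306.sock test t > 16K.sql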

[root@192-168-199-198 ~]# time mysql -uroot -proot -S /tmp/mysql3306.sock test <16K.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
real    0m10.911s  # 11 seconds
user    0m1.273s
sys    0m0.677s
[root@192-168-199-198 ~]# mysql -uroot -proot -S /tmp/mysql3306.sock -e "reset master";
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@192-168-199-198 ~]# time mysql -uroot -proot -S /tmp/mysql3306.sock test <16M.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
real    0m8.083s  # 8 seconds
user    0m1.669s
sys    0m0.066s

The results are obvious. The larger the --net-buffer-length setting, the fewer interactions the client has with the database and the faster the import.

Conclusion

The backup files exported with mysqldump's default settings do not cause large transactions on import, and the import performance is also acceptable, so no parameters need to be adjusted.

Reference link:

https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html#option_mysqldump_net-buffer-length
