MySQL bulk data insertion optimization

1. Insert multiple rows with a single SQL statement

Merging the insert operations improves the program's insert efficiency. The merged form of the SQL executes faster mainly for two reasons: (1) merging the statements reduces the number of times SQL has to be parsed and cuts the connection I/O overhead, since many rows are inserted by one statement executed once; (2) after merging, the log volume (MySQL's binlog and InnoDB's transaction log) shrinks, which reduces the amount and frequency of log flushing and therefore improves efficiency.
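As a concrete illustration (using the same `insert_table` schema as the transaction example later in this post), the merged form looks like this:

```sql
-- Row-by-row: each statement is parsed, executed, and logged separately
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES ('1', 'userid_1', 'content_1', 1);

-- Merged: one parse, one round trip, and less log volume
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES
    ('0', 'userid_0', 'content_0', 0),
    ('1', 'userid_1', 'content_1', 1);
```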
 

If a single batch contains too much data, the insert can fail:

  MySQL error: Packets larger than max_allowed_packet are not allowed (fix by raising max_allowed_packet; check the current value with SHOW VARIABLES LIKE '%max_allowed_packet%';)
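To check and raise the limit (the SET GLOBAL form only affects new connections; put the value in my.cnf to make it permanent across restarts):

```sql
-- Inspect the current packet size limit
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Raise it to 64M for the running server (requires the SUPER or
-- SYSTEM_VARIABLES_ADMIN privilege)
SET GLOBAL max_allowed_packet = 64 * 1024 * 1024;
```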

2. Perform inserts inside a transaction

START TRANSACTION;
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)  VALUES ('0', 'userid_0', 'content_0', 0);
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`)  VALUES ('1', 'userid_1', 'content_1', 1);
...
COMMIT;

  Using a transaction improves insert efficiency because MySQL implicitly creates a transaction for each standalone INSERT and performs the actual insert inside it. Wrapping many inserts in one explicit transaction avoids the cost of creating and committing a transaction per statement: all the inserts execute first and are committed together at the end.

Precautions:

  1. SQL statements have a length limit. When merging rows into one statement, the statement must stay within this limit, which is controlled by the max_allowed_packet configuration; the default is 1M, and it was raised to 8M during testing.

  2. Transaction size needs to be controlled; a transaction that is too large can hurt execution efficiency. MySQL has the innodb_log_buffer_size configuration item: once this value is exceeded, InnoDB flushes the data to disk and efficiency drops. It is therefore better to commit the transaction before the data reaches this value.
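Combining both precautions, a common pattern is to commit in fixed-size batches so that neither the statement length nor the transaction size grows unbounded; the batch size suggested in the comments below is only an illustrative choice, to be tuned against max_allowed_packet and innodb_log_buffer_size:

```sql
START TRANSACTION;
INSERT INTO `insert_table` (`datetime`, `uid`, `content`, `type`) VALUES
    ('0', 'userid_0', 'content_0', 0),
    ('1', 'userid_1', 'content_1', 1);
    -- ... more rows, up to the chosen batch size ...
COMMIT;

START TRANSACTION;
-- ... next batch of rows ...
COMMIT;
```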

4. Disable the binlog and the general log
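For a one-off bulk load, the logging can be turned off like this (sql_log_bin requires the SUPER privilege and only affects the current session; remember to re-enable both afterwards):

```sql
-- Skip binary logging for this session's statements
SET sql_log_bin = 0;

-- Turn off the general query log server-wide
SET GLOBAL general_log = 'OFF';
```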

5. Other parameters

bulk_insert_buffer_size=100M 
innodb_flush_log_at_trx_commit=2  # write the redo log at each commit but flush it to disk only about once per second, instead of fsyncing on every commit

  - Increase innodb_autoextend_increment from the default 8M to 128M: innodb_autoextend_increment=128
         This setting controls how much space MySQL adds each time a tablespace fills up and must auto-extend; every extension puts running SQL into a waiting state, so a larger increment reduces how often the tablespace has to auto-extend.
  - Increase innodb_log_buffer_size from the default 1M to 128M: innodb_log_buffer_size=128M
         This setting sizes the InnoDB log write buffer; enlarging it reduces how often the database has to write out to its files.
  - Increase innodb_log_file_size from the default 8M to 128M: innodb_log_file_size=128M
         This setting sizes the InnoDB redo log files; larger files reduce how often the database performs checkpoint operations.

max_allowed_packet=1073741824
bulk_insert_buffer_size=100M
innodb_log_buffer_size=128M
innodb_log_file_size=128M
innodb_autoextend_increment=100
innodb_flush_method=O_DIRECT
innodb_io_capacity=2000
innodb_io_capacity_max=20000
innodb_flush_log_at_trx_commit=2

6. Tune to match the machine

With 2 CPUs and 8 GB of memory, the number of processes set to 2, pages of 100,000 rows each, and max_allowed_packet raised to 60M, inserting 5.28 million rows took 15 minutes.

With large data volumes, the main problem to solve is the time spent on pagination. When CPU is the bottleneck, opening more threads only makes things slower.


Origin blog.csdn.net/yonghutwo/article/details/123782290