实验用的的表如下
show create table 20130208load; CREATE TABLE `20130208load` ( `id` int(11) NOT NULL AUTO_INCREMENT, `uid` int(11) NOT NULL, `content` varchar(50) NOT NULL, PRIMARY KEY (`id`), KEY `20130208load_idx_uid` (`uid`) ) ENGINE=InnoDB
表里有30w数据
导出为outfile的方式为
SELECT * INTO OUTFILE 'F:\\temp\\20130122handler.txt' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' FROM 20130122handler;
导出的文件7M,里面就是逗号分割的数据
导出为insert文件的方式为
.\mysqldump.exe -u root -proot crap 20130208load > F:\temp\20130208load.sql
导出的文件8M,可以看到它导出的insert语句是insert into values()()()......的形式
下面准备导入到另一个数据allcrap_test中
1 insert 方法
mysql> use allcrap_test;
Database changed
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> source F:\\temp\\20130208load.sql
Query OK, 36537 rows affected (1.34 sec)
Records: 36537 Duplicates: 0 Warnings: 0
Query OK, 35535 rows affected (1.17 sec)
Records: 35535 Duplicates: 0 Warnings: 0
Query OK, 35537 rows affected (1.19 sec)
Records: 35537 Duplicates: 0 Warnings: 0
Query OK, 35536 rows affected (1.20 sec)
Records: 35536 Duplicates: 0 Warnings: 0
Query OK, 35544 rows affected (1.20 sec)
Records: 35544 Duplicates: 0 Warnings: 0
Query OK, 35540 rows affected (1.08 sec)
Records: 35540 Duplicates: 0 Warnings: 0
Query OK, 35542 rows affected (1.17 sec)
Records: 35542 Duplicates: 0 Warnings: 0
Query OK, 35534 rows affected (1.19 sec)
Records: 35534 Duplicates: 0 Warnings: 0
Query OK, 14695 rows affected (0.56 sec)
Records: 14695 Duplicates: 0 Warnings: 0
分为9次insert,差不多有10s
这里其实完全可以先把secondary索引去掉,然后插入,然后重建索引.
首先改下sqldump
.\mysqldump.exe --no-create-info -u root -proot crap 20130208load > F:\temp\20130208load_2.sql
加上了--no-create-info选项,没有导出DDL语句
这次先drop掉那个索引
alter table 20130208load drop index 20130208load_idx_uid
清空表(这是为了删除第一次导入的数据)
TRUNCATE table 20130208load
然后source命令
mysql> source F:\\temp\\20130208load_2.sql;
Query OK, 36537 rows affected (0.75 sec)
Records: 36537 Duplicates: 0 Warnings: 0
Query OK, 35535 rows affected (0.73 sec)
Records: 35535 Duplicates: 0 Warnings: 0
Query OK, 35537 rows affected (1.13 sec)
Records: 35537 Duplicates: 0 Warnings: 0
Query OK, 35536 rows affected (0.64 sec)
Records: 35536 Duplicates: 0 Warnings: 0
Query OK, 35544 rows affected (0.66 sec)
Records: 35544 Duplicates: 0 Warnings: 0
Query OK, 35540 rows affected (0.64 sec)
Records: 35540 Duplicates: 0 Warnings: 0
Query OK, 35542 rows affected (0.67 sec)
Records: 35542 Duplicates: 0 Warnings: 0
Query OK, 35534 rows affected (0.92 sec)
Records: 35534 Duplicates: 0 Warnings: 0
Query OK, 14695 rows affected (0.33 sec)
Records: 14695 Duplicates: 0 Warnings: 0
这次明显快多了,大概6.47s.
最后把secondary索引加上去就行了
alter table 20130208load add index 20130208load_idx_uid(uid);
2 然后试下load data infile
创建用于load data infile的表
create table 20130208load_infile like crap.20130208load
有索引的情况下load
load data INFILE 'F:\\temp\\20130208load.txt' INTO table 20130208load_infile FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
9s
试下没有索引时候,用同样的load语句,花了5.2s多.
另外看了这篇文章
http://venublog.com/2007/11/07/load-data-infile-performance/
谈到了一些优化,在我的windows机器上,innodb_flush_method是无效的,
设置了innodb_doublewrite=0以后,变成了4.6s
这里有人谈到了,如果把load data infile的大文件拆分成若干个小文件,会不会提高性能.
从他的结果来看,貌似提高不明显.
http://www.mysqlperformanceblog.com/2008/07/03/how-to-load-large-files-safely-into-innodb-with-load-data-infile/
另外吐槽下Navicat里执行source命令居然报语法错误 Err 1064,还是在mysql自带的客户端执行能好使.