pg copy命令导数

copy
在vim里面,用%s/|//g把所有的|符号替换掉，以，隔开字段
[root@localhost tmp]# cat a.txt
SanFrancisco,41,55,0,1994-11-29
xinzhou,32,32,0.11,2014-08-26
meiguo,25,65,0.58,2015-06-09
califonia,25,65,0.58,2013-04-09
shandong,25,65,0.58,2013-04-09
qingdao,85,20,0.64,2013-04-09

导出
tina=# copy weather to '/tmp/b.sql' delimiter ',';
导出指定的列
tina=# copy weather(city,temp_lo) to '/tmp/c.sql' delimiter ','; --这里with写可不写
导出成csv格式
tina=# copy weather(city,temp_lo) to '/tmp/c.sql' delimiter ':' csv header;
[root@localhost tmp]# cat c.sql
city:temp_lo   --有列名
San Francisco:41
xinzhou:32
meiguo:25
califonia:25
shandong:25
qingdao:85
SanFrancisco:41

导入
tina=# copy weather from '/tmp/a.txt' with delimiter ',';
COPY 6
tina=# select * from weather;
     city      | temp_lo | temp_hi | prcp |    date
San Francisco |      41 |      55 |    0 | 1994-11-29
xinzhou       |      32 |      32 | 0.11 | 2014-08-26
meiguo        |      25 |      65 | 0.58 | 2015-06-09
califonia     |      25 |      65 | 0.58 | 2013-04-09
shandong      |      25 |      65 | 0.58 | 2013-04-09
qingdao       |      85 |      20 | 0.64 | 2013-04-09

postgresql copy这个命令，用起来也很灵活，是迁移数据的一个好工具
Syntax:
COPY table_name [ ( column [, ...] ) ]
    FROM { 'filename' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]
COPY { table_name [ ( column [, ...] ) ] | ( query ) }
    TO { 'filename' | STDOUT }
    [ [ WITH ] ( option [, ...] ) ]
where option can be one of:
    FORMAT format_name
    OIDS [ boolean ]
    DELIMITER 'delimiter_character'
    NULL 'null_string'
    HEADER [ boolean ]
    QUOTE 'quote_character'
    ESCAPE 'escape_character'
    FORCE_QUOTE { ( column [, ...] ) | * }
    FORCE_NOT_NULL ( column [, ...] ) |
    ENCODING 'encoding_name'
语法也是很简明的，介绍几个常用的选项
DELIMITER ：指定分隔符
HEADER：声明文件包含一个头标识，包含字段名称

===========
1、从数据库导出数据到文件标识符
命令：
\copy 表名 to 文件名|标准输出 [delimiter as ‘分隔符’] [null as ‘null表示的字符串’ ] [csv quote as ‘引号的类型’]

解释：
可以将表中的数据输出到文件或标准输出。

delimiter as ‘分隔符’ ：说明输出设备的文本中表的每个字段用什么字符分割，默认是tab；
null as ‘null表示的字符串’：说明输出设备的文本中表的NULL值的表示方法，默认为“\N”；
csv quote as ‘引号类型’ ：说明导出的csv文件中的引号类型是什么，
对于Postgres7.4以前的版本，不支持csv的导入与导出，这时不要使用这个可选项。
例子：
aa=#\copy testtable to data.csv delimiter as ‘,’ csv quote as ‘”‘

这条命令将testtable表中的内容作为SQL语句，导出data.csv文件中。
delimiter as ‘,’   ：说明data.txt文本中表的每个字段用“逗号”分割；
csv quote as ‘”‘ ：说明csv中的引号类型是“双引号”。
像前面说的：这个例子不适用在Postgres7.4以前的版本。

2、从文件标识符导入数据到数据库
命令：
\copy 表名 from 文件名|标准输入 [delimiter as ‘分隔符’] [null as ‘null表示的字符串’ ]   [csv quote as ‘引号的类型’]
例子：
aa=#\copy testtable from data.csv delimiter as ‘,’ csv quote as ‘”‘
这条命令将data.csv文件中的文本，作为sql语句导入到testtable表，
delimiter as ‘,’   ：说明data.txt文本中表的每个字段用“逗号”分割；
csv quote as ‘”‘ ：说明csv中的引号类型是“双引号”。

----插入大量数据对比-----
CREATE TABLE linshi_tina
(sample_hash character varying(41) NOT NULL)

[root@pg tina_shell]# cat a.sql
#!/bin/bash
psql -d tinadb -U postgres -c "insert into linshi_tina values('7CDB19E3DC5DF148B41CEFB4CD182F09');"
psql -d tinadb -U postgres -c "insert into linshi_tina values('65B4C5B18015A82F9EE49FFEC628664C');"
psql -d tinadb -U postgres -c "insert into linshi_tina values('E628646309E6E28CA27F8D15C5F02B6B');"
psql -d tinadb -U postgres -c "insert into linshi_tina values('D1B6994BF42E79433BE5872C7361F86B');"
psql -d tinadb -U postgres -c "insert into linshi_tina values('6DDEA8AAAD54412ACE381D49C17019CF');"

chmod +x a.sql
nohup ./a.sql &

24w行 insert耗时 27分钟 32G内存双核cpu

---想要插入的更快，应该使用copy命令
tm_samples=# copy linshi_tina from '/tmp/cml.txt';
COPY 216487

21w行，2s就操作完成了~~~~

[root@pg tmp]# head -5 cml.txt
231A840852611943F4810C69259112E1
8C24C421F8B2B1ADFCDC5AF02FB63A78
B079287F028C32D5E85B58CA8ABAA90A
88E75A42356727FF677623E8921806CB
4D5472FA110D7E4AB688C7FBD675D0B3

如果是两个字段的话，就用,隔开
[postgres@pg ~]$ psql
psql (9.1.1)
Type "help" for help.

postgres=# \c tina
You are now connected to database "tm_samples" as user "postgres".
tina=# copy linshi_tina from '/tmp/tmp.txt' with delimiter ',';
COPY 159831
tina=#

如果是只导入部分字段内容
tina=# copy t_test_wubao(sample_hash,sample_state,last_modified_time) from '/tmp/out_list2_wubao.txt' with delimiter ',';
COPY 8943
tina=#

注意，字符串不需要带上单引号，否则单引号也会导进去
15B7D437459115BF1DC2D176374F0DCC.4BA89099,0,2016-03-09 14:30:02
155052FE5F4B57620D249439B25382ED.4B8E6378,0,2016-03-09 14:30:02
0102455AC7E25EC362668E039C931F0B.9458EBDC,0,2016-03-09 14:30:02

select sample_hash from t_test_wubao where '%'''; 单引号转义-两个单引号

跨版本的导入：
9.1 导出：
[postgres@pg ~]$ psql -d tm_samples
psql (9.1.1)
Type "help" for help.

tina=# copy linshi_tina to '/tmp/linshi_tina.txt' with delimiter ',';
COPY 159831

9.4 创建表并导入：
CREATE TABLE linshi_tina
(
sample_hash character varying(41) NOT NULL,
sample_crc32 character varying
)
WITH (
OIDS=FALSE
);

[postgres@datanode1 ~]$ psql -p 1922 -d tina
psql (9.4.5)
Type "help" for help.

tina=# copy linshi_tina from '/tmp/linshi_tina.txt' with delimiter ',';
COPY 159831
tina=#

分区表的导出导入：
tina=# copy test_tina to '/tmp/test_tina20160309.txt' with delimiter ','; --总表里面无数据，需要去每一个分区导
COPY 0
tina=#

拼写sql：
select 'copy '||tablename|| ' to ''/tmp/'||tablename|| '.txt''' ||' with delimiter '','';' from pg_tables where schemaname='public' and tablename like 'test_tina20%' order by tablename;
copy test_tina20110701 to '/tmp/test_tina20110701.txt' with delimiter ',';
copy test_tina20110801 to '/tmp/test_tina20110801.txt' with delimiter ',';
copy test_tina20110901 to '/tmp/test_tina20110901.txt' with delimiter ',';

去后台执行copy命令，并将txt全部拷贝到目标主机：

清空所有的子分区表，否则数据会是追加！！
select 'truncate table '||tablename||';' from pg_tables where schemaname='public' and tablename like 'test_tina20%' order by tablename;
truncate table test_tina20110701;
truncate table test_tina20110801;
truncate table test_tina20110901;

select 'copy '||tablename|| ' from ''/tmp/'||tablename|| '.txt''' ||' with delimiter '','';' from pg_tables where schemaname='public' and tablename like 'test_tina20%' order by tablename;
"copy test_tina20110701 from '/tmp/test_tina20110701.txt' with delimiter ',';"
"copy test_tina20110801 from '/tmp/test_tina20110801.txt' with delimiter ',';"
"copy test_tina20110901 from '/tmp/test_tina20110901.txt' with delimiter ',';"

猜你喜欢