Efficiently inserting large amounts of data in Oracle

Requirement: a data migration that queries data from a remote database and writes it into the production database.

 

The problem: the data volume is in the tens of millions (over 60 million records, to be exact), and the first attempt was a direct insert into target_table select * from source_table.

Because the target table itself already holds tens of millions of rows and has a primary key and indexes, writing into it is very slow.

After 5 or 6 hours the statement still had not finished, and by the next day PL/SQL Developer had frozen.

The only option was to forcibly kill the PL/SQL Developer process.

Checking the data afterwards, nothing had been written at all. Rather disheartening...

 

I found a method online for efficiently inserting large amounts of data; the original article is at http://www.cnblogs.com/quanweiru/p/5325635.html

It is well written and practical.

 

First, test in the test database how much faster inserts become when using a partitioned table.

Premise: in the test environment, the tables have no primary keys or indexes.

step1, create a new partitioned table in the test database

 

create table AC83_P
(
  aaz219 NUMBER(16) not null,
  aaz220 NUMBER(16),
  aaa027 VARCHAR2(12),
  aac001 NUMBER(20),
  aaa036 VARCHAR2(6),
  aaa038 VARCHAR2(3),
  aaa085 VARCHAR2(1),
  aaa088 VARCHAR2(1),
  aae140 VARCHAR2(6),
  aae002 NUMBER(6),
  aae003 NUMBER(6),
  aae019 NUMBER(16,2),
  bae134 NUMBER(16,2),
  aae013 VARCHAR2(150),
  baz057 NUMBER(16),
  baa018 NUMBER(20) not null,
  bad709 VARCHAR2(20),
  bae023 VARCHAR2(9),
  bad305 VARCHAR2(20)
) partition by hash(aaz219)(
  partition part01 tablespace data01,
  partition part02 tablespace data01,
  partition part03 tablespace data01,
  partition part04 tablespace data01,
  partition part05 tablespace data01,
  partition part06 tablespace data01,
  partition part07 tablespace data01,
  partition part08 tablespace data01
);

step2, write table ac83 (66,325,831 records) into the partitioned table.

 

 

alter table ac83_p nologging; -- nologging belongs on the target table (the original had ac83 here, but redo logging for a direct-path insert is governed by the table being written)
insert /*+ append */ into ac83_p select * from ac83; -- 2 minutes
commit; -- commit before the DDL below; ALTER TABLE would commit implicitly anyway
alter table ac83_p logging;

The result startled me. This is fast! The entire write completed in 2 minutes, 116 seconds to be precise.

 

Suddenly I felt like crying with joy. It seemed like the dawn of victory.

ps: the append hint only takes effect for serial inserts; append inserts into the same table from multiple sessions will serialize on an enqueue, because each direct-path insert takes an exclusive table lock.
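If true parallelism is wanted for a single statement, Oracle's parallel DML can be enabled instead of running several sessions; a minimal sketch (the degree of 4 is an arbitrary choice, and parallel DML requires an edition that supports parallel execution):

```sql
-- Enable parallel DML for this session; without this, the PARALLEL hint
-- on the insert applies only to the query part, not the DML.
alter session enable parallel dml;

-- Direct-path, parallel insert; Oracle coordinates its own slave
-- processes, so this avoids the cross-session append enqueue problem.
insert /*+ append parallel(t, 4) */ into ac83_p t
select /*+ parallel(s, 4) */ * from ac83 s;

commit;
```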

--======================================================================

==>> But here comes the problem:

1) Migrating to the target database: the target table already contains records and has a primary key and indexes, which cannot be dropped.

2) The target table is not a partitioned table.

 

==>> New idea:

Following the blog post's tips, the approach can be adapted: make the data source table a partitioned table, then insert each partition into the target database separately.

However, executing the per-partition write raises an error:

 

SELECT * FROM AC83_p@dblink_zhdata partition(part01);

 

 

ORA-14100: partition extended table name cannot refer to a remote object

 

 

Analysis of the cause (original article: http://blog.csdn.net/annicybc/article/details/852561):

Testing shows that although a PARTITION clause can be used by creating a synonym for the remote object, the clause has no effect. Moreover, when a non-existent partition was specified in the final query, no error was raised, which indicates that Oracle simply ignores the PARTITION clause.

 

This confirms that Oracle still cannot perform PARTITION-related operations over a database link; when a synonym is queried this way, Oracle does no validation and merely ignores the partition clause.

 

Unfortunately, converting the data source table into a partitioned table and inserting each partition in parallel into the target database does not work.

--============ Method 2: split the partitioned table into several ordinary tables by partition, and write them to the target database in parallel ============

step1, split the table by partition, creating one new table per partition

create table ac83_p1 as SELECT * FROM ac83_p partition(part01);
create table ac83_p2 as SELECT * FROM ac83_p partition(part02);
create table ac83_p3 as SELECT * FROM ac83_p partition(part03);
create table ac83_p4 as SELECT * FROM ac83_p partition(part04);
create table ac83_p5 as SELECT * FROM ac83_p partition(part05);
create table ac83_p6 as SELECT * FROM ac83_p partition(part06);
create table ac83_p7 as SELECT * FROM ac83_p partition(part07);
create table ac83_p8 as SELECT * FROM ac83_p partition(part08);

The 60-million-plus rows are distributed across 8 tables, each holding a bit over 8 million records.

 

step2, try writing a single split table on its own.

 

alter table ac83 nologging;
INSERT /*+ append */ INTO AC83 SELECT * FROM AC83_p1@dblink_zhdata;
commit;
alter table ac83 logging;

With over 8 million rows, let's see how long writing from the remote database takes when the target table has a primary key and indexes.

13 minutes in, the insert still had not finished. Estimated about 1 hour; waiting for the result.

It finished: 1488 s, about 24 minutes. Better than estimated.

 

But when inserting into the same table from parallel sessions, /*+ append */ cannot be used, as it causes enqueue contention.

Execute the remaining 7 split tables.
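The remaining writes follow the same pattern, each run from its own session; a sketch (table names continue the ac83_p2..ac83_p8 series created in step 1):

```sql
-- One of 7 similar statements, each executed in a separate session.
-- No append hint here: conventional-path inserts from different
-- sessions do not block each other on the table-level enqueue.
INSERT INTO AC83 SELECT * FROM AC83_p2@dblink_zhdata;
COMMIT;
-- ...repeat for AC83_p3 through AC83_p8...
```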

 

During the write, an out-of-space error appeared.

After extending the tablespace, free space before the write was 31 GB.

-- before the insert
SELECT a.tablespace_name, sum(bytes)/1024/1024
  FROM dba_free_space a
 WHERE a.tablespace_name in ('GDYLSY_INDEX','GDYLSY_DATA')
 GROUP BY a.tablespace_name;
SELECT 34352398336/1024/1024/1024 FROM dual; -- 31 GB
-- GDYLSY_INDEX: 2554.56 MB free
-- GDYLSY_DATA: 32069.56 MB free (31.31 GB)

Test: roughly how much tablespace do 60 million records consume?

Check the remaining free space after the insert completes:

 

-- GDYLSY_DATA: 26949.56 MB free (26.31 GB)
-- GDYLSY_INDEX: 604.125 MB free

 

-- indexes consumed 1950 MB (about 1.9 GB)

-- data consumed 5120 MB (about 5 GB)

So 60 million rows occupy roughly 6-7 GB of space in total.
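The same figures can be read directly from segment sizes instead of diffing free space; a sketch (the segment-name pattern is an assumption based on the table names above):

```sql
-- Allocated size of the AC83 table and its indexes, in MB.
SELECT segment_name, segment_type, sum(bytes)/1024/1024 AS mb
  FROM dba_segments
 WHERE tablespace_name IN ('GDYLSY_DATA', 'GDYLSY_INDEX')
   AND segment_name LIKE 'AC83%'
 GROUP BY segment_name, segment_type
 ORDER BY mb DESC;
```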

 

 
