sqlldr batch data import and export test

sqlldr is a recommended method for processing large data volumes. It has many performance switches that can minimize the generation of redo and undo, and it controls how the data is processed (insert, append, replace, truncate).
Because of project needs, and since the performance of Data Pump was still not ideal, I wanted to use sqlldr instead, so I did a simple test myself.
According to Thomas Kyte, the fastest way to load data is a parallel direct path load, which writes formatted data blocks directly and minimizes the generation of redo and undo.
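A hedged sketch of what such a parallel direct path load looks like on the command line (the control files and data slices here are placeholders; parallel direct loads require the append method in the control file, with each session loading its own slice of the data):

# several direct path sessions loading the same table in parallel
sqlldr n1/n1 control=slice1.ctl data=slice1.lst direct=true parallel=true &
sqlldr n1/n1 control=slice2.ctl data=slice2.lst direct=true parallel=true &
wait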

First, I wrote the following script; the data file can be dynamically generated from any of the user's tables.

sqlplus -s $1 <<EOF
set pages 0
col object_name format a30
set linesize 10000
set feedback off
set colsep ','
spool $2.lst
select * from $2;
spool off;
EOF

The data generated is roughly as follows.
[ora11g@rac1 sqlldr]$ ksh spooldata.sh n1/n1 t
    370753,     10205,KU$_DOMIDX_OBJNUM_VIEW        ,VIEW
    370754,     10207,KU$_OPTION_OBJNUM_T           ,TYPE
    370755,     10208,KU$_EXPREG                    ,VIEW
    370756,     10210,SYS_YOID0000010209$           ,TYPE
    370757,     10209,KU$_OPTION_OBJNUM_VIEW        ,VIEW
    370758,     10211,KU$_OPTION_VIEW_OBJNUM_VIEW   ,VIEW
    370759,     10212,KU$_MARKER_T                  ,TYPE
    370760,     10214,SYS_YOID0000010213$           ,TYPE
    370761,     10213,KU$_MARKER_VIEW               ,VIEW
    370762,     10215,KU$_TABPROP_VIEW              ,VIEW
    370763,     10216,KU$_PFHTABPROP_VIEW           ,VIEW
    370764,     10217,KU$_REFPARTTABPROP_VIEW       ,VIEW
    370765,     10218,KU$_MVPROP_VIEW               ,VIEW
    370766,     10219,KU$_MVLPROP_VIEW              ,VIEW
    370767,     10220,KU$_TTS_VIEW                  ,VIEW
    370768,     10221,KU$_TAB_TS_VIEW               ,VIEW
    370769,     10222,KU$_TTS_IND_VIEW              ,VIEW
    370770,     10223,KU$_IND_TS_VIEW               ,VIEW
    370771,     10224 VIEW

Then prepare the control file sqlldr.ctl to load the data from t into tt.
load data 
into table tt
fields terminated by ','
(id,object_id,object_name,object_type)
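
The control file uses the default insert method, which requires the target table to be empty. For repeated test runs, one of the other load methods mentioned at the start can be named explicitly; a truncate variant, for example, would be:

load data
into table tt
truncate
fields terminated by ','
(id,object_id,object_name,object_type)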

Try the import:
[ora11g@rac1 sqlldr]$ sqlldr n1/n1 control=sqlldr.ctl data=t.lst 
SQL*Loader: Release 11.2.0.3.0 - Production on Tue May 27 08:09:25 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
But there was no commit-point feedback.
Checking the automatically generated sqlldr.log, it contained the following errors.

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   ,       CHARACTER            
OBJECT_ID                            NEXT     *   ,       CHARACTER            
OBJECT_NAME                          NEXT     *   ,       CHARACTER            
OBJECT_TYPE                          NEXT     *   ,       CHARACTER            


Record 1: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
Record 2: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
Record 3: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length
Record 4: Rejected - Error on table TT, column OBJECT_TYPE.
Field in data file exceeds maximum length

After some trial and error, I finally found the cause: with linesize set so large, every spooled line is padded with trailing spaces, so when sqlldr parses the fields by the ',' delimiter, the last field picks up all of those trailing spaces. Its length then exceeds the column length defined in the table.
I didn't want to have to specify a length for every column one by one; for reference, that workaround is sketched below.
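The per-column workaround would mean declaring a generous field length and trimming it in the control file, roughly like this sketch (the char(10000) length is an arbitrary assumption matching the linesize):

load data
into table tt
fields terminated by ','
(id,
object_id,
object_name,
object_type char(10000) "rtrim(:object_type)"
)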
Instead, I thought of the trimspool option, and that attempt really worked.
The revised spooldata.sh is as follows:
sqlplus -s $1 <<EOF
set pages 0
col object_name format a30
set linesize 10000
set trimspool on
set feedback off
set colsep ','
spool $2.lst
select * from $2 where rownum < 20;
spool off;
EOF


I tried the import again, and this time there was no problem.
[ora11g@rac1 sqlldr]$ sqlldr n1/n1 control=sqlldr.ctl data=t.lst
SQL*Loader: Release 11.2.0.3.0 - Production on Tue May 27 08:14:44 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 19
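
A quick sanity check in SQL*Plus confirms the load; the count should be 19, since the spool script filtered with rownum < 20:

select count(*) from tt;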

So far so good. Now let's see how much performance improvement the direct path method brings. The test results for a volume of nearly 800,000 rows are as follows.
Without the direct path, sqlldr commits at a regular interval (every 64 rows by default, as the commit points below show). The load took 79 seconds, roughly 10,000 rows per second.
Commit point reached - logical record count 793480
Commit point reached - logical record count 793544
Commit point reached - logical record count 793608
Commit point reached - logical record count 793672
Commit point reached - logical record count 793736
Commit point reached - logical record count 793800
Commit point reached - logical record count 793864
Commit point reached - logical record count 793928
Commit point reached - logical record count 793992
Commit point reached - logical record count 794056
Commit point reached - logical record count 794120
Commit point reached - logical record count 794184
Commit point reached - logical record count 794248
Commit point reached - logical record count 794312
Commit point reached - logical record count 794369
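
As a side note, the conventional path commit interval can be raised with the rows parameter, together with bindsize so the bind array can hold that many rows; the values here are only illustrative:

sqlldr n1/n1 control=sqlldr.ctl data=t.lst rows=5000 bindsize=20971520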

But when direct=true is used, the speed improves significantly and the output is also very simple, just the lines below. The load took 8 seconds, roughly 100,000 rows per second.
[ora11g@rac1 sqlldr]$ sqlldr n1/n1 direct=true control=sqlldr.ctl data=t.lst     
SQL*Loader: Release 11.2.0.3.0 - Production on Tue May 27 07:56:31 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
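
To confirm that the direct path load really generates less redo, the instance-wide redo statistic can be sampled before and after each load on an otherwise idle test instance; a minimal sketch:

-- run in SQL*Plus before and after the load; the delta approximates the redo generated
select name, value from v$sysstat where name = 'redo size';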
