SQL*Loader for Oracle data processing (2)




 

How SQL*Loader handles different files and formats

 

2.1 Excel file

A worksheet in the legacy Excel .xls format holds at most 65,536 rows, so the amount of data involved is usually small. (The newer .xlsx format raises the limit to 1,048,576 rows.) To process an Excel file, save it as a CSV file and then import it in the normal way.

 

2.2 The files to be loaded are not comma separated

There are two ways to handle this:
1) Modify the data file and replace the delimiter with a comma.
2) Modify the control file and set the value of FIELDS TERMINATED BY to the actual separator.
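
For the first option, a plain search-and-replace is risky when a field value itself contains a comma. The following is a minimal Python sketch, assuming a pipe (|) as the original separator (a hypothetical choice, as are the sample rows), that rewrites the records with the standard csv module so that such values are enclosed in double quotation marks. A file converted this way needs OPTIONALLY ENCLOSED BY '"' in the control file, as described in section 2.3.

```python
import csv
import io

def convert_delimiter(src, dst, old_delim="|"):
    """Copy delimited records from src to dst, rewriting them as comma-separated."""
    reader = csv.reader(src, delimiter=old_delim)
    # QUOTE_MINIMAL encloses only the fields that contain a comma or a quotation mark.
    writer = csv.writer(dst, delimiter=",", quoting=csv.QUOTE_MINIMAL,
                        lineterminator="\n")
    for row in reader:
        writer.writerow(row)

# Hypothetical sample data; the second record has a comma inside the JOB value.
sample = io.StringIO("SMITH|CLERK|3904\nALLEN|SALER,M|2891\n")
out = io.StringIO()
convert_delimiter(sample, out)
converted = out.getvalue()
print(converted)
# SMITH,CLERK,3904
# ALLEN,"SALER,M",2891
```

For real files, pass file objects opened with newline="" in place of the StringIO objects, so that the data is streamed record by record.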

 

2.3 Delimiters are included in the data to be loaded

For example, to insert data into the scott.tb_loader table, the provided data is in the following format:
SMITH,CLEAK,3904
ALLEN,"SALER,M",2891
WARD,"SALER,""S""",3128
KING,PRESIDENT,2523

Modify the control file as shown in the sample below. The OPTIONALLY ENCLOSED BY parameter specifies that field values may be enclosed in double quotation marks (the default enclosure character in CSV files; change the value to suit your data). Inside an enclosed field, a doubled quotation mark ("") stands for one literal quotation mark.

 

--control file  

[oracle@wjq SQL*Loader]$ vim wjq_test2.ctl 
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test2.dat'
TRUNCATE INTO TABLE tb_loader
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'  
(ENAME,JOB,SAL)

  

--data files  

[oracle@wjq SQL*Loader]$ vim wjq_test2.dat 
SMITH,CLEAK,3904
ALLEN,"SALER,M",2891
WARD,"SALER,""S""",3128
KING,PRESIDENT,2523

 

Run sqlldr with the control file above; the load output and the query results are as follows:

[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test2.ctl

SQL*Loader: Release 11.2.0.4.0 - Production on Tue Oct 31 14:56:40 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 4

  

--query result  

SCOTT@seiang11g>select * from tb_loader;

ENAME      JOB              SAL       COMM
---------- --------- ---------- ----------
SMITH      CLEAK           3904
ALLEN      SALER,M         2891
WARD       SALER,"S"       3128
KING       PRESIDENT       2523
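
The way the enclosed fields in this example are interpreted can be checked outside the database. The sketch below uses Python's standard csv module, whose quoting rules happen to match this case: a comma inside quotation marks is data, and a doubled quotation mark is one literal quotation mark. It only imitates the parsing and makes no claim about how sqlldr is implemented.

```python
import csv
import io

# The same four records as in wjq_test2.dat.
data = '''SMITH,CLEAK,3904
ALLEN,"SALER,M",2891
WARD,"SALER,""S""",3128
KING,PRESIDENT,2523
'''

# delimiter plays the role of FIELDS TERMINATED BY,
# quotechar plays the role of OPTIONALLY ENCLOSED BY.
rows = list(csv.reader(io.StringIO(data), delimiter=",", quotechar='"'))

for ename, job, sal in rows:
    print(ename, job, sal, sep=" | ")
# SMITH | CLEAK | 3904
# ALLEN | SALER,M | 2891
# WARD | SALER,"S" | 3128
# KING | PRESIDENT | 2523
```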

 

2.4 The data file has no delimiter

Data files like the one below, in which each field occupies a fixed range of character positions and no delimiter is used, are called fixed-length files, and sqlldr handles them easily. For this example, the control file is as follows:

 

--control file  

[oracle@wjq SQL*Loader]$ vim wjq_test3.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test3.dat'
TRUNCATE INTO TABLE tb_loader
(
 ENAME position(1:5),
 JOB position(10:18),
 SAL position(23:26)
)

  

--data files  

[oracle@wjq SQL*Loader]$ vim wjq_test3.dat
SMITH    CLEAK        3904
ALLEN    SALESMAN     2891
WARD     SALESMAN     3128
KING     PRESIDENT    252

 

The position keyword specifies the start and end positions of a column within the record. For example, JOB position(10:18) means that the 10th through the 18th characters of each record are used as the value of the JOB column. The position syntax is flexible, and the same result can also be written in the following forms:

① position(*+2:18): Giving the numbers directly is called an absolute offset. Using the * sign is called a relative offset, meaning that the field starts where the previous field ended. A relative offset can also be adjusted; for example, position(*+2:15) starts 2 positions past the point where the previous field ended.

② position(*) char(9): The advantage of combining a relative offset with a type and length is that only the first column needs an explicit starting position; the remaining columns need only their lengths, which is easier in practice.

Run sqlldr with wjq_test3.ctl; the load output and the query results are as follows:

 

--sqlldr command  

[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test3.ctl

SQL*Loader: Release 11.2.0.4.0 - Production on Tue Oct 31 15:04:13 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 4

  

--query result  

SCOTT@seiang11g>select * from tb_loader;

ENAME      JOB              SAL       COMM
---------- --------- ---------- ----------
SMITH      CLEAK           3904
ALLEN      SALESMAN        2891
WARD       SALESMAN        3128
KING       PRESIDENT        252
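
Before running a load, it can help to confirm that the position ranges pick out the intended characters. The sketch below imitates position(start:end) with Python slicing, assuming 1-based inclusive positions and that trailing blanks are trimmed; the helper name position_slice is made up for this illustration. The records are rebuilt with explicit padding so that the column boundaries are unambiguous.

```python
def position_slice(record, start, end):
    """Imitate POSITION(start:end): 1-based, inclusive, trailing blanks removed."""
    return record[start - 1:end].rstrip()

def split_record(record):
    """Apply the three ranges used in wjq_test3.ctl to one record."""
    return {
        "ENAME": position_slice(record, 1, 5),
        "JOB": position_slice(record, 10, 18),
        "SAL": position_slice(record, 23, 26),
    }

# ENAME padded to column 9, JOB padded to column 22, SAL starting at column 23.
allen = "ALLEN".ljust(9) + "SALESMAN".ljust(13) + "2891"
king = "KING".ljust(9) + "PRESIDENT".ljust(13) + "252"

print(split_record(allen))  # {'ENAME': 'ALLEN', 'JOB': 'SALESMAN', 'SAL': '2891'}
print(split_record(king))   # {'ENAME': 'KING', 'JOB': 'PRESIDENT', 'SAL': '252'}
```

The second record shows why KING is loaded with a salary of 252: the data file holds only three digits in columns 23 through 26.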

 

 

2.5 There are fewer columns in the data file than in the table to be imported

In the previous examples, the data file has fewer columns than the table, so the COMM column was left empty. But what if the missing column must be assigned a value? Change the control file slightly: specify the COMM column directly and assign it the initial value 0 (the data in wjq_test3.dat is reused here):

--control file  

[oracle@wjq SQL*Loader]$ vim wjq_test4.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test3.dat' 
TRUNCATE INTO TABLE tb_loader
(
 ENAME position(1:5),
 JOB position(10:18),
 SAL position(23:26),
 COMM "0"
)

 

--sqlldr command  

[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test4.ctl

SQL*Loader: Release 11.2.0.4.0 - Production on Tue Oct 31 15:08:50 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 4

  

--View Results  

SCOTT@seiang11g>select * from tb_loader;

ENAME      JOB              SAL       COMM
---------- --------- ---------- ----------
SMITH      CLEAK           3904          0
ALLEN      SALESMAN        2891          0
WARD       SALESMAN        3128          0
KING       PRESIDENT        252          0

 

The value of COMM can also be derived from the values of other columns. Modify the control file as follows:

 

--control file  

[oracle@wjq SQL*Loader]$ vim wjq_test5.ctl 
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test3.dat' 
TRUNCATE INTO TABLE tb_loader
(
 ENAME position(1:5),
 JOB position(10:18),
 SAL position(23:26),
 COMM "substr(:SAL,1,1)"
)

 

Run sqlldr with this control file. The result is shown below; the value of COMM is the first digit of the SAL value.

 

--sqlldr command  

[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test5.ctl

SQL*Loader: Release 11.2.0.4.0 - Production on Tue Oct 31 15:12:00 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 4

  

--View Results  

SCOTT@seiang11g>select * from tb_loader;

ENAME      JOB              SAL       COMM
---------- --------- ---------- ----------
SMITH      CLEAK           3904          3
ALLEN      SALESMAN        2891          2
WARD       SALESMAN        3128          3
KING       PRESIDENT        252          2

 

Here the value of the COMM column is determined by the value of the SAL column: the SQL function substr takes the first character of the SAL value and assigns it to COMM. This is only an example, and the DBA can adapt it as needed. SQL functions make many useful transformations possible and can save a lot of effort. If the built-in functions are not enough, you can write custom functions in PL/SQL and call them in the sqlldr control file in exactly the same way as the built-in ones, so the columns being loaded can be processed flexibly according to the requirements.
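
The effect of substr(:SAL,1,1) on the four salaries can be reproduced with a small Python sketch. The helper oracle_substr is a made-up name that imitates Oracle's SUBSTR only for a positive, 1-based start position; it is meant to show where the COMM values 3, 2, 3, 2 come from, not to replace the SQL expression.

```python
def oracle_substr(text, start, length):
    """Imitate SUBSTR(text, start, length) for a positive 1-based start."""
    return text[start - 1:start - 1 + length]

# SAL values as they appear in columns 23 through 26 of wjq_test3.dat.
salaries = ["3904", "2891", "3128", "252"]

# COMM "substr(:SAL,1,1)" keeps the first character of each SAL value.
comm = [oracle_substr(sal, 1, 1) for sal in salaries]
print(comm)  # ['3', '2', '3', '2']
```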

 

 

2.6 There are more columns in the data file than in the table to be imported

If the data file has fewer columns than the table to be imported, processing can be a little troublesome; if it has more columns, things are simpler. Depending on the situation, there are generally two ways to handle it:
Method 1: Modify the data file to remove the excess columns. This is feasible for a small amount of data, but once the data file grows to hundreds of megabytes or even gigabytes, modifying it is time-consuming and labor-intensive.

Method 2: Use the FILLER keyword in the sqlldr control file to exclude the unneeded columns.
1) The demo data file is as follows:

--data files  

[oracle@wjq SQL*Loader]$ vim wjq_test6.dat
SMITH    7369   CLERK      1020   20  
ALLEN    7499   SALESMAN   1930   30  
WARD     7521   SALESMAN   1580   30  
JONES    7566   MANAGER    3195   20  
MARTIN   7654   SALESMAN   1580   30  
BLAKE    7698   MANAGER    3180   30  
CLARK    7782   MANAGER    2172   10  
SCOTT    7788   ANALYST    3220   20  
KING     7839   PRESIDENT  4722   10  
TURNER   7844   SALESMAN   1830   30  
ADAMS    7876   CLERK      1320   20  
JAMES    7900   CLERK      1280   30  
FORD     7902   ANALYST    3220   20  
MILLER   7934   CLERK      1022   10

Suppose the requirement is to import columns 1, 3, and 4 and to skip columns 2 and 5. Create the control file as follows:

 

--control file  

[oracle@wjq SQL*Loader]$ vim wjq_test6.ctl 
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test6.dat' 
TRUNCATE INTO TABLE tb_loader
(
 ENAME position(1:6),
 COL1 FILLER position(10:13),
 JOB position(17:25),
 SAL position(28:31)
)

The sqlldr control file supports the FILLER keyword in a column definition to mark a column that is read but not loaded. In the control file above, this keyword is applied to COL1, so the data in character positions 10 through 13 is not imported.
In fact, because this is a fixed-length file, the position clauses in the control file already limit what is read, so you could even delete the COL1 FILLER position(10:13) line from the control file.

Execute the sqlldr command: 

[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test6.ctl

SQL*Loader: Release 11.2.0.4.0 - Production on Tue Oct 31 15:24:36 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 14

  

--query result  

SCOTT@seiang11g>select * from tb_loader;

ENAME      JOB              SAL       COMM
---------- --------- ---------- ----------
SMITH      CLERK           1020
ALLEN      SALESMAN        1930
WARD       SALESMAN        1580
JONES      MANAGER         3195
MARTIN     SALESMAN        1580
BLAKE      MANAGER         3180
CLARK      MANAGER         2172
SCOTT      ANALYST         3220
KING       PRESIDENT       4722
TURNER     SALESMAN        1830
ADAMS      CLERK           1320
JAMES      CLERK           1280
FORD       ANALYST         3220
MILLER     CLERK           1022

 

 

 

2) If the fields in the data file are not fixed-length but separated by a delimiter, the control file needs more care. Take the following data file as an example:

--data files  

[oracle@wjq SQL*Loader]$ vim wjq_test7.dat
SMITH,7369,CLERK,1020,20  
ALLEN,7499,SALESMAN,1930,30  
WARD,7521,SALESMAN,1580,30  
JONES,7566,MANAGER,3195,20  
MARTIN,7654,SALESMAN,1580,30  
BLAKE,7698,MANAGER,3180,30  
CLARK,7782,MANAGER,2172,10  
SCOTT,7788,ANALYST,3220,20  
KING,7839,PRESIDENT,4722,10  
TURNER,7844,SALESMAN,1830,30  
ADAMS,7876,CLERK,1320,20  
JAMES,7900,CLERK,1280,30  
FORD,7902,ANALYST,3220,20  
MILLER,7934,CLERK,1022,10

In this case FILLER must be specified in the control file; otherwise the values will be assigned to the wrong columns. Create the control file as follows:

 

--control file  

[oracle@wjq SQL*Loader]$ vim wjq_test7.ctl 
LOAD DATA  
INFILE '/u01/app/oracle/SQL*Loader/wjq_test7.dat' 
TRUNCATE INTO TABLE tb_loader 
FIELDS TERMINATED BY "," 
(  
 ENAME,COL1 FILLER,JOB,SAL
)

 

Execute the sqlldr command and view the results

 

--sqlldr command  
[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test7.ctl

SQL*Loader: Release 11.2.0.4.0 - Production on Tue Oct 31 15:32:48 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Commit point reached - logical record count 14

  

--View Results  

SCOTT@seiang11g>select * from tb_loader;

ENAME      JOB              SAL       COMM
---------- --------- ---------- ----------
SMITH      CLERK           1020
ALLEN      SALESMAN        1930
WARD       SALESMAN        1580
JONES      MANAGER         3195
MARTIN     SALESMAN        1580
BLAKE      MANAGER         3180
CLARK      MANAGER         2172
SCOTT      ANALYST         3220
KING       PRESIDENT       4722
TURNER     SALESMAN        1830
ADAMS      CLERK           1320
JAMES      CLERK           1280
FORD       ANALYST         3220
MILLER     CLERK           1022
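
Why FILLER matters for delimited data can be seen by assigning the values of one record to the field list in order. The sketch below is a simplified Python model, assuming that fields are matched strictly by position and that trailing values with no matching field are ignored; the function map_fields and its (name, is_filler) tuples are made up for this illustration.

```python
def map_fields(record, field_list, delimiter=","):
    """Assign delimited values to field names in order; FILLER fields are dropped."""
    values = record.split(delimiter)
    row = {}
    # zip stops at the shorter sequence, so values beyond the field list are ignored.
    for (name, is_filler), value in zip(field_list, values):
        if not is_filler:
            row[name] = value
    return row

record = "SMITH,7369,CLERK,1020,20"

with_filler = [("ENAME", False), ("COL1", True), ("JOB", False), ("SAL", False)]
without_filler = [("ENAME", False), ("JOB", False), ("SAL", False)]

print(map_fields(record, with_filler))
# {'ENAME': 'SMITH', 'JOB': 'CLERK', 'SAL': '1020'}
print(map_fields(record, without_filler))
# {'ENAME': 'SMITH', 'JOB': '7369', 'SAL': 'CLERK'}
```

Without the COL1 FILLER entry, the employee number lands in JOB and the job title lands in SAL, which is the mismatch the text warns about.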
