The project required importing 60 million rows into an Oracle database. The source data was a plain-text (.txt) file with fields separated by commas (",").
Option 1: Convert the data to INSERT statements. Execution was far too slow, so I gave up on this.
Option 2: Use the text importer in PL/SQL Developer (Tools > Text Importer). Throughput was about 500,000 rows per hour, which was too slow given how little time we had.
Option 3: Use sqlldr. Throughput was much better, about 3.5 million rows per hour. The rest of this post covers this approach.
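To put those rates in perspective, here is a rough estimate of the total load time for 60 million rows. It is only back-of-the-envelope arithmetic: the rates are the approximate figures quoted above, a constant rate is assumed, and the function name is purely illustrative.

```python
# Rough load-time estimate using the approximate rates quoted above.
TOTAL_ROWS = 60_000_000


def hours_needed(total_rows: int, rows_per_hour: int) -> float:
    """Return the estimated load time in hours, assuming a constant rate."""
    return total_rows / rows_per_hour


rates = {
    "Text importer": 500_000,
    "sqlldr": 3_500_000,
}

for tool, rate in rates.items():
    print(f"{tool}: about {hours_needed(TOTAL_ROWS, rate):.1f} hours")
```

At these rates the text importer would need roughly 120 hours, against about 17 hours for sqlldr.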
Tool: SQL*Loader (sqlldr)
Prerequisite: Oracle server installed
Environment: 60 million rows, i5 CPU, 4 GB memory, Oracle 10g
Steps:
The data during execution is as follows:
1. Write a control (.ctl) file (risk.ctl in this example):
OPTIONS(skip_index_maintenance=TRUE,direct=true,BINDSIZE=20971520,READSIZE=20971520,ERRORS=-1,ROWS=500000)
unrecoverable
load data
CHARACTERSET AL32UTF8
infile 'C:\Users\LHB\Desktop\TaxmodelData\PureRiskPremiumItem.txt'
insert into table pureriskpremium
Fields terminated by ','
trailing nullcols
(MODELCODE,
BASICRATECODE,
AREACODE,
BASICRISKPREMIUMS,
BIZVERSION,
EFFECTIVETIME,
OBTAINCREATETIME
);
Explanation:
ROWS=500000: data is saved (committed) every 500,000 rows.
infile 'C:\Users\LHB\Desktop\TaxmodelData\PureRiskPremiumItem.txt': replace this with the actual path of your data file.
Fields terminated by ',': the field separator; in this example it is ",".
pureriskpremium: the name of the target table.
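Since only the data file path, table name, separator, and column list change from one load to the next, the control file can be generated rather than edited by hand. The following is a minimal sketch under that assumption. The function name render_control_file and its parameters are made up for illustration, and the clauses are copied from risk.ctl above, not from any official template.

```python
from typing import Sequence


def render_control_file(data_file: str, table: str, columns: Sequence[str],
                        separator: str = ",", rows: int = 500000) -> str:
    """Return control-file text with the same clauses as risk.ctl above."""
    lines = [
        # Options copied from the example; only ROWS is parameterised here.
        "OPTIONS(skip_index_maintenance=TRUE,direct=true,BINDSIZE=20971520,"
        f"READSIZE=20971520,ERRORS=-1,ROWS={rows})",
        "unrecoverable",
        "load data",
        "CHARACTERSET AL32UTF8",
        f"infile '{data_file}'",
        f"insert into table {table}",
        f"Fields terminated by '{separator}'",
        "trailing nullcols",
        "(" + ",\n".join(columns),
        ");",
    ]
    return "\n".join(lines) + "\n"


text = render_control_file(
    r"C:\Users\LHB\Desktop\TaxmodelData\PureRiskPremiumItem.txt",
    "pureriskpremium",
    ["MODELCODE", "BASICRATECODE", "AREACODE", "BASICRISKPREMIUMS",
     "BIZVERSION", "EFFECTIVETIME", "OBTAINCREATETIME"],
)
print(text)
```

Writing the returned text to risk.ctl gives a file equivalent to the one shown in this step.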
2. Open cmd
sqlldr jyrluser/[email protected]:1521/xasccxdb1 control=C:\Users\LHB\Desktop\TaxmodelData\risk.ctl log=c:\jrimp.log
Explanation: sqlldr username/password@ip:port/instance-name control=<path of the control file> log=<path of the log file>. The log file records the result of the load, including any errors.
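If the load has to be repeated for several files, the command line can be assembled by a small script. This is a sketch only: the helper names are invented for illustration, the account, host, and service values are placeholders, and it uses only the connect string, control=, and log= arguments shown in the command above.

```python
import subprocess
from typing import List


def build_sqlldr_command(user: str, password: str, host: str, port: int,
                         service: str, control: str, log: str) -> List[str]:
    """Assemble the sqlldr arguments in the same order as the command above."""
    connect = f"{user}/{password}@{host}:{port}/{service}"
    return ["sqlldr", connect, f"control={control}", f"log={log}"]


def run_load(command: List[str]) -> int:
    """Run sqlldr and return its exit status (0 means success)."""
    return subprocess.run(command).returncode


# Placeholder values only; substitute your own account, host, and paths.
command = build_sqlldr_command(
    "some_user", "some_password", "192.0.2.10", 1521, "some_service",
    r"C:\Users\LHB\Desktop\TaxmodelData\risk.ctl", r"c:\jrimp.log",
)
print(" ".join(command))
# exit_status = run_load(command)  # uncomment on a machine with sqlldr installed
```

Passing the arguments as a list avoids quoting problems with Windows paths. A non-zero exit status means the load did not finish cleanly, and the log file is the place to look next.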
3. Press Enter. If no error is reported, the load is running. A progress message is printed each time a batch of rows has been processed.
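4. Debugging error messages:
SQL*Loader-2026: the load was aborted because SQL Loader cannot continue.
SQL*Loader-925: Error while uldlfca: OCIStmtExecute (ptc_hp)
ORA-03114: not connected to ORACLE
The load failed with the errors above when it had reached around 3000. The error log showed that a column was defined too short, which had produced 30 million row errors. After I widened the column and ran the load again, it worked.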
C:\Users\Administrator>sqlldr xinjiatao/[email protected]:1521/ORCL control=F:\New_Data\A\test.ctl log=c:\xin.log data=F:\New_Data\A\score_20170715.csv
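A failure caused by a column that is too short can be caught before the load starts by scanning the data file for oversized fields. The sketch below is one way to do that, not part of SQL*Loader itself. The column widths and sample rows are invented for illustration (take the real widths from your table definition), and it assumes the widths are byte limits and the data is UTF-8.

```python
import csv
import io
from typing import Dict, Iterable, List, Sequence, Tuple

COLUMNS = ["MODELCODE", "BASICRATECODE", "AREACODE", "BASICRISKPREMIUMS",
           "BIZVERSION", "EFFECTIVETIME", "OBTAINCREATETIME"]

# Hypothetical byte limits; columns not listed here are not checked.
MAX_LENGTHS = {"MODELCODE": 20, "BASICRATECODE": 4}


def find_oversized_fields(rows: Iterable[Sequence[str]], columns: Sequence[str],
                          max_lengths: Dict[str, int],
                          limit: int = 10) -> List[Tuple[int, str, int]]:
    """Return up to `limit` (line number, column, byte length) offenders."""
    problems: List[Tuple[int, str, int]] = []
    for line_no, row in enumerate(rows, start=1):
        for column, value in zip(columns, row):
            max_len = max_lengths.get(column)
            size = len(value.encode("utf-8"))
            if max_len is not None and size > max_len:
                problems.append((line_no, column, size))
                if len(problems) >= limit:
                    return problems
    return problems


# Made-up sample rows standing in for the real data file.
sample = io.StringIO(
    "M1,B1,110000,12.5,v1,2017-01-01,2017-01-02\n"
    "M2,B2-TOO-LONG,110000,13.0,v1,2017-01-01,2017-01-02\n"
)
# QUOTE_NONE treats quote characters literally, like a plain ',' terminator.
reader = csv.reader(sample, delimiter=",", quoting=csv.QUOTE_NONE)
problems = find_oversized_fields(reader, COLUMNS, MAX_LENGTHS)
print(problems)
```

For a real file, replace the in-memory sample with open(path, encoding="utf-8", newline=""). The scan stops at the first few offenders, so it stays cheap even when millions of rows are affected.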