Java: reading a large amount of data from Oracle and writing it out to a file


Background: the project needs to read table fields from Oracle, join the fields with a specified separator, and write the result to a txt file. Each table holds about 20 million rows (2000W). Since the job only reads all values of the specified fields in a table, query optimization is not a concern; the only thing to optimize is reading a table of this size.

Version: Oracle 11g

Idea 1:

Use an Oracle statement to page through the table. The points to watch here are the efficiency of ROWID versus ROWNUM, and whether ORDER BY is used.


SELECT t.*
  FROM a t, (SELECT ROWNUM rn, c.*
                  FROM (SELECT   ROWID k
                            FROM a
                        ORDER BY ID) c) b
WHERE t.ROWID = b.k AND b.rn BETWEEN 10001 AND 20000;

There is an article comparing the approaches: http://www.itpub.net/thread-1603830-1-1.html

This idea does not suit my needs either. Paged reads do shorten each individual query, but on a table with tens of millions of rows the later pages take longer and longer, so the overall improvement is small.
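To put a rough number on why late pages hurt, here is a small back-of-the-envelope model. It assumes that fetching a page costs work proportional to the page's upper row bound, because the rows before the page are produced and then thrown away. That is a simplification, not a measurement of Oracle, and the class and method names are made up for the illustration.

```java
// Rough cost model for ROWNUM-style paging; an illustration, not a benchmark.
class PagingCost {

    // Assumption: page k touches every row up to its upper bound,
    // so the cost of a page grows with its position in the table.
    static long rowsTouchedByPaging(long totalRows, long pageSize) {
        long touched = 0;
        for (long upper = pageSize; upper < totalRows + pageSize; upper += pageSize) {
            // The last page may be shorter than pageSize
            touched += Math.min(upper, totalRows);
        }
        return touched;
    }

    public static void main(String[] args) {
        long total = 20_000_000L; // roughly the table size mentioned in the background
        long page = 10_000L;      // page size used in the query above
        System.out.println("rows touched with paging:  " + rowsTouchedByPaging(total, page));
        System.out.println("rows touched in one pass:  " + total);
    }
}
```

Under this model, 2000 pages of 10,000 rows touch about 20 billion rows in total, roughly a thousand times more than a single pass over the table.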

Idea 2:

Use multiple threads when writing the data to improve write efficiency. Tests showed, however, that my bottleneck is reading the data, not writing it.
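For reference, one common shape for this idea is a single background thread that drains a bounded queue while the reading thread keeps producing lines. The sketch below is not the code from my project; BackgroundWriter, writeAll and the queue size of 10,000 are assumptions chosen for the illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

class BackgroundWriter {
    // End-of-data marker, compared by identity so a data line with the same text is safe
    private static final String END = new String("END");

    // Writes every line to 'target' on a separate thread while the caller keeps producing.
    static void writeAll(Iterable<String> lines, Path target) throws IOException, InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);
        AtomicReference<IOException> failure = new AtomicReference<>();

        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            Thread writer = new Thread(() -> {
                try {
                    String s;
                    while ((s = queue.take()) != END) {
                        if (failure.get() != null) {
                            continue; // after an error, keep draining so the producer never blocks
                        }
                        try {
                            out.write(s);
                            out.write('\n');
                        } catch (IOException e) {
                            failure.set(e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            writer.start();
            try {
                for (String line : lines) {
                    queue.put(line); // blocks when the writer falls behind
                }
            } finally {
                queue.put(END);      // always stop the writer, even if producing failed
                writer.join();
            }
        }
        IOException e = failure.get();
        if (e != null) {
            throw e;
        }
    }
}
```

A bounded queue keeps memory flat when the writer is slower than the reader. When the reader is the slow side, as in my tests, the writer thread simply sits idle most of the time.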

Idea 3:

Use ResultSet to read the result set in batches. At first I did not realize that ResultSet supports batch reading directly, so I spent a lot of time on the first two ideas.

Before pasting the code, a word on the common usage of ResultSet.

Most articles explain the common methods and parameter settings, but setFetchSize() and setMaxRows() are rarely mentioned.

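Simply put:
setFetchSize(): sets the number of rows the ResultSet fetches from the database per round trip.

For example, with rs.setFetchSize(100) the ResultSet reads 100 rows from the database at a time.
Fetching the next 100 rows is handled inside the ResultSet, so there is no need to call anything manually or keep track of which row to start from.

setMaxRows(): sets the maximum number of rows the ResultSet will return. Choose it when you only need a fixed number of rows rather than all the data. Note that it is called on the Statement, not on the ResultSet.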
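A minimal sketch of the two settings on a forward-only statement follows. The names FetchSizeSketch, dump and roundTrips are made up for the illustration, and the round-trip figure is simple arithmetic rather than a measurement. The comparison in the note below assumes the driver's default is 10 rows per fetch, which is the value commonly documented for Oracle's driver.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class FetchSizeSketch {

    // Round trips needed to pull 'rows' rows when each trip carries 'fetchSize' rows.
    static long roundTrips(long rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Streams the first column of 'sql' to 'out', one value per line, and returns the row count.
    // maxRows = 0 means "no limit", as defined by Statement.setMaxRows.
    static long dump(Connection con, String sql, int fetchSize, int maxRows, Writer out)
            throws SQLException, IOException {
        try (Statement stmt = con.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(fetchSize); // hint: rows per round trip for result sets of this statement
            stmt.setMaxRows(maxRows);     // hard cap on the rows the result set may contain
            long rows = 0;
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {       // the driver fetches the next batch on its own
                    String value = rs.getString(1);
                    out.write(value == null ? "" : value); // SQL NULL becomes an empty line
                    out.write('\n');
                    rows++;
                }
            }
            return rows;
        }
    }
}
```

For 20 million rows, roundTrips gives 2,000,000 trips at a fetch size of 10 and 10,000 trips at a fetch size of 2000, which is where the time saving comes from.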

Two articles that analyze this reasonably well:
JDBC read data optimization setFetchSize
JDBC read data optimization-fetch size

In addition, creating the statement like this:

stmt = destCon.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);

may report an error.

In that case, change
ResultSet.TYPE_SCROLL_SENSITIVE
to
ResultSet.TYPE_SCROLL_INSENSITIVE.

My code is below:


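// The methods below live inside a class; the class declaration and the EtlRuler helper are not shown.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Date;

public static void main(String[] args) throws SQLException, ClassNotFoundException, IOException {
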
        String selsql;
        Connection destCon = null;
        Statement stmt = null;
        BufferedWriter output = null;
        long rowCount = 0L;
        int colCounts = 0;
        ResultSet res = null;

        long flen = 0L;
        selsql = "select RANDOM_STRING from myTestTable";
        destCon = getConnection();
        int fileCount = 1;
        EtlRuler etlRuler = new EtlRuler();
        etlRuler.setLocal_path("E:\\web_project\\");
        etlRuler.setFile_name("test010.txt");
        String filePath = etlRuler.getLocal_path() + etlRuler.getFile_name().replace("${NUM}", "00" + fileCount);
        etlRuler.getDataPath().add(filePath);


        File file = new File(etlRuler.getDataPath().get(fileCount - 1));
        if (!file.exists()) {
            file.createNewFile();
        }
        StringBuilder line = new StringBuilder();
        output = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file, false), StandardCharsets.UTF_8));

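        // res.previous()/beforeFirst() further down needs a scrollable ResultSet.
        // TYPE_SCROLL_INSENSITIVE is used because TYPE_SCROLL_SENSITIVE may report the error described above.
        stmt = destCon.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
        // Fetch-size hint for the result sets produced by this statement, set before the query runs
        stmt.setFetchSize(2000);
        res = stmt.executeQuery(selsql);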



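        // Peek at the first row to detect an empty result, then rewind so the loop below starts at row 1
        // (rewinding requires a scrollable ResultSet). executeQuery never returns null, so no null check is needed.
        if (!res.next()) {
            System.out.println("The data-file SQL returned no data!");
            // throw new Exception("The data-file SQL returned no data!");
        }
        res.beforeFirst();

        // Get the column metadata
        ResultSetMetaData rsmd1 = res.getMetaData();
        colCounts = rsmd1.getColumnCount();
        String str = "";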
        while (res.next()) {
            //System.out.println("Start reading data" + rowCount++);
            // Print progress
            rowCount++;
            if (rowCount % 2000 == 0) {
                Date date = new Date();
                // Write out every 200,000 rows, then clear the StringBuilder and start adding data again
                System.out.println("Execution time: " + date);
                System.out.println(rowCount + " ----rows processed");
//                output.write(line.toString());
//                output.flush();
//                line.delete(0, line.length());
            }

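            // Concatenate every column of the row; a plain "=" here would keep only the last column
            for (int i = 1; i <= colCounts; i++) {
                //line.append(res.getString(i)).append("\n");
                str += res.getString(i) + "\n";
            }
            output.write(str);
            // Flushed per row so that file.length() below reflects what has been written
            output.flush();
            str = "";

            //System.out.println("Start writing data");
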

            // Start a new file once the current one exceeds 500 KB (only when the name has a ${NUM} placeholder)
            if (file.length() > (1024 * 500)) {
                if (etlRuler.getFile_name().contains("${NUM}")) {
                    //output.write(line.toString());
                    //output.flush();
                    //line.delete(0,line.length());
                    fileCount++;
                    System.out.println("Creating a new file");
                    String newfilePath = etlRuler.getLocal_path() + etlRuler.getFile_name().replace("${NUM}", "00" + fileCount);
                    etlRuler.getDataPath().add(newfilePath);
                    file = new File(etlRuler.getDataPath().get(fileCount - 1));
                    if (!file.exists()) {
                        file.createNewFile();
                    }
                    // Close the previous writer before switching files so its file handle is not leaked
                    output.close();
                    output = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file, false),
                            StandardCharsets.UTF_8));
                }
            }
        }

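        //System.out.println("Start writing data");
        output.write(line.toString());
        output.flush();
        output.close();
        // Release the JDBC resources as well
        res.close();
        stmt.close();
        destCon.close();
        flen = file.length();
        System.out.println("File size: " + flen);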
    }

// ip, port, sid, user and password are fields of the enclosing class (their declarations are not shown)
public static Connection getConnection() throws ClassNotFoundException, SQLException {
        Class.forName("oracle.jdbc.driver.OracleDriver");
        // Build the database connection string and connect
        Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@" + ip + ":" + port + ":" + sid, user,
                password);
        return con;
    }
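The file rollover in the code above asks the file system for file.length() on every row and therefore has to flush on every row too. One possible alternative, sketched here under my own assumptions, is to count the bytes as they are written and roll over when the count passes the limit. RollingWriter and its methods are hypothetical names, not part of the project code; the "${NUM}" placeholder and the "00" + fileCount numbering follow the code above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class RollingWriter implements AutoCloseable {
    private final Path dir;
    private final String pattern; // e.g. "test${NUM}.txt"
    private final long maxBytes;
    private BufferedWriter out;
    private long written;         // bytes written to the current file
    private int fileCount;

    RollingWriter(Path dir, String pattern, long maxBytes) throws IOException {
        this.dir = dir;
        this.pattern = pattern;
        this.maxBytes = maxBytes;
        openNext();
    }

    private void openNext() throws IOException {
        fileCount++;
        String name = pattern.replace("${NUM}", "00" + fileCount);
        out = Files.newBufferedWriter(dir.resolve(name), StandardCharsets.UTF_8);
        written = 0;
    }

    void writeLine(String line) throws IOException {
        // Roll over lazily, just before the next write, so no empty trailing file is created.
        // Without a ${NUM} placeholder there is only one file name, so never roll.
        if (written > maxBytes && pattern.contains("${NUM}")) {
            out.close();
            openNext();
        }
        out.write(line);
        out.write('\n');
        written += line.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    int fileCount() {
        return fileCount;
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

With this shape the BufferedWriter can do its job and flush only when its buffer fills or the file is closed. The trade-off is one extra byte-array allocation per line for the size count.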

Origin blog.csdn.net/zhuyin6553/article/details/108593447