Inserting Large Amounts of Data into MySQL (Java Version)

1. Background:

        Recently I have been studying how MySQL performs when executing various kinds of SQL against large data volumes: execution speed, optimization methods, and the results after optimization. For that, a table holding a very large number of rows is a prerequisite.

2. Solution

        The first approach that came to mind was multi-threading plus batch insertion. I then read a number of articles on the subject. The general idea is the same everywhere, but the implementations differ slightly: some use the native API, some use MyBatis, and some use stored procedures.

        Aiming for simplicity, my solution is as follows:

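    public void add1000W() {
        try {
            // Record the start time (in seconds)
            long start = DateUtil.currentSeconds();
            // Number of worker threads
            int numThreads = 10;
            // Number of records each thread inserts per batch
            int batchSize = 1000;
            // Total number of records to insert
            int totalRecords = 10000000;
            // Number of records each thread is responsible for
            int count = totalRecords / numThreads;
            // Array that holds the worker threads
            Thread[] threads = new Thread[numThreads];
            // Create and start the threads; every thread starts at index 0,
            // which is safe here because each row gets a random UUID as its id
            for (int i = 0; i < numThreads; i++) {
                threads[i] = new InsertThread(testService, batchSize, 0, count);
                threads[i].start();
            }
            // Wait for all threads to finish
            for (int i = 0; i < numThreads; i++) {
                threads[i].join();
            }
            long end = DateUtil.currentSeconds();
            System.out.println(totalRecords + " records inserted, elapsed ===> " + (end - start) + " seconds");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }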
    
    static class InsertThread extends Thread {
        // Private fields that hold the parameters of this worker
        private TestService service;
        private int batchSize;
        private int startIndex;
        private int count;

        public InsertThread() {
        }

        // Parameterized constructor used to assign the fields
        public InsertThread(TestService testService, int batchSize, int startIndex, int count) {
            this.batchSize = batchSize;
            this.startIndex = startIndex;
            this.count = count;
            this.service = testService;
        }

        @Override
        public void run() {
            try {
                List<TestModel> list = new ArrayList<>(batchSize);
                for (int i = startIndex; i < startIndex + count; i++) {
                    // Create a new object for every row; adding one shared instance
                    // would fill the batch with references to the same row
                    TestModel testModel = new TestModel();
                    testModel.setId(IdUtil.fastSimpleUUID());
                    testModel.setSex(i % 2 == 0 ? 0 : 1);
                    testModel.setName(i % 2 == 0 ? RandomInfo.getRandomBoyName() : RandomInfo.getRandomGirlName());
                    testModel.setAge(RandomUtil.randomNumber(100));
                    testModel.setAddress(RandomInfo.getRandomAddress());
                    list.add(testModel);
                    // Flush once the batch is full
                    if (list.size() >= batchSize) {
                        service.saveBatch(list);
                        System.out.println(list.size() + " records inserted");
                        list.clear();
                    }
                }
                // Insert whatever is left over after the last full batch
                if (!list.isEmpty()) {
                    service.saveBatch(list);
                    list.clear();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
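The same idea (split the total across threads, flush a batch whenever it is full) can also be sketched with a fixed thread pool instead of hand-managed `Thread` objects. This is only a minimal sketch under my own assumptions: `PartitionedInsertSketch`, `partition`, `insertRange` and `run` are hypothetical names that do not appear in the code above, the "rows" are plain integers, and the `sink` stands in for a call such as `service.saveBatch(list)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

class PartitionedInsertSketch {

    /** Splits [0, total) into `parts` contiguous ranges, returned as {start, count} pairs. */
    static List<int[]> partition(int total, int parts) {
        List<int[]> ranges = new ArrayList<>();
        int base = total / parts;
        int remainder = total % parts;   // the first `remainder` ranges get one extra row
        int start = 0;
        for (int p = 0; p < parts; p++) {
            int count = base + (p < remainder ? 1 : 0);
            ranges.add(new int[] {start, count});
            start += count;
        }
        return ranges;
    }

    /** Generates the rows of one range and hands them to the sink in batches. */
    static void insertRange(int start, int count, int batchSize, Consumer<List<Integer>> sink) {
        List<Integer> batch = new ArrayList<>(batchSize);
        for (int i = start; i < start + count; i++) {
            batch.add(i);                          // placeholder for building a real row
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch));
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(new ArrayList<>(batch));   // leftover rows after the last full batch
        }
    }

    /** Runs every range on a fixed pool and returns the number of rows handed to the sink. */
    static long run(int total, int threads, int batchSize) {
        AtomicLong inserted = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int[] r : partition(total, threads)) {
            pool.submit(() -> insertRange(r[0], r[1], batchSize, b -> inserted.addAndGet(b.size())));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return inserted.get();
    }
}
```

Calling `PartitionedInsertSketch.run(10_000_000, 10, 1000)` would mirror the thread count and batch size used above. Handling the remainder in `partition` means totals that do not divide evenly by the thread count are still fully covered.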

3. Optimization plan

        1. Environment

                The cloud server I used during the test has a maximum bandwidth of 1Mbit/s.

        2. Test results

                The insertion took about 12 hours to complete. Since this was for personal use and there was no deadline, I did not investigate the timing in detail.
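To put these numbers in perspective, here is a small back-of-the-envelope calculation. It is only a rough sketch: the 100 bytes per row is my own assumption (the real row size was not measured), the rows are assumed to travel over the 1 Mbit/s link, and `ThroughputEstimate` is a hypothetical helper class.

```java
class ThroughputEstimate {

    /** Average rows per second for a run of `rows` rows that lasted `hours` hours. */
    static double rowsPerSecond(long rows, double hours) {
        return rows / (hours * 3600.0);
    }

    /** Minimum seconds needed to push rows * bytesPerRow bytes through a link of `mbitPerSec`. */
    static double minTransferSeconds(long rows, int bytesPerRow, double mbitPerSec) {
        double bytesPerSecond = mbitPerSec * 1_000_000 / 8.0;   // 1 Mbit/s = 125,000 bytes/s
        return rows * (double) bytesPerRow / bytesPerSecond;
    }
}
```

With 10,000,000 rows in 12 hours, `rowsPerSecond` comes out at roughly 231 rows per second. Under the assumed 100 bytes per row, `minTransferSeconds(10_000_000L, 100, 1.0)` gives 8000 seconds, or about 2.2 hours, so under that assumption the bandwidth alone would explain only part of the total time.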

        3. Personal analysis of factors affecting performance

  1. Frequent object creation
  2. Number of threads
  3. Amount of data inserted
  4. Parameter assignment: to stay close to a real production environment, the field values are generated on the fly by utility classes, which is also the main factor affecting efficiency.
  5. Machine configuration
  6. Network speed
  7. Framework overhead
  8. Database configuration
  9. Many other factors as well; those with the expertise are welcome to dig deeper.

        4. Solutions

  1. Reduce object creation: skip model objects altogether and build the SQL by string concatenation.
  2. How many threads to use depends on the performance of the machine. If the hardware cannot keep up, adding more threads brings little benefit.
  3. Reduce the amount of data inserted where reasonable.
  4. Do not use cumbersome utility classes to generate field values; keep everything as simple as possible.
  5. Upgrade the machine configuration.
  6. Increase the bandwidth.
  7. Avoid frameworks where possible. As a rule of thumb, the closer the technology is to the underlying layer, the less overhead it adds.
  8. Adjust the settings in the database configuration file where it makes sense.
  9. More solutions from readers are welcome.
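Points 1 and 7 (spliced SQL, no framework) could look roughly like the following with plain JDBC. This is a hedged sketch, not the code I ran: the table name `test_model` is hypothetical, the columns are guessed from the `TestModel` setters used earlier, and `JdbcBatchInsertSketch` is an illustrative class name.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchInsertSketch {

    // Hypothetical table name; the columns mirror the TestModel setters (assumption).
    static final String INSERT_SQL =
            "INSERT INTO test_model (id, sex, name, age, address) VALUES (?, ?, ?, ?, ?)";

    /** Builds one multi-row INSERT with `rows` placeholder groups (the "spliced SQL" idea). */
    static String multiRowInsert(int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be >= 1");
        }
        StringBuilder sql = new StringBuilder(
                "INSERT INTO test_model (id, sex, name, age, address) VALUES ");
        for (int i = 0; i < rows; i++) {
            sql.append(i == 0 ? "" : ", ").append("(?, ?, ?, ?, ?)");
        }
        return sql.toString();
    }

    /** Inserts the rows (each an Object[5]) via addBatch/executeBatch with one commit per batch. */
    static void insertBatch(Connection conn, List<Object[]> rows) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Object[] row : rows) {
                for (int c = 0; c < row.length; c++) {
                    ps.setObject(c + 1, row[c]);   // JDBC parameters are 1-based
                }
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

With MySQL Connector/J, adding `rewriteBatchedStatements=true` to the JDBC URL lets the driver rewrite such a batch into multi-row inserts on its own, so building the multi-row statement by hand with `multiRowInsert` is only one of two options. How much either variant helps would have to be measured in the same environment.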

Origin blog.csdn.net/zjb1697922408/article/details/131823723