Synchronizing data with multiple threads

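I recently had to synchronize a batch of a little over three million rows. Ordinary batch inserts were too slow, and the job had to update and insert at the same time. The usual approach, looping over the rows to find matches and then updating and inserting separately, is far too inefficient. After some pointers from a more experienced colleague and some searching online, I finally got the sync working with the merge into syntax. It is still slow, but I did learn something new. I am writing down the approach and part of the code here, mainly for my own reference, though I hope it helps others too: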

Total number of rows: totalCount
Rows per page: numPage
Total pages: totalPage ------- (int)Math.ceil(totalCount*1.0/numPage);
Pages already written to the database: beenHandledPage ---- initialized to 0
Generate a serial number: String uuid = UUID.randomUUID().toString();
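
The page count is easy to get wrong when totalCount and numPage are both int, because an integer division truncates before Math.ceil ever sees the fraction. A minimal sketch of the difference, using a hypothetical helper class PageMath that is not part of the original code:

```java
// Hypothetical helper for illustration; not part of the original code base.
class PageMath {
    // Ceiling division: convert to double before dividing so the fraction survives.
    static int totalPages(int totalCount, int numPage) {
        return (int) Math.ceil(totalCount * 1.0 / numPage);
    }

    // Dividing the two ints first drops the fraction, so Math.ceil has nothing to round up.
    static int totalPagesTruncated(int totalCount, int numPage) {
        return (int) Math.ceil(totalCount / numPage * 1.0);
    }
}
```

With a trailing partial page, the second form silently drops that page, so its rows are never synchronized.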
1. Set the total number of threads, threadNum ---- 5

2. Set the number of pages each thread handles
int pageNumOfPerThread = (int)Math.ceil((totalPage-beenHandledPage)/(threadNum*1.0));

3. Loop over the threads (i runs from 0 to threadNum-1)
Start page: int startPage = beenHandledPage + i*pageNumOfPerThread +1;

       End page: int endPage = beenHandledPage + (i+1)*pageNumOfPerThread;

       Create the worker thread: MarketingAssetsTransitionThread thread = new MarketingAssetsTransitionThread(startPage,
       endPage<totalPage?endPage:totalPage,numPage,
                                                           maxTs,totalCount,uuid,jdbcTemplate,
                                                           marketingAssetsMapper, amDataSyncLogService);
       (Note: jdbcTemplate, marketingAssetsMapper and amDataSyncLogService are the interfaces the business logic needs. jdbcTemplate is set up through a configuration file to connect to the remote database and run SQL statements.

maxTs is the largest timestamp among all rows in the local test database.)

       Start the thread: thread.start();
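
The page split in steps 2 and 3 can be sketched on its own, without the database dependencies. The class PageRanges and its method are hypothetical names for illustration. The sketch also skips threads whose start page falls past the last page, which can happen because pageNumOfPerThread is rounded up:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the page split; names are not from the original code.
class PageRanges {
    // Returns {startPage, endPage} pairs (1-based, inclusive), one per thread that has work.
    static List<int[]> split(int totalPage, int beenHandledPage, int threadNum) {
        int pageNumOfPerThread =
                (int) Math.ceil((totalPage - beenHandledPage) / (threadNum * 1.0));
        List<int[]> ranges = new ArrayList<>();
        for (int i = 0; i < threadNum; i++) {
            int startPage = beenHandledPage + i * pageNumOfPerThread + 1;
            // Clamp the end page so the last range never runs past totalPage.
            int endPage = Math.min(beenHandledPage + (i + 1) * pageNumOfPerThread, totalPage);
            if (startPage > endPage) {
                break; // rounding up can leave the trailing threads with nothing to do
            }
            ranges.add(new int[] {startPage, endPage});
        }
        return ranges;
    }
}
```

For example, 6 pages across 5 threads gives 2 pages per thread, so only three ranges carry work and the remaining two threads need not be started.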

4. The worker thread class itself
public class MarketingAssetsTransitionThread extends Thread{

   // Constructor parameters
   private int startPage;
   private int endPage;
   private int numPage;
   private String maxTs;
   private Integer count;
   private String uuid;
   private JdbcTemplate jdbcTemplate;
   private MarketingAssetsMapper marketingAssetsMapper;
   private AmDataSyncLogService amDataSyncLogService;

   /**
   * Creates the thread and injects the beans
   */
   public MarketingAssetsTransitionThread(int startPage, int endPage, int numPage, String maxTs,
   Integer count, String uuid, JdbcTemplate jdbcTemplate,
   MarketingAssetsMapper marketingAssetsMapper,
                                       AmDataSyncLogService amDataSyncLogService) {
       super();
       this.startPage = startPage;
       this.endPage = endPage;
       this.numPage = numPage;
       this.maxTs = maxTs;
       this.count = count;
       this.uuid = uuid;
       this.jdbcTemplate = jdbcTemplate;
       this.marketingAssetsMapper = marketingAssetsMapper;
       this.amDataSyncLogService = amDataSyncLogService;
   }

   public void run(){

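   // Business logic
   /**
   * Compare the rows in the remote database against the largest timestamp in the local database
   * and fetch the resulting set of rows. Match each of those rows against the local database by
   * its unique identifier: if a local row with the same identifier exists, update it; otherwise
   * insert the row into the local database.
   */

   /**
   * The combined update and insert described above uses the merge keyword (merge into syntax),
   * introduced in Oracle 9i. merge matches the source table against the target table, updates
   * on a match, and inserts when there is no match.
   */
   }
}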
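
The body of run() is not shown above, so the following is only a sketch of one possible shape for it: walk the pages from startPage to endPage, fetch each page, and hand the rows to the merge. PageWorker, fetchPage and mergeBatch are hypothetical names. fetchPage stands in for the jdbcTemplate query against the remote database and mergeBatch for the mapper call, whose real signatures the text does not give.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

// Hypothetical worker; fetchPage and mergeBatch are placeholders for the real
// jdbcTemplate query and mapper call, which are not shown in the original.
class PageWorker<T> implements Runnable {
    private final int startPage;
    private final int endPage;
    private final IntFunction<List<T>> fetchPage;
    private final Consumer<List<T>> mergeBatch;

    PageWorker(int startPage, int endPage,
               IntFunction<List<T>> fetchPage, Consumer<List<T>> mergeBatch) {
        this.startPage = startPage;
        this.endPage = endPage;
        this.fetchPage = fetchPage;
        this.mergeBatch = mergeBatch;
    }

    @Override
    public void run() {
        // Pages are 1-based and the range is inclusive on both ends.
        for (int page = startPage; page <= endPage; page++) {
            List<T> rows = fetchPage.apply(page);
            if (!rows.isEmpty()) {
                mergeBatch.accept(rows); // one merge statement per non-empty page
            }
        }
    }
}
```

Passing the two operations in as functions keeps the loop testable without a database; in the real class they would be replaced by the injected jdbcTemplate and marketingAssetsMapper.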

5. Part of the mapper file:
<insert id="mergeMarketingAssetsList" parameterType="java.util.List">
       MERGE INTO MARKETING_ASSETS M
       USING
       (
       <foreach collection="list" item="item" index="index" separator="UNION">
       SELECT
               #{item.areaname ,jdbcType=VARCHAR} AREANAME,
               #{item.areaid ,jdbcType=VARCHAR} AREAID,
               #{item.yname ,jdbcType=VARCHAR} YNAME,
               #{item.bldname ,jdbcType=VARCHAR} BLDNAME,
               #{item.roomid ,jdbcType=VARCHAR} ROOMID,
               #{item.bldid ,jdbcType=VARCHAR} BLDID,
               #{item.unit ,jdbcType=VARCHAR} UNIT,
               #{item.stair ,jdbcType=VARCHAR} STAIR,
               #{item.jgstair ,jdbcType=VARCHAR} JGSTAIR,
               #{item.roomno ,jdbcType=VARCHAR} ROOMNO
       FROM DUAL
       </foreach>
       ) T
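       <!-- Note: T must also expose an IS_DEL column for this condition to resolve; the SELECT list in this excerpt does not include it -->
       ON (M.ROOMID = T.ROOMID and M.IS_DEL = T.IS_DEL)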
       WHEN MATCHED THEN
       UPDATE SET
           M.AREANAME=T.AREANAME,
           M.AREAID=T.AREAID,
           M.YNAME=T.YNAME,
           M.BLDNAME=T.BLDNAME,
           M.BLDID=T.BLDID,
           M.UNIT=T.UNIT,
           M.STAIR=T.STAIR,
           M.JGSTAIR=T.JGSTAIR,
           M.ROOMNO=T.ROOMNO
       WHEN NOT MATCHED THEN
       INSERT (
           AREANAME,
           AREAID,
           YNAME,
           BLDNAME,
           ROOMID,
           BLDID,
           UNIT,
           STAIR,
           JGSTAIR,
           ROOMNO
           ) VALUES (
           T.AREANAME,
           T.AREAID,
           T.YNAME,
           T.BLDNAME,
           T.ROOMID,
           T.BLDID,
           T.UNIT,
           T.STAIR,
           T.JGSTAIR,
           T.ROOMNO
       )
   </insert>
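
The foreach above emits one SELECT ... FROM DUAL with ten bind parameters for every list item, so a very large list produces a very large statement. One option, offered here as an assumption rather than something the original code does, is to cut each page into smaller batches before calling the mapper. Batches and the batch size are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; splitting into batches is a suggestion, not the original design.
class Batches {
    // Splits rows into consecutive batches of at most batchSize elements each.
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rows.size());
            // Copy the sublist so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch would then be passed to mergeMarketingAssetsList in turn. A workable batch size depends on the row width and the database, so it is best found by measurement.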

   ============================================
Some concrete merge into usage follows:

The basic syntax is

merge into target_table a
using source_table b
on(a.cond_col1=b.cond_col1 and a.cond_col2=b.cond_col2 ...)
when matched then update set a.update_col=b.col
when not matched then insert (col1,col2...) values (val1,val2...)
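
To make the matched / not matched split concrete, here is a small in-memory model of the basic statement. It is not Oracle code: MergeModel is a hypothetical class, the map key plays the role of the on column, and the map value plays the role of the updated column.

```java
import java.util.Map;

// In-memory model of the basic merge; illustrative only, not how Oracle executes it.
class MergeModel {
    // Applies source to target and returns {updatedCount, insertedCount}.
    static int[] merge(Map<Integer, String> target, Map<Integer, String> source) {
        int updated = 0;
        int inserted = 0;
        for (Map.Entry<Integer, String> row : source.entrySet()) {
            if (target.containsKey(row.getKey())) {
                target.put(row.getKey(), row.getValue()); // when matched then update
                updated++;
            } else {
                target.put(row.getKey(), row.getValue()); // when not matched then insert
                inserted++;
            }
        }
        return new int[] {updated, inserted};
    }
}
```

Target rows with no counterpart in the source are left untouched, which is also how the real statement behaves.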

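Variant ①, update only:

merge into target_table a
using source_table b
on(a.cond_col1=b.cond_col1 and a.cond_col2=b.cond_col2 ...)
when matched then update set a.update_col=b.col,a.update_col2=b.col2...

Variant ②, insert only:

merge into target_table a
using source_table b
on(a.cond_col1=b.cond_col1 and a.cond_col2=b.cond_col2 ...)
when not matched then insert (col1,col2...) values (val1,val2...)

Note: the columns used in the on condition cannot be updated.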

In Oracle, merge was added in 9i and enhanced in 10g, as follows:

Test environment: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0

① Conditional operations:

merge into target_table a
using source_table b
on(a.cond_col1=b.cond_col1 and a.cond_col2=b.cond_col2 ...)
when matched then update set a.update_col=b.col where condition
when not matched then insert (col1,col2...) values (val1,val2...) where condition

Example (the where clause on the insert may only reference columns of the source table):

merge into test_merge a
using test b
on(a.no=b.no)
when matched then update set a.no2=b.no2 where a.no<>1
when not matched then insert values(b.no,b.no2) where b.no<>100

The variant forms ① and ② are of course supported as well.

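② Delete operation

merge into target_table a
using source_table b
on(a.cond_col1=b.cond_col1 and a.cond_col2=b.cond_col2 ...)
when matched then update set a.update_col=b.col
delete where b.col=xxx

Example:

merge into test_merge a
using test b
on(a.no=b.no)
when matched then update set a.no2=b.no2 where a.no<>1
delete
where b.no=14

Note: the delete also acts on the target table. It must come directly after the update set clause (and its optional where) inside when matched, and it only removes rows that the same statement has just updated.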
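
The interaction between the update condition and the delete condition in the example above can be traced with a small in-memory model. It is illustrative only: MergeDeleteModel is a hypothetical class, and each map takes no as the key and no2 as the value.

```java
import java.util.Map;

// In-memory model of the update-then-delete example; not Oracle code.
class MergeDeleteModel {
    // testMerge is the target (no -> no2); test is the source (no -> no2).
    static void merge(Map<Integer, Integer> testMerge, Map<Integer, Integer> test) {
        for (Map.Entry<Integer, Integer> b : test.entrySet()) {
            int no = b.getKey();
            if (!testMerge.containsKey(no)) {
                continue; // the example has no "when not matched" branch
            }
            if (no == 1) {
                continue; // update ... where a.no<>1: row 1 is neither updated nor deleted
            }
            testMerge.put(no, b.getValue()); // set a.no2=b.no2
            if (no == 14) {
                testMerge.remove(no); // delete where b.no=14, applied only to updated rows
            }
        }
    }
}
```

Row 14 is updated and then removed, row 1 is skipped by the update condition and therefore can never be deleted, and source rows with no match are ignored.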

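That covers essentially all of merge's usage. It is worth using routinely: it is generally more efficient than separate update + insert statements, especially when the on condition is backed by a unique index.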
