How to solve duplicate insert and update problems when multiple threads write to a database

Basic concepts

Idempotent: in programming, an idempotent operation is one whose effect is the same no matter how many times it is performed.
In simple terms: however many times an idempotent operation runs, its effect and the result it returns are the same.
Idempotency of common operations:
1. Query: querying once or several times gives the same result as long as the data has not changed. select is naturally idempotent.
2. Delete: delete is also idempotent; whether it runs once or many times, the data ends up deleted. (Note that the returned result may differ: if there is no data to delete, 0 is returned; if several rows are deleted, the number of deleted rows is returned.)
3. Insert: with a default (auto-generated) unique primary key, inserting the same data several times is not idempotent, since every insert adds a new row.
4. Update: there are two cases:

1. update t set money=100 where id=1
2. update t set money=money+100 where id=1

The first is idempotent: however many times it runs, money ends up as 100. The second is not idempotent: every execution adds another 100.
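
To make the difference concrete, here is a minimal sketch that replays each update several times, the way a retried request would. It is only an in-memory stand-in for the two statements above; the class and method names are made up for illustration and no database is involved.

```java
import java.util.function.IntUnaryOperator;

// In-memory stand-in for the two UPDATE statements; names are hypothetical.
class IdempotentUpdateDemo {
    // Mirrors: update t set money=100 where id=1
    static final IntUnaryOperator SET_TO_100 = money -> 100;
    // Mirrors: update t set money=money+100 where id=1
    static final IntUnaryOperator ADD_100 = money -> money + 100;

    // Applies the same update `times` times, as a retried request would.
    static int applyRepeatedly(IntUnaryOperator update, int money, int times) {
        for (int i = 0; i < times; i++) {
            money = update.applyAsInt(money);
        }
        return money;
    }

    public static void main(String[] args) {
        System.out.println(applyRepeatedly(SET_TO_100, 50, 1)); // 100
        System.out.println(applyRepeatedly(SET_TO_100, 50, 3)); // 100
        System.out.println(applyRepeatedly(ADD_100, 50, 1));    // 150
        System.out.println(applyRepeatedly(ADD_100, 50, 3));    // 350
    }
}
```

Running the assignment three times leaves the same value as running it once, while the increment drifts further with every repeat.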

Summary:
Idempotency has nothing to do with whether you use JavaEE, a distributed system, or high concurrency. What matters is whether your operation is idempotent.
To achieve idempotency, the simplest approach is to not design any non-idempotent operation into the interface in the first place. For example, suppose the requirement is: when the user clicks agree, the agree count of the answer is incremented by 1. Change it to: when the user clicks agree, make sure that a record for this user and this answer exists in the answer-agree table. The agree count is then obtained by counting the rows of the answer-agree table. Idempotency should be a primary consideration when designing a system, especially in systems where money is involved, such as Alipay, banks, and Internet finance companies. Such systems must be efficient and their data must be accurate, so problems like duplicate deductions or duplicate payouts must not occur; they are hard to fix and make for a poor user experience.
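
The agree example can be sketched as follows. This is only an in-memory illustration under assumed names (AgreeTableDemo, agree, agreeCount are hypothetical); in a real system the set of records would be a table with a unique key on the user and the answer.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the answer-agree table.
class AgreeTableDemo {
    // answerId -> ids of the users who agreed with that answer
    private final Map<Long, Set<Long>> agreesByAnswer = new ConcurrentHashMap<>();

    // Idempotent: makes sure the (user, answer) record exists.
    // Returns true only when the record was newly added.
    boolean agree(long userId, long answerId) {
        return agreesByAnswer
                .computeIfAbsent(answerId, id -> ConcurrentHashMap.newKeySet())
                .add(userId);
    }

    // The count is derived from the records instead of being incremented.
    int agreeCount(long answerId) {
        Set<Long> users = agreesByAnswer.get(answerId);
        return users == null ? 0 : users.size();
    }

    public static void main(String[] args) {
        AgreeTableDemo table = new AgreeTableDemo();
        table.agree(7L, 42L);
        table.agree(7L, 42L); // repeated click or retried request
        table.agree(8L, 42L);
        System.out.println(table.agreeCount(42L)); // 2
    }
}
```

Because the operation is "make sure the record exists" rather than "add 1", a repeated click cannot inflate the count.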

For more on this, see my other article:
Eight solutions to idempotency problems such as duplicate submission

Cause Analysis

The following pattern is very common:

if (user does not exist)
{
    xxxxx
    save the user to the database
}
else
{
    duplicate push, take no action
}

Before the first thread has finished this operation, a second thread arrives with the same data and also passes the if check, so the database ends up storing the same data twice. This is a case where concurrent threads make the check in the program logic ineffective.

Solution

In stand-alone (single-instance) mode you can simply use synchronization to avoid duplicate inserts. The problem discussed here is the one that arises in a distributed scenario.
This article mainly covers database-level solutions for multi-threaded access; the remaining solutions can be found in the article referenced above:

Eight solutions to idempotency problems such as duplicate submission

Solving multi-threaded inserts:

1. Add a WHERE condition to the insert (or use a unique index)
1.1. Inserting a single record

An ordinary INSERT INTO statement:
INSERT INTO card(cardno, cardnum) VALUES('1111', '100');

With an ordinary INSERT, the only way to guarantee that no duplicate records are inserted is to create a unique constraint on a column (for example, the card number cardno must not repeat);
if several columns together must not repeat, consider a composite unique index.
After the index has been created, handle the insert like this:

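if (cardno already exists in the table) {
    update();
} else {
    try {
        insert();
        // Violating the unique constraint throws an exception: InvocationTargetException
    } catch (InvocationTargetException e) {
        // A duplicate insert means the row already exists, so update it instead
        update();
    }
}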
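
The effect of the unique constraint can be illustrated without a database. The sketch below is an in-memory analogy, not the JDBC or MyBatis code itself: ConcurrentHashMap.putIfAbsent is atomic, so it plays the role that the unique index on cardno plays in the table. The class name and the counters are hypothetical and exist only to show what happened.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for the card table with a unique key on cardno.
class CardTableDemo {
    private final ConcurrentMap<String, String> cardnumByCardno = new ConcurrentHashMap<>();
    final AtomicInteger inserts = new AtomicInteger();
    final AtomicInteger updates = new AtomicInteger();

    // Try the insert first; if the key is already taken, fall back to an update.
    void insertOrUpdate(String cardno, String cardnum) {
        // putIfAbsent is atomic, so only one caller per key can "insert".
        if (cardnumByCardno.putIfAbsent(cardno, cardnum) == null) {
            inserts.incrementAndGet();
        } else {
            cardnumByCardno.put(cardno, cardnum);
            updates.incrementAndGet();
        }
    }

    int rowCount() {
        return cardnumByCardno.size();
    }

    // Runs `threads` concurrent insertOrUpdate calls for the same cardno.
    static CardTableDemo runConcurrently(int threads) {
        CardTableDemo table = new CardTableDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers together
                    table.insertOrUpdate("1111", "100");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table;
    }

    public static void main(String[] args) {
        CardTableDemo table = runConcurrently(8);
        System.out.println("rows=" + table.rowCount()
                + " inserts=" + table.inserts.get()
                + " updates=" + table.updates.get());
    }
}
```

With eight threads pushing the same card number, exactly one of them inserts and the other seven take the update path, so a single row remains.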

There is one more problem: if the table uses logical deletion (tombstone records), a unique index will cause bugs, because a logically deleted row still occupies its unique key.

Key point:
If no unique constraint has been created, can the same thing be achieved with the INSERT INTO statement alone?

The answer is yes. The INSERT INTO ... WHERE NOT EXISTS syntax is as follows:
INSERT INTO table(field1, field2, fieldn) SELECT 'field1', 'field2', 'fieldn' FROM DUAL WHERE NOT EXISTS(SELECT field FROM table WHERE field = ?)
Here DUAL is a dummy table that does not have to be physically created, so it can be used directly.

Rewriting the card example above gives:
INSERT INTO card(cardno, cardnum) SELECT '111', '100' FROM DUAL WHERE NOT EXISTS(SELECT cardno FROM card WHERE cardno = '111')

1.2. Inserting multiple records

INSERT INTO user (id, no, add_time, remark)
SELECT * FROM (
SELECT 1 id, 1 no, NOW() add_time, '1,2,3,1,2' remark FROM DUAL
UNION ALL
SELECT 2 id, 2 no, NOW() add_time, '1,2,3,1,2' remark FROM DUAL
UNION ALL
SELECT 3 id, 3 no, NOW() add_time, '1,2,3,1,2' remark FROM DUAL
) a WHERE NOT EXISTS (SELECT no FROM user b WHERE a.no = b.no)

The statement above inserts three records while making sure that the no column of the user table is not duplicated.
In addition, here is the MyBatis batch-insert statement that enforces the same rule on the no column.

INSERT INTO user (id, no,add_time,result)
select * from (
<foreach collection="list" item="obj" separator=" UNION ALL " >
SELECT #{obj.id} id, #{obj.no} no, #{obj.addTime} add_time,#{obj.result} result FROM DUAL
</foreach>
) a where not exists (select no from user b where a.no = b.no)

Solving multi-threaded updates:

1. Update using optimistic locking with a version column
Scenario:
For example, two users buy a product at the same time and only one is left in stock. At the database level the two purchases should be applied as two separate decrements of the stock, but under high concurrency the first user's purchase reads the current stock and decrements it, and because that operation has not fully completed, the second user can still read the old stock value, so the product is oversold!

select goods_num, version from goods where goods_name = "small notebook";
update goods set goods_num = goods_num - 1, version = (queried version + 1) where goods_name = "small notebook" and version = (queried version);

Why add a version column? Because it relies on a property of the database itself: when an UPDATE statement executes, the rows it touches are locked (if the UPDATE's condition does not use an index, the whole table is locked), which guarantees that only one thread at a time can perform the update; the next UPDATE runs only after this one releases the lock. If the version submitted with the data still matches the current version in the table, the row is updated and its version is incremented; otherwise the data is considered stale and the update affects no rows. This keeps the program safe.
With large data volumes and high concurrency, the efficiency of this approach depends on the capability of the database hardware; it is suitable for non-core business.
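
The read, compare, and update cycle can be sketched in memory as follows. This is an illustration under assumptions, not database code: the class OptimisticGoodsDemo and its methods are hypothetical, the synchronized methods stand in for the row lock the database takes during each statement, and the retry loop in buy is one possible way to handle an update that affected no rows.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for one row of the goods table.
class OptimisticGoodsDemo {
    private int goodsNum;
    private int version;

    OptimisticGoodsDemo(int goodsNum) {
        this.goodsNum = goodsNum;
    }

    // Mirrors: select goods_num, version from goods where ...
    synchronized int[] select() {
        return new int[] {goodsNum, version};
    }

    // Mirrors: update goods set goods_num = goods_num - 1, version = version + 1
    //          where ... and version = expectedVersion
    // Returns the number of affected rows (0 or 1).
    synchronized int decrementIfVersion(int expectedVersion) {
        if (version != expectedVersion) {
            return 0; // someone else updated the row first
        }
        goodsNum--;
        version++;
        return 1;
    }

    // Read, check the stock, then update; retry when the version has moved on.
    boolean buy() {
        while (true) {
            int[] row = select();
            if (row[0] <= 0) {
                return false; // sold out
            }
            if (decrementIfVersion(row[1]) == 1) {
                return true;
            }
        }
    }

    // Lets `buyers` threads compete for `stock` items; returns the number of successful purchases.
    static int simulate(int stock, int buyers) {
        OptimisticGoodsDemo goods = new OptimisticGoodsDemo(stock);
        AtomicInteger sold = new AtomicInteger();
        Thread[] threads = new Thread[buyers];
        for (int i = 0; i < buyers; i++) {
            threads[i] = new Thread(() -> {
                if (goods.buy()) {
                    sold.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sold.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(1, 2));  // 1: only one of the two buyers succeeds
        System.out.println(simulate(5, 20)); // 5: the stock is never oversold
    }
}
```

A purchase only succeeds if the row still carries the version that was read together with a positive stock, so the number of successful purchases can never exceed the initial stock.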
2. Use a select ... for update pessimistic lock
This has the same effect as locking with synchronized: check first, then insert or update. Deadlocks must be avoided and efficiency is poor, so it is not recommended in general; it is only worth considering when request concurrency is low.

Acquire the lock when reading the data: select * from table_xxx where id='xxx' for update;
Note: the id field must be the primary key or a unique index, otherwise the whole table gets locked, which can be fatal. Pessimistic locking is generally used together with a transaction, and the data may stay locked for a long time, so choose according to the actual situation.
With large data volumes and high concurrency, the efficiency of this approach depends on the capability of the database hardware; it is suitable for non-core business.

Reference articles:
https://www.cnblogs.com/ganhaiqiang-20130831/articles/4478472.html
https://www.cnblogs.com/lihuanliu/p/6764048.html

Idempotency solutions:
https://www.cnblogs.com/baizhanshi/p/10449306.html
https://www.cnblogs.com/aspirant/p/11628654.html


Origin blog.csdn.net/belongtocode/article/details/103587176