What data protection solutions are available in high-concurrency scenarios?

1. Demand Background

In real-world business development, we often need to guarantee data uniqueness. For example, in the SaaS system we are building, each tenant has corresponding configuration data, and a tenant may have only one configuration record. The naive approach checks this in two steps:

  1. First query whether configuration data already exists for the tenant.
  2. Insert it if not; otherwise do nothing.

This works fine under low concurrency, but under high concurrency two threads can both find that the data does not exist at the same moment and both execute the insert logic, which produces duplicate data.
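
A minimal sketch of this race-prone pattern (configMapper and its method names here are illustrative, not from the original code):

// Race-prone check-then-insert: two threads can both observe "not exists"
// and both fall through to the insert, creating duplicate rows.
Config existing = configMapper.queryByTenantId(config.getTenantId()); // step 1: check
if (existing == null) {
    configMapper.insert(config); // step 2: insert -- not atomic with step 1
}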

2. Solutions

2.1 Distributed lock

Use a Redis distributed lock to serialize all requests. There are several common implementations:

  • hand-written code based on the SETNX command
  • Redisson
  • lock4j

When using this approach, keep the lock granularity small: lock per tenant rather than globally, otherwise unrelated requests will queue behind one another.
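
A minimal sketch with Redisson (the lock key, the configMapper calls, and the timeouts are illustrative assumptions, not from the original article):

import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import java.util.concurrent.TimeUnit;

public void saveConfig(RedissonClient redisson, Config config) throws InterruptedException {
    // One lock per tenant keeps the granularity small; different tenants do not block each other.
    RLock lock = redisson.getLock("config:tenant:" + config.getTenantId());
    // Wait up to 3s to acquire; auto-release after 10s in case the holder crashes.
    if (lock.tryLock(3, 10, TimeUnit.SECONDS)) {
        try {
            if (configMapper.queryByTenantId(config.getTenantId()) == null) {
                configMapper.insert(config);
            }
        } finally {
            lock.unlock();
        }
    }
}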

2.2 Database implementation

2.2.1 Unique Index

Create a unique index on the table, for example uk_tenant_id on the tenant_id column. The database then guarantees that each tenant has only one record. This applies to businesses that do not require logical (soft) deletion.
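
For example (assuming the business table is named config; the name is illustrative):

-- A second insert for the same tenant now fails with a duplicate-key error (MySQL error 1062).
ALTER TABLE config ADD UNIQUE KEY uk_tenant_id (tenant_id);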

2.2.2 insert ignore write

Use insert ... ignore: when MySQL detects that the row would duplicate an existing key, it skips the insert; otherwise it inserts the row. This requires a unique index or PRIMARY KEY on the table, and like the previous approach it is unsuitable for soft-delete business.

PS: insert ... ignore may also cause deadlocks under concurrent load.
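
A sketch against the illustrative config table from the previous example (column names assumed):

-- If uk_tenant_id is hit, the row is silently skipped instead of raising error 1062.
INSERT IGNORE INTO config (tenant_id, config_key, config_value)
VALUES (1001, 'theme', 'dark');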

2.2.3 insert on duplicate key update write

Use insert ... on duplicate key update: if the row does not exist, it is inserted directly; if it already exists, it is updated instead. Again, a unique index or PRIMARY KEY is required on the table.

PS: insert ... on duplicate key update may also run into deadlock problems under high concurrency.
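
A sketch, again assuming the illustrative config table with the uk_tenant_id unique key:

-- Insert the row; if this tenant already has one, update it in place instead.
INSERT INTO config (tenant_id, config_key, config_value)
VALUES (1001, 'theme', 'dark')
ON DUPLICATE KEY UPDATE config_value = VALUES(config_value);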

2.2.4 Add an anti-duplication table (business-level)

CREATE TABLE `config_unique` (
  `id` bigint(20) NOT NULL COMMENT 'id',
  `config_name` varchar(130) DEFAULT NULL COMMENT 'config name',
  `config_key` varchar(255) NOT NULL COMMENT 'config key',
  `tenant_id` bigint(20) unsigned NOT NULL COMMENT 'tenant id',
  `user_name` varchar(30) NOT NULL COMMENT 'creating user name',
  `create_date` datetime(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) COMMENT 'creation time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uk_configName_configKey` (`config_name`,`config_key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='config anti-duplication table';

Note that the table has no soft-delete field such as deleted. Before adding configuration data, insert into the anti-duplication table first. If that insert succeeds, the configuration can be added normally; if it fails, duplicate data already exists. The insert into the anti-duplication table and the insert of the configuration must run in the same transaction, otherwise inconsistencies will occur.

In code:

try {
    // Insert into the anti-duplication table and the business table atomically.
    transactionTemplate.execute((status) -> {
        configUniqueMapper.insert(configUnique); // throws DuplicateKeyException on a duplicate
        configMapper.insert(config);
        return Boolean.TRUE;
    });
} catch (DuplicateKeyException e) {
    // The unique key was hit: the record already exists, so read it back instead.
    config = configMapper.query(config);
}

If a DuplicateKeyException is caught, the data has already been inserted by another request, so we simply query the existing record.

3. Summary

Choose the anti-duplication solution that fits your business scenario. There is no best solution, only a suitable one.
