A First Look at CrateDB (4): Optimistic Concurrency Control

Contents

System columns

 _seq_no

_primary_term

Example

Optimistic update/delete


Other articles in this series:

A First Look at CrateDB (1): Deploying a CrateDB Cluster with Docker

A First Look at CrateDB (2): PARTITION, SHARDING AND REPLICATION

A First Look at CrateDB (3): JDBC


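In optimistic concurrency control, data is not locked when a user reads it. When a user updates data, the system checks whether another user has changed that data since it was read. If so, an error is raised. Typically, the user who receives the error rolls back the transaction and starts over. The approach is called optimistic because it is mainly used where data contention is low and the cost of an occasional rollback is lower than the cost of locking data on every read.

Although CrateDB does not support transactions, optimistic concurrency control can still be implemented with the two system columns _seq_no and _primary_term. This article gives a brief introduction to optimistic concurrency control in CrateDB; see the official documentation for details.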

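System columns

_seq_no and _primary_term are two of a table's system columns. They reflect the order of data operations and changes in the cluster configuration, and a SELECT shows how they change after each data operation.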
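A minimal sketch of such a query is shown below. It assumes the parted_table table used later in this article and that it has id and width columns (both appear in the UPDATE example further down). System columns are not part of SELECT *, so they have to be named explicitly.

```sql
-- System columns are not returned by SELECT *, so list them by name.
-- parted_table, id and width are taken from the examples in this article.
SELECT id, width, _seq_no, _primary_term
FROM parted_table
ORDER BY id;
```

Running the same query before and after an UPDATE on a row makes the change in _seq_no visible.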

 _seq_no

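After a row is inserted, updated, or deleted, the primary shard increments that row's _seq_no (which acts like a version number for the row). A similar system column is _version, but using _version for optimistic concurrency control is no longer recommended.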

The CrateDB primary shards will increment a sequence number for every insert, update and delete operation executed against a row. The current sequence number of a row is exposed under this column. This column can be used in conjunction with the _primary_term column for Optimistic Concurrency Control, see Optimistic Concurrency Control for usage details.

_primary_term

Reflects changes in the cluster configuration.

The sequence numbers give us an order of operations that happen at a primary shard, but they don't help us distinguish between old and new primaries. For example, if a primary is isolated in a minority partition, a possible up to date replica shard on the majority partition will be promoted to be the new primary shard and continue to process write operations, subject to the write.wait_for_active_shards setting. When this partition heals we need a reliable way to know that the operations that come from the other shard are from an old primary and, equally, the operations that we send to the shard re-joining the cluster are from the newer primary. The cluster needs to have a consensus on which shards are the current serving primaries. In order to achieve this we use the primary terms which are generational counters that are incremented when a primary is promoted. Used in conjunction with _seq_no we can obtain a total order of operations across shards and Optimistic Concurrency Control.

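Example

Initial state

After updating the row with id=1, its _seq_no is incremented by 1 and _primary_term changes from 1 to 6 (my guess is that _primary_term changed because the cluster was restarted, for whatever reason, between the initial state and this update).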

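Optimistic update/delete

Before an update or delete, query the current _seq_no and _primary_term values. When executing the statement, check _seq_no and _primary_term, and skip the write if either has changed. This way, conflicting writes from the application do not cause lost updates.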

Querying for the correct _seq_no and _primary_term ensures that no concurrent update and cluster configuration change has taken place

Concretely, specify _seq_no and _primary_term in the UPDATE or DELETE statement:

update parted_table set width = 200 where id = 1 and _seq_no = 6 and _primary_term = 6;

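However, an error is raised if the WHERE clause does not include the primary key (the table tested here has no primary key defined, so the statement above also fails).

For the experiment, create a new table (with a composite primary key on id and day) and insert data: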

CREATE TABLE parted_table2 (
    id bigint PRIMARY KEY,
    title text,
    content text,
    width double precision,
    day timestamp with time zone PRIMARY KEY
) PARTITIONED BY (day);

INSERT INTO parted_table2 (SELECT * FROM parted_table);

The WHERE clause must include the (entire) primary key.
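For parted_table2, an optimistic update and delete might look like the sketch below. The day value and the _seq_no/_primary_term values are placeholders that I made up for illustration; in practice they come from a SELECT on the row just before the write.

```sql
-- Optimistic update: both primary key columns (id, day) plus both system columns.
-- '2020-03-18T00:00:00Z', 6 and 6 are hypothetical values for this row.
UPDATE parted_table2
SET width = 200
WHERE id = 1
  AND day = '2020-03-18T00:00:00Z'
  AND _seq_no = 6
  AND _primary_term = 6;

-- Alternatively, an optimistic delete uses the same kind of WHERE clause.
DELETE FROM parted_table2
WHERE id = 1
  AND day = '2020-03-18T00:00:00Z'
  AND _seq_no = 6
  AND _primary_term = 6;
```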

To summarize:

1. The (entire) primary key must be specified

The _seq_no and _primary_term columns can only be used when specifying the whole primary key in a query. 

2. _seq_no and _primary_term must be specified together

In order to use the optimistic concurrency control mechanism both the _seq_no and _primary_term columns need to be specified
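Putting both rules together, the read-then-write cycle can be sketched as follows. The literal values are again placeholders, and the retry is something the application has to do itself. As far as I understand, a mismatch does not raise an error in CrateDB; the statement simply reports 0 affected rows.

```sql
-- 1. Read the row together with its current _seq_no and _primary_term.
SELECT width, _seq_no, _primary_term
FROM parted_table2
WHERE id = 1 AND day = '2020-03-18T00:00:00Z';

-- 2. Write back, using the values returned by step 1 (6 and 6 are placeholders).
UPDATE parted_table2
SET width = 300
WHERE id = 1
  AND day = '2020-03-18T00:00:00Z'
  AND _seq_no = 6
  AND _primary_term = 6;

-- 3. If the UPDATE reports 0 affected rows, another writer changed the row first:
--    repeat step 1 to fetch fresh values, then retry step 2.
```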


Reposted from blog.csdn.net/gxf1027/article/details/104948354