ClickHouse data consistency


1 Prepare test tables and data


Consulting the ClickHouse manual shows that even the MergeTree engines with the best data-consistency support only guarantee eventual consistency.

When we use table engines such as ReplacingMergeTree or SummingMergeTree, the data can be temporarily inconsistent because background merges have not yet run.
  In scenarios that are highly sensitive to consistency, the following solutions are commonly used.

  1. Create the table
   CREATE TABLE test_a(
   	user_id UInt64,
   	score String,
   	deleted UInt8 DEFAULT 0,
   	create_time DateTime DEFAULT toDateTime(0)
   )ENGINE= ReplacingMergeTree(create_time)
   ORDER BY user_id;

where:

  • user_id is the identification of data deduplication update;
  • create_time is the version number field, and the row with the largest create_time in each set of data represents the latest data;
  • deleted is a custom flag bit, such as 0 means not deleted, 1 means deleted data.
  2. Write 10 million rows of test data
   INSERT INTO TABLE test_a(user_id,score)
   WITH(
   	SELECT ['A','B','C','D','E','F','G']
   )AS dict
   SELECT number AS user_id, dict[number%7+1] FROM numbers(10000000);
  3. Modify the first 500,000 rows of data, updating the score field and the create_time version field
   INSERT INTO TABLE test_a(user_id,score,create_time)
   WITH(
   SELECT ['AA','BB','CC','DD','EE','FF','GG']
   )AS dict
   SELECT number AS user_id, dict[number%7+1], now() AS create_time FROM 
   numbers(500000);

  4. Count the total rows
   SELECT COUNT() FROM test_a;
   10500000

Partition merging has not been triggered yet, so the data has not been deduplicated.

2 Manual OPTIMIZE (not recommended)

After writing the data, execute OPTIMIZE immediately to force a merge of the newly written partitions.

OPTIMIZE TABLE test_a FINAL;

Syntax: OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition | 
PARTITION ID 'partition_id'] [FINAL] [DEDUPLICATE [BY expression]]
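As a quick check, forcing the merge should collapse the duplicate versions back to the original row count. A sketch (the exact timing depends on when the merge actually completes):

```sql
OPTIMIZE TABLE test_a FINAL;

-- ReplacingMergeTree keeps, for each user_id, only the row with the
-- largest create_time, so the count should drop back to 10000000.
SELECT COUNT() FROM test_a;
```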

3 Deduplication through GROUP BY

  1. Execute a deduplicated query
   SELECT
   user_id ,
   argMax(score, create_time) AS score, 
   argMax(deleted, create_time) AS deleted,
   max(create_time) AS ctime 
   FROM test_a 
   GROUP BY user_id
   HAVING deleted = 0;

Function description:

argMax(field1, field2): returns the value of field1 from the row in which field2 takes its maximum value.

When we update data, a new row is written rather than modifying the old one. In the statement above, the latest score value is therefore obtained by selecting the row with the maximum create_time.
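A minimal self-contained sketch of argMax on two hypothetical rows (the values are chosen purely for illustration):

```sql
SELECT argMax(score, ts) AS latest_score
FROM
(
    SELECT 'A'  AS score, toDateTime('2023-01-01 00:00:00') AS ts
    UNION ALL
    SELECT 'AA' AS score, toDateTime('2023-01-02 00:00:00') AS ts
);
-- returns 'AA', the score of the row with the larger ts
```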

  2. Create a view for easy testing
   CREATE VIEW view_test_a AS
   SELECT
   user_id ,
   argMax(score, create_time) AS score, 
   argMax(deleted, create_time) AS deleted,
   max(create_time) AS ctime 
   FROM test_a 
   GROUP BY user_id
   HAVING deleted = 0;
  3. Insert duplicate data and query again
   -- insert another row
   INSERT INTO TABLE test_a(user_id,score,create_time)
   VALUES(0,'AAAA',now());
   -- query again
   SELECT *
   FROM view_test_a
   WHERE user_id = 0;
  4. Test deleting data
   -- insert another row, marked as deleted
   INSERT INTO TABLE test_a(user_id,score,deleted,create_time) 
   VALUES(0,'AAAA',1,now());
   
   -- query again: the row just inserted is no longer visible
   SELECT *
   FROM view_test_a
   WHERE user_id = 0;

This row of data is not actually deleted, only filtered out of the query results. In suitable scenarios, this approach can be combined with a table-level TTL to eventually delete the physical data.
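For example, a table-level TTL along these lines would let ClickHouse physically remove rows that have carried the deleted flag for long enough. The 10-day interval is an assumption for illustration, and the TTL ... DELETE WHERE form requires a reasonably recent ClickHouse version:

```sql
-- Sketch: physically drop rows marked deleted once they are
-- older than an assumed 10-day retention window.
ALTER TABLE test_a
    MODIFY TTL create_time + INTERVAL 10 DAY DELETE WHERE deleted = 1;
```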

4 Query via FINAL

Adding the FINAL modifier to a query causes the special merge logic (data deduplication, pre-aggregation, etc.) to be executed during the query.
  This method was rarely used in early versions, however, because with FINAL the query becomes a single-threaded execution process and is very slow.
  Since v20.5.2.7-stable, FINAL queries support multi-threaded execution, and the number of threads for a single query can be controlled by the max_final_threads setting. However, the reading of parts is still serial.
  The performance of a FINAL query depends on many factors: the size of the columns, the number of parts, and so on. It should therefore be chosen according to the actual scenario.

Reference link: https://github.com/ClickHouse/ClickHouse/pull/10463
Two versions of ClickHouse, 20.4.5.36 and 21.7.3.14, were installed for comparison.

4.1 Old version test

  1. Ordinary query
   select * from visits_v1 WHERE StartDate = '2014-03-17' limit 100;

  2. FINAL query
   select * from visits_v1 FINAL WHERE StartDate = '2014-03-17' limit 100;

The query that previously ran in parallel becomes single-threaded.

4.2 New version testing

  1. Ordinary query
   select * from visits_v1 WHERE StartDate = '2014-03-17' limit 100 settings max_threads = 2;

View the execution plan:

   explain pipeline select * from visits_v1 WHERE StartDate = '2014-03-17' limit 100 settings max_threads = 2;
   
   (Expression) 
   	ExpressionTransform × 2
   	(SettingQuotaAndLimits) 
   		(Limit) 
   		Limit 2 → 2
   			(ReadFromMergeTree) 
   			MergeTreeThread × 2 0 → 1

Clearly, the parts are read in parallel by 2 threads.

  2. FINAL query
   select * from visits_v1 final WHERE StartDate = '2014-03-17' limit 100 
   settings max_final_threads = 2;
  

The query is not as fast as the ordinary query, but it is improved compared to earlier versions. Check the execution plan of the FINAL query:

   explain pipeline select * from visits_v1 final WHERE StartDate = '2014-03-17' limit 100 settings max_final_threads = 2;
   
   (Expression) 
   ExpressionTransform × 2 
   (SettingQuotaAndLimits) 
   	(Limit) 
   	Limit 2 → 2 
   		(ReadFromMergeTree) 
   		ExpressionTransform × 2 
   			CollapsingSortedTransform × 2
   				Copy 1 → 2 
   					AddingSelector 
   						ExpressionTransform
   							MergeTree 0 → 1 

From the CollapsingSortedTransform step onward the pipeline is multi-threaded, but the action of reading parts is still serial.

Origin blog.csdn.net/ZGL_cyy/article/details/130302273