1 Prepare test tables and data
The ClickHouse manual shows that even the MergeTree variants with the best data-consistency support only guarantee eventual consistency:
When we use table engines such as ReplacingMergeTree or SummingMergeTree, there will be periods of temporary data inconsistency.
For scenarios that are highly sensitive to consistency, the following solutions are commonly used.
- Create the table
CREATE TABLE test_a(
    user_id UInt64,
    score String,
    deleted UInt8 DEFAULT 0,
    create_time DateTime DEFAULT toDateTime(0)
) ENGINE = ReplacingMergeTree(create_time)
ORDER BY user_id;
where:
- user_id is the key on which data is deduplicated and updated;
- create_time is the version-number field: within each group of rows, the row with the largest create_time represents the latest data;
- deleted is a custom flag bit, e.g. 0 means not deleted and 1 means deleted.
- Write 10 million rows of test data
INSERT INTO TABLE test_a(user_id, score)
WITH(
    SELECT ['A','B','C','D','E','F','G']
) AS dict
SELECT number AS user_id, dict[number % 7 + 1] FROM numbers(10000000);
- Modify the first 500,000 rows, changing both the score field and the create_time version-number field
INSERT INTO TABLE test_a(user_id, score, create_time)
WITH(
    SELECT ['AA','BB','CC','DD','EE','FF','GG']
) AS dict
SELECT number AS user_id, dict[number % 7 + 1], now() AS create_time
FROM numbers(500000);
- Count the total number of rows
SELECT COUNT() FROM test_a;
10500000
Partition merging has not yet been triggered, so deduplication has not yet occurred.
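The duplicates can be observed directly at this point. The following query is a sketch: it assumes background merges have not yet collapsed the parts, in which case an updated key still has two physical rows (after a merge it would return 1).

```sql
-- Count physical rows for one updated key; before a merge,
-- both the original row and the updated row are still present.
SELECT COUNT() FROM test_a WHERE user_id = 0;
```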
2 Manual OPTIMIZE (not recommended)
After writing the data, execute OPTIMIZE immediately to force a merge of the newly written parts.
OPTIMIZE TABLE test_a FINAL;
Syntax: OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition |
PARTITION ID 'partition_id'] [FINAL] [DEDUPLICATE [BY expression]]
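As a sketch of the partition-scoped form: a MergeTree table created without PARTITION BY keeps all its data in a single partition whose ID is 'all', so for test_a the merge can also be forced on that one partition (the partition ID here is an assumption based on that table layout):

```sql
-- Force a merge of the single implicit partition of an unpartitioned table.
OPTIMIZE TABLE test_a PARTITION ID 'all' FINAL;
```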
3 Deduplication via GROUP BY
- Execute a deduplicating query
SELECT
user_id ,
argMax(score, create_time) AS score,
argMax(deleted, create_time) AS deleted,
max(create_time) AS ctime
FROM test_a
GROUP BY user_id
HAVING deleted = 0;
Function description:
argMax(field1, field2): returns the field1 value from the row where field2 is at its maximum.
When we update data, a new row is written rather than the old one being changed. In the statement above, the modified score value is therefore obtained by taking the score of the row with the maximum create_time.
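A minimal self-contained sketch of argMax, using the values table function on inline data (available in recent ClickHouse versions) instead of test_a:

```sql
-- Returns the score of the row with the largest create_time, i.e. 'new'.
SELECT argMax(score, create_time) AS latest
FROM values('score String, create_time UInt32', ('old', 1), ('new', 2));
```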
- Create a view for easy testing
CREATE VIEW view_test_a AS
SELECT
user_id ,
argMax(score, create_time) AS score,
argMax(deleted, create_time) AS deleted,
max(create_time) AS ctime
FROM test_a
GROUP BY user_id
HAVING deleted = 0;
- Insert a duplicate row, then query again
-- Insert one more row
INSERT INTO TABLE test_a(user_id, score, create_time)
VALUES(0, 'AAAA', now());
-- Query again
SELECT *
FROM view_test_a
WHERE user_id = 0;
- Deletion test
-- Insert a row marked as deleted
INSERT INTO TABLE test_a(user_id, score, deleted, create_time)
VALUES(0, 'AAAA', 1, now());
-- Query again; the row inserted earlier is no longer visible
SELECT *
FROM view_test_a
WHERE user_id = 0;
This row of data is not actually deleted, only filtered out of the result. In suitable scenarios, a table-level TTL can additionally be used to eventually delete the physical data.
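For example, a row-level TTL that physically removes rows marked as deleted after a grace period might look like this. This is a sketch: the TTL ... DELETE WHERE form requires a reasonably recent ClickHouse version, and the one-day interval is an arbitrary choice.

```sql
-- Physically delete rows flagged deleted = 1 one day after their version time.
ALTER TABLE test_a
    MODIFY TTL create_time + INTERVAL 1 DAY DELETE WHERE deleted = 1;
```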
4 Query via FINAL
Add the FINAL modifier after the query statement, so that the special logic of Merge (such as data deduplication, pre-aggregation, etc.) will be executed during the query process.
In early versions this approach was rarely used, because adding FINAL turned the query into a single-threaded execution and made it very slow.
Since v20.5.2.7-stable, FINAL queries support multi-threaded execution, with the number of threads per query controlled by the max_final_threads setting. However, reading the parts is still serial.
The performance of a FINAL query depends on many factors: column sizes, the number of parts, and so on all affect the query time, so choose according to the actual scenario.
Reference link: https://github.com/ClickHouse/ClickHouse/pull/10463
Two versions of ClickHouse, 20.4.5.36 and 21.7.3.14, were installed for comparison.
4.1 Old version test
- Ordinary query
select * from visits_v1 WHERE StartDate = '2014-03-17' limit 100;
- FINAL query
select * from visits_v1 FINAL WHERE StartDate = '2014-03-17' limit 100;
The query that previously ran in parallel becomes single-threaded.
4.2 New version test
- Ordinary query
select * from visits_v1 WHERE StartDate = '2014-03-17' limit 100 settings max_threads = 2;
View the execution plan:
explain pipeline select * from visits_v1 WHERE StartDate = '2014-03-17' limit 100 settings max_threads = 2;
(Expression)
ExpressionTransform × 2
(SettingQuotaAndLimits)
(Limit)
Limit 2 → 2
(ReadFromMergeTree)
MergeTreeThread × 2 0 → 1
Clearly, with 2 threads the parts are read in parallel.
- FINAL query
select * from visits_v1 final WHERE StartDate = '2014-03-17' limit 100
settings max_final_threads = 2;
The query is not as fast as an ordinary query, but it is improved compared to before. Check the execution plan of the FINAL query:
explain pipeline select * from visits_v1 final WHERE StartDate = '2014-03-17' limit 100 settings max_final_threads = 2;
(Expression)
ExpressionTransform × 2
(SettingQuotaAndLimits)
(Limit)
Limit 2 → 2
(ReadFromMergeTree)
ExpressionTransform × 2
CollapsingSortedTransform × 2
Copy 1 → 2
AddingSelector
ExpressionTransform
MergeTree 0 → 1
From the CollapsingSortedTransform step onward, execution is multi-threaded, but reading the parts is still serial.