ClickHouse's blockbuster feature is here, with a 40x performance boost | Book giveaway at the end of the article

Everyone, the most exciting ClickHouse feature of this year has arrived: the long-awaited Projection feature. ClickHouse is already rich in features and powerful, but the community keeps proving that it can do even better :)

Have you ever run into either of these situations?

  • MergeTree supports only one sort order

When a table is created, ORDER BY determines both the sparse primary key index and the physical sort order of the data. Suppose:

ORDER BY A, B, C

Then a query that filters with WHERE A is usually fast, while one that filters with WHERE C is slower.
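To make the difference concrete, here is a minimal Python sketch. It is a toy model, not how ClickHouse is implemented: ClickHouse uses a sparse index over granules, while this sketch simply binary-searches a sorted list. The table contents and the helper names `where_a` and `where_c` are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Toy table sorted by (A, B, C), mimicking ORDER BY A, B, C.
rows = sorted((a, b, c) for a in range(100) for b in range(10) for c in range(10))
a_column = [r[0] for r in rows]  # sorted, because A leads the sort key


def where_a(value):
    """Filter on the leading sort column: binary search finds the range."""
    lo = bisect_left(a_column, value)
    hi = bisect_right(a_column, value)
    return rows[lo:hi], hi - lo  # (matching rows, rows touched)


def where_c(value):
    """Filter on the last sort column: values are scattered, so scan it all."""
    matched = [r for r in rows if r[2] == value]
    return matched, len(rows)  # (matching rows, rows touched)


hits_a, touched_a = where_a(42)
hits_c, touched_c = where_c(7)
print(f"WHERE A touched {touched_a} rows, WHERE C touched {touched_c} rows")
```

Filtering on A touches only the matching range, while filtering on C has to look at every row, which is the gap a projection with a different sort order is meant to close.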

  • Materialized views are not smart enough

For fixed query patterns, we often build many materialized views on top of a base table to further improve query performance, increase QPS, and reduce resource overhead.

Materialized views are effective, but not smart enough. A materialized view is essentially an independent table; data is written to it in real time by an insert trigger on the source table.

Because a materialized view is an independent table, keeping its data consistent with the source table is naturally a problem. If there are many materialized views, maintaining them is a problem as well.

The Projection feature addresses both of these problems. The concept comes from the paper "C-Store: A Column-oriented DBMS", whose authors include Mike Stonebraker, recipient of the 2014 Turing Award and father of Vertica.

A projection is a set of columns that can be stored in a sort order different from that of the original table, and it can also store data pre-aggregated with aggregate functions.

Amos Bird (Zheng Tianqi) from Kuaishou borrowed this idea, implemented the Projection feature in ClickHouse, and contributed it to the community.

ClickHouse Projection can be regarded as a smarter materialized view, with the following characteristics:

  • Part-level storage

    Unlike an ordinary materialized view, which is an independent table, a projection's materialized data is stored inside the part directories of the original table. Both normal projections over detailed data and pre-aggregated projections are supported.

  • Use freely, matched automatically

    Multiple projections can be created for one MergeTree table. When a SELECT statement runs, ClickHouse automatically picks the best matching projection for the query to accelerate it. If no projection matches, the base table is queried directly.

  • Same source, live and die together

    Because the materialized data is stored inside the parts of the original table, updates and merges all operate on the same source, so there is no inconsistency.

This may sound rather abstract, so let's go straight to some use cases. Here we use the official test data set hits_100m_obfuscated. The table has 100 million rows:

SELECT count(*)
FROM hits_100m_obfuscated

Query id: 813ba930-d299-47d8-9ac3-6d7dbde075b1

┌───count()─┐
│ 100000000 │
└───────────┘

1 rows in set. Elapsed: 0.004 sec.

The table's ORDER BY is:

ENGINE = MergeTree
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID), EventTime)

Without a projection, query by WatchID, which is not part of the primary key:

SELECT WatchID
FROM hits_100m_obfuscated
WHERE WatchID = 5814563137538961516

Query id: 20110b52-cac0-43b7-baf6-1931b94864a6

┌─────────────WatchID─┐
│ 5814563137538961516 │
└─────────────────────┘

1 rows in set. Elapsed: 0.262 sec. Processed 100.00 million rows, 800.00 MB (380.95 million rows/s., 3.05 GB/s.)

As a result, the whole table was scanned: 100 million rows and 800 MB of data.

Now create a projection to accelerate filtering on this specific WHERE column. It defines an additional sort order, different from the primary key, chosen to fit the query:

ALTER TABLE hits_100m_obfuscated ADD PROJECTION p1
(
    SELECT
        WatchID,
        Title
    ORDER BY WatchID
)

Note that only data written after the projection is created is materialized automatically.

For historical data, materialization has to be triggered manually. In this case we need to run:

alter table hits_100m_obfuscated MATERIALIZE PROJECTION p1

MATERIALIZE PROJECTION is an asynchronous mutation, and its status can be checked with the following statement:

SELECT
    table,
    mutation_id,
    command,
    is_done
FROM system.mutations AS m
WHERE is_done = 0

Query id: 7ddc855a-acb5-4ca9-8c48-ad4f5a7b234e

┌─table────────────────┬─mutation_id─────┬─command───────────────────┬─is_done─┐
│ hits_100m_obfuscated │ mutation_99.txt │ MATERIALIZE PROJECTION p1 │       0 │
└──────────────────────┴─────────────────┴───────────────────────────┴─────────┘

1 rows in set. Elapsed: 0.005 sec.

At this point, if you look in the table's data directory, you will see a temporary tmp part in which the projection data is being materialized.

After the p1 projection has been generated, look at the part directory again.

Under the original MergeTree part there is now an additional p1.proj subdirectory. Enter it and you will find the same storage format as an ordinary MergeTree part:

cd /data/default/hits_100m_obfuscated/201307_1_96_4_107/p1.proj
[root@ch9 p1.proj]# ll
total 5187772
-rw-r-----. 1 clickhouse clickhouse        278 Sep  8 23:43 checksums.txt
-rw-r-----. 1 clickhouse clickhouse         69 Sep  8 23:43 columns.txt
-rw-r-----. 1 clickhouse clickhouse          9 Sep  8 23:43 count.txt
-rw-r-----. 1 clickhouse clickhouse         10 Sep  8 23:43 default_compression_codec.txt
-rw-r-----. 1 clickhouse clickhouse      97672 Sep  8 23:43 primary.idx
-rw-r-----. 1 clickhouse clickhouse 4508224709 Sep  8 23:43 Title.bin
-rw-r-----. 1 clickhouse clickhouse     293016 Sep  8 23:43 Title.mrk2
-rw-r-----. 1 clickhouse clickhouse  803340103 Sep  8 23:43 WatchID.bin
-rw-r-----. 1 clickhouse clickhouse     293016 Sep  8 23:43 WatchID.mrk2

When a query hits a projection, the data in this subdirectory of the part is used directly to serve the query.

With the p1 projection in place, run the same query again. Remember to enable the feature first with this setting:

SET allow_experimental_projection_optimization = 1;

Execute query:

SELECT WatchID
FROM hits_100m_obfuscated
WHERE WatchID = 5814563137538961516

Query id: 38d2aa48-45da-4487-ab80-1cd02ee08ce2

┌─────────────WatchID─┐
│ 5814563137538961516 │
└─────────────────────┘

1 rows in set. Elapsed: 0.006 sec. Processed 8.19 thousand rows, 65.54 KB (1.41 million rows/s., 11.27 MB/s.)

The effect is striking: from an 800 MB full scan of 100 million rows down to a 65 KB scan of about 8 thousand rows, and the query is more than 40 times faster.
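The figures in the two query outputs can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that the article does not state: WatchID is stored as a 64-bit integer (8 bytes per value), and the table uses the default index_granularity of 8192 rows, so that the 8.19 thousand rows read correspond to a single granule.

```python
ROW_BYTES = 8             # assumption: WatchID is a 64-bit integer
TOTAL_ROWS = 100_000_000  # row count reported by SELECT count(*)
GRANULE_ROWS = 8192       # assumption: default index_granularity

# Without the projection, every WatchID value is read.
full_scan_bytes = TOTAL_ROWS * ROW_BYTES

# With the projection, one granule of WatchID values is read.
granule_bytes = GRANULE_ROWS * ROW_BYTES

# Elapsed times taken from the two query outputs above.
speedup = 0.262 / 0.006

print(f"full scan: {full_scan_bytes / 1e6:.2f} MB")
print(f"one granule: {granule_bytes / 1e3:.2f} KB")
print(f"speedup: {speedup:.1f}x")
```

Under these assumptions the numbers line up with the reported 800.00 MB and 65.54 KB, and the elapsed times give a speedup of a little over 40x.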

Besides queries on detailed data, projections also support pre-aggregation. Without any optimization, the following query also scans the entire table:

SELECT
    UserID,
    SearchPhrase,
    count()
FROM hits_100m_obfuscated
GROUP BY
    UserID,
    SearchPhrase
LIMIT 10

Query id: 42c941e0-c15a-4206-9c1b-7350a5a67984

┌───────────────UserID─┬─SearchPhrase─────────────────────────────────────────────────┬─count()─┐
│    64240392369242065 │                                                              │       1 │
│  2542641703475366060 │ galaxy s4 activerstovmamasumi x2                             │       3 │
│ 14973463213479722228 │                                                              │      17 │
│  6604743450870066038 │                                                              │       1 │
│   325929602194382277 │ вес гриппи игре aventity of wars 2 в в играть                │       1 │
│  5481644077966220011 │ как леченский рецепты как почему конкая лето москва отдых на │       1 │
│  5965198553492672379 │                                                              │       1 │
│   119657425828985633 │                                                              │       1 │
│  8462750442030450647 │ рулонасточный+статив зомбинет магазин на айресу батл         │       1 │
│  7510587892824469257 │ sia 265 сезон 6 серии                                        │       1 │
└──────────────────────┴──────────────────────────────────────────────────────────────┴─────────┘

10 rows in set. Elapsed: 2.190 sec. Processed 100.00 million rows, 2.44 GB (45.66 million rows/s., 1.11 GB/s.)

Now create another aggregate PROJECTION:

ALTER TABLE hits_100m_obfuscated ADD PROJECTION agg_p2
(
    SELECT
        UserID,
        SearchPhrase,
        count()
    GROUP BY UserID, SearchPhrase
)

Since the historical data already exists, manually trigger the materialization:

alter table hits_100m_obfuscated MATERIALIZE PROJECTION agg_p2

After materializing, execute the same query again:

SELECT
    UserID,
    SearchPhrase,
    count()
FROM hits_100m_obfuscated
GROUP BY
    UserID,
    SearchPhrase
LIMIT 10

Query id: 258e556e-ea5b-43f0-980a-997c02abc233

┌───────────────UserID─┬─SearchPhrase─────────────────────────────────────────────────┬─count()─┐
│    64240392369242065 │                                                              │       1 │
│  2542641703475366060 │ galaxy s4 activerstovmamasumi x2                             │       3 │
│ 14973463213479722228 │                                                              │      17 │
│  6604743450870066038 │                                                              │       1 │
│   325929602194382277 │ вес гриппи игре aventity of wars 2 в в играть                │       1 │
│  5481644077966220011 │ как леченский рецепты как почему конкая лето москва отдых на │       1 │
│  5965198553492672379 │                                                              │       1 │
│   119657425828985633 │                                                              │       1 │
│  8462750442030450647 │ рулонасточный+статив зомбинет магазин на айресу батл         │       1 │
│  7510587892824469257 │ sia 265 сезон 6 серии                                        │       1 │
└──────────────────────┴──────────────────────────────────────────────────────────────┴─────────┘

10 rows in set. Elapsed: 1.847 sec. Processed 24.07 million rows, 1.58 GB (13.04 million rows/s., 856.09 MB/s.)

The number of rows scanned dropped by about three quarters, from 100 million to 24.07 million.
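Why an aggregate projection reads fewer rows can be sketched in Python. This is a simplified model under stated assumptions, not ClickHouse's implementation: each part is assumed to keep one pre-aggregated row per (UserID, SearchPhrase) pair, and the query merges those partial counts. The sample rows and the function names `build_aggregate_projection` and `query_counts` are hypothetical.

```python
from collections import Counter


def build_aggregate_projection(part_rows):
    """Pre-aggregate one part: one entry per (user_id, phrase) with its count."""
    return Counter((user_id, phrase) for user_id, phrase in part_rows)


def query_counts(projections):
    """Answer GROUP BY user_id, phrase by merging the partial counts."""
    total = Counter()
    for projection in projections:
        total.update(projection)
    return total


# Two made-up parts of raw (user_id, search_phrase) rows.
part_1 = [(1, "a"), (1, "a"), (2, ""), (2, "")]
part_2 = [(1, "a"), (3, "b"), (2, "")]

projections = [build_aggregate_projection(p) for p in (part_1, part_2)]
result = query_counts(projections)

raw_rows = len(part_1) + len(part_2)
rows_read = sum(len(p) for p in projections)
print(f"raw rows: {raw_rows}, projection rows read: {rows_read}")
print(dict(result))
```

The more duplicate grouping keys a part contains, the fewer rows the projection holds, which is consistent with the query above reading 24.07 million rows instead of 100 million.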

ClickHouse now also provides a system table for projections, system.projection_parts, where you can see the related storage information:

SELECT
    name,
    partition,
    formatReadableSize(bytes_on_disk) AS bytes,
    formatReadableSize(parent_bytes_on_disk) AS parent_bytes,
    parent_rows,
    rows / parent_rows AS ratio
FROM system.projection_parts

Query id: 2887b0e1-b984-4274-862c-0b59c68693c5

┌─name───┬─partition─┬─bytes──────┬─parent_bytes─┬─parent_rows─┬──────ratio─┐
│ agg_p2 │ 201307    │ 490.40 MiB │ 14.06 GiB    │   100000000 │ 0.24070565 │
│ p1     │ 201307    │ 4.95 GiB   │ 18.53 GiB    │   100000000 │     1      │
└────────┴───────────┴────────────┴──────────────┴─────────────┴────────────┘

In essence, a projection trades space for time, and the trade is a very cost-effective one.

Projections can also be dropped with DDL:

ALTER TABLE hits_100m_obfuscated DROP PROJECTION p1;
ALTER TABLE hits_100m_obfuscated DROP PROJECTION agg_p2;

Besides being added with ALTER, a projection can also be defined in CREATE TABLE, for example:

CREATE TABLE xxx
(
    `event_key` String,
    `user` UInt32,
    `dim1` String,
    PROJECTION p1
    (
        SELECT
            groupBitmap(user),
            count(1)
        GROUP BY dim1
    )
)
ENGINE = MergeTree()
ORDER BY (event_key, user)

As the examples show, projections are transparent at query time: ClickHouse matches them automatically based on the submitted SQL statement.

You are probably curious about the matching rules. There are several principles:

1. The setting allow_experimental_projection_optimization = 1 is enabled

2. The number of rows returned is smaller than the total number of rows in the base table

3. The query covers more than half of the parts

4. The WHERE columns must be a subset of the GROUP BY in the projection definition

5. The GROUP BY must be a subset of the GROUP BY in the projection definition

6. The SELECT must be a subset of the SELECT in the projection definition

7. When multiple projections match, the one that reads the fewest parts is selected
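The subset rules in the list above (rules 1, 4, 5, and 6) can be sketched as a small Python check. This is a simplified model for illustration only, not ClickHouse's actual matching logic: columns and aggregate expressions are plain strings, and rules 2, 3, and 7 are left out because they depend on runtime statistics. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggregateProjection:
    name: str
    group_by: frozenset  # GROUP BY columns in the projection definition
    select: frozenset    # SELECT expressions in the projection definition


@dataclass(frozen=True)
class Query:
    where: frozenset     # columns used in WHERE
    group_by: frozenset  # columns used in GROUP BY
    select: frozenset    # expressions in SELECT


def can_use(query, projection, enabled=True):
    """Apply rules 1, 4, 5 and 6 as plain subset checks."""
    return (
        enabled                                      # rule 1: setting enabled
        and query.where <= projection.group_by       # rule 4
        and query.group_by <= projection.group_by    # rule 5
        and query.select <= projection.select        # rule 6
    )


agg_p2 = AggregateProjection(
    name="agg_p2",
    group_by=frozenset({"UserID", "SearchPhrase"}),
    select=frozenset({"UserID", "SearchPhrase", "count()"}),
)

grouped = Query(
    where=frozenset(),
    group_by=frozenset({"UserID", "SearchPhrase"}),
    select=frozenset({"UserID", "SearchPhrase", "count()"}),
)
filtered_on_other_column = Query(
    where=frozenset({"WatchID"}),
    group_by=frozenset({"UserID"}),
    select=frozenset({"UserID", "count()"}),
)

print(can_use(grouped, agg_p2))
print(can_use(filtered_on_other_column, agg_p2))
```

In this model the GROUP BY query from earlier matches agg_p2, while a query that filters on WatchID does not, because WatchID is not among the projection's GROUP BY columns.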

If you don't know whether the query matches PROJECTION, there are two ways to check:

1. Use explain, for example:

EXPLAIN
SELECT WatchID
FROM hits_100m_obfuscated
WHERE WatchID = 5814563137538961516

Query id: bf008e69-fd68-4928-83f6-a57a2d84e286

┌─explain───────────────────────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY))                               │
│   SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│     ReadFromStorage (MergeTree(with 0 projection p1))                     │
└───────────────────────────────────────────────────────────────────────────┘

If you see MergeTree(with 0 projection p1), the query will hit the projection.

2. Check the execution log:

(SelectExecutor): Choose normal projection p3
(SelectExecutor): projection required columns: dim1, dim3, event_time, dim2, event_key, user
(SelectExecutor): Key condition: (column 0 in ['dim12', 'dim12'])

If you see Choose xxx projection, the query has hit a projection.
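If you collect these logs, the check can be automated. The sketch below is a hypothetical helper, not part of ClickHouse. It assumes the log line looks like the sample above ("Choose normal projection p3"); the wording may differ between versions.

```python
import re

# Assumed line format, taken from the sample log above:
#   (SelectExecutor): Choose normal projection p3
PATTERN = re.compile(r"Choose (\w+) projection (\w+)")


def chosen_projection(log_text):
    """Return (kind, name) of the chosen projection, or None if none was chosen."""
    match = PATTERN.search(log_text)
    return (match.group(1), match.group(2)) if match else None


sample = "(SelectExecutor): Choose normal projection p3"
print(chosen_projection(sample))
```

A return value of None means that no "Choose ... projection" line was found in the text that was passed in.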

With projections, we only need to query a single base table. We get the performance that materialized views used to provide while avoiding their maintenance cost and data consistency problems, which is hard to beat.

That's all for today. With projections, ClickHouse is even more powerful, and in some scenarios we can say goodbye to ETL and materialized views.

———————————————————————————

This reference book helps readers understand in depth how ClickHouse works and apply it in practical development. It covers ClickHouse's background, history, core concepts, basic features, operating principles, and practical guidance. In particular, it explains in detail the implementation principles and usage techniques of the core parts of ClickHouse: the MergeTree table engine and distributed operation.

———————————————————————————

Gift book at the end of the article:

Many new readers have joined recently. To give back to the followers of the BigData official account, we are giving away a copy of "ClickHouse Principle Analysis and Application Practice". The draw will be held at 20:00 on September 22. Follow the official account and reply "I want to learn" to enter.


Origin blog.csdn.net/weixin_47158466/article/details/120348000