In "Exploring ClickHouse - Connecting Kafka and ClickHouse", we explained how to use the Kafka engine to connect to Kafka and read the data in a topic. But I ran into a problem: the data can only be read once. Even if new data is sent to the topic later, querying the table does not return it.
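To solve this problem, we introduce a materialized view. A materialized view attached to the Kafka engine table consumes messages in the background and writes them into an ordinary table, which can then be queried as many times as needed.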
Create table
This table structure is directly borrowed from the table structure in "Exploring ClickHouse - Using Projection to Accelerate Query" .
CREATE TABLE materialized_uk_price_paid_from_kafka ( price UInt32, date Date, postcode1 LowCardinality(String), postcode2 LowCardinality(String), type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0), is_new UInt8, duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0), addr1 String, addr2 String, street LowCardinality(String), locality LowCardinality(String), town LowCardinality(String), district LowCardinality(String), county LowCardinality(String) ) ENGINE = MergeTree ORDER BY (postcode1, postcode2, addr1, addr2);
CREATE TABLE materialized_uk_price_paid_from_kafka
(
    price UInt32,
    date Date,
    postcode1 LowCardinality(String),
    postcode2 LowCardinality(String),
    type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0),
    is_new UInt8,
    duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0),
    addr1 String,
    addr2 String,
    street LowCardinality(String),
    locality LowCardinality(String),
    town LowCardinality(String),
    district LowCardinality(String),
    county LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (postcode1, postcode2, addr1, addr2)
Query id: 55b16049-a865-4d54-9333-d661c6280a09
Ok.
0 rows in set. Elapsed: 0.005 sec.
Create materialized view
CREATE MATERIALIZED VIEW uk_price_paid_from_kafka_consumer_view TO materialized_uk_price_paid_from_kafka AS SELECT splitByChar(' ', postcode) AS p, toUInt32(price_string) AS price, parseDateTimeBestEffortUS(time) AS date, p[1] AS postcode1, p[2] AS postcode2, transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type, b = 'Y' AS is_new, transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration, addr1, addr2, street, locality, town, district, county FROM uk_price_paid_from_kafka;
In this way, data arriving in the Kafka topic is cleaned by the view and written into the materialized_uk_price_paid_from_kafka table.
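The view's FROM clause reads from uk_price_paid_from_kafka, the Kafka engine table created in the earlier article. For readers who do not have that article at hand, a minimal sketch of such a table might look like the following. It is an assumption, not the original definition: the columns the view references (price_string, time, postcode, a, b, c, and the address fields) are taken from the view itself, while uuid_string, d and e, the broker address, the topic name and the consumer group name are placeholders chosen for illustration.

```sql
-- Hypothetical source table for the materialized view (a sketch, not the original DDL).
-- Every column is read as String; the view does the type conversion.
CREATE TABLE uk_price_paid_from_kafka
(
    uuid_string String,   -- assumed name, not referenced by the view
    price_string String,
    time String,
    postcode String,
    a String,             -- property type code: T, S, D, F, O
    b String,             -- new-build flag: Y or N
    c String,             -- tenure code: F, L, U
    addr1 String,
    addr2 String,
    street String,
    locality String,
    town String,
    district String,
    county String,
    d String,             -- assumed name, not referenced by the view
    e String              -- assumed name, not referenced by the view
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'localhost:9092',          -- placeholder broker address
    kafka_topic_list = 'uk_price_paid_topic',      -- placeholder topic name
    kafka_group_name = 'uk_price_paid_group',      -- placeholder consumer group
    kafka_format = 'CSV';
```

With a table of this shape in place, the materialized view acts as the consumer: each batch read from the topic passes through the SELECT and lands in the MergeTree table.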
Query
select * from materialized_uk_price_paid_from_kafka;
Next, send the following record to the topic:
"{5FA8692E-537B-4278-8C67-5A060540506D}","19500","1995-01-27 00:00","SK10 2QW","T","N","L","38","","GARDEN STREET","MACCLESFIELD","MACCLESFIELD","MACCLESFIELD","CHESHIRE","A","A"
Query the table again
select * from materialized_uk_price_paid_from_kafka;
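To get a feel for what the view does to that record, its expressions can be run by hand on literal values. The following is only a sketch: the literals are copied from the record sent to the topic, and the statement needs neither Kafka nor any table.

```sql
-- Apply the view's cleaning expressions to the fields of the sample record.
SELECT
    splitByChar(' ', 'SK10 2QW') AS p,                      -- split the postcode on the space
    toUInt32('19500') AS price,                             -- price_string -> number
    parseDateTimeBestEffortUS('1995-01-27 00:00') AS date,  -- time string -> DateTime
    p[1] AS postcode1,                                      -- ClickHouse arrays are 1-based
    p[2] AS postcode2,
    transform('T', ['T', 'S', 'D', 'F', 'O'],
              ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,
    'N' = 'Y' AS is_new,                                    -- comparison yields 0 or 1
    transform('L', ['F', 'L', 'U'],
              ['freehold', 'leasehold', 'unknown']) AS duration;
```

For this record, type should come back as terraced, duration as leasehold, and is_new as 0, which are the values to look for in the row that the second query returns.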