How to use ClickHouse

1 Download the RPM packages

https://packagecloud.io/Altinity/clickhouse/


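Raise the open-file and process limits by editing /etc/security/limits.conf:

vim /etc/security/limits.conf

Append the following to the end of the file:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072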
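
Entries in limits.conf are applied when a new login session starts, so it can help to check the effective values after logging in again. The following is a minimal sketch, assuming a Linux shell; the /proc/self/limits check is skipped where that file is not available.

```shell
# Show the soft and hard open-file limits of the current session.
soft_nofile=$(ulimit -Sn)
hard_nofile=$(ulimit -Hn)
echo "nofile: soft=${soft_nofile} hard=${hard_nofile}"

# On Linux, /proc/self/limits also lists the process (nproc) limit.
if [ -r /proc/self/limits ]; then
  grep -E 'Max (open files|processes)' /proc/self/limits
fi
```

If the values still show the old limits, log out and back in before starting the server.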

Install the dependencies

yum install -y libtool
yum install -y '*unixODBC*'

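2 Install ClickHouse

Install the packages in dependency order; the server and client packages depend on clickhouse-common-static, so install it first:

rpm -ivh clickhouse-common-static-20.8.3.18-1.el7.x86_64.rpm
rpm -ivh clickhouse-server-common-20.8.3.18-1.el7.x86_64.rpm
rpm -ivh clickhouse-server-20.8.3.18-1.el7.x86_64.rpm
rpm -ivh clickhouse-client-20.8.3.18-1.el7.x86_64.rpm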

3 Start and stop ClickHouse

service clickhouse-server start
service clickhouse-server stop

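4 Common parameters of clickhouse-client

The two options used below are -q (--query), which runs a single query non-interactively, and -d (--database), which selects the database to run it against.

Example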

 clickhouse-client -q 'show databases;'
 clickhouse-client -d system -q 'show tables;'
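
A few more connection options are often combined with -d and -q. The sketch below wraps them in a small helper; the host, port and user values are assumptions (the usual defaults for a local install), and run_query is a hypothetical helper name, not part of ClickHouse.

```shell
# Connection settings; these values are assumptions for a local default install.
CH_HOST="localhost"
CH_PORT="9000"      # default native TCP port
CH_USER="default"   # default user

# run_query DATABASE QUERY: run one query non-interactively, CSV output.
run_query() {
  clickhouse-client --host "$CH_HOST" --port "$CH_PORT" --user "$CH_USER" \
    --database "$1" --query "$2" --format CSV
}

# Only call the client when it is actually installed.
if command -v clickhouse-client >/dev/null 2>&1; then
  run_query system "SHOW TABLES"
else
  echo "clickhouse-client not found; skipping"
fi
```

A password can be supplied with --password when the user has one set.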

Data files for import are generally in CSV format.
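
If the remark above is about loading CSV files into a table, a load could look like the sketch below. The table name default.csv_demo, its columns and the sample rows are hypothetical and exist only for illustration.

```shell
# Create a small sample CSV file (hypothetical data).
csv_file=$(mktemp)
printf '1,alice\n2,bob\n' > "$csv_file"
csv_rows=$(wc -l < "$csv_file" | tr -d ' ')
echo "rows to load: ${csv_rows}"

if command -v clickhouse-client >/dev/null 2>&1; then
  # Hypothetical target table with matching columns.
  clickhouse-client -q 'CREATE TABLE IF NOT EXISTS default.csv_demo (id UInt32, name String) ENGINE = TinyLog'
  # The CSV data is read from standard input.
  clickhouse-client -q 'INSERT INTO default.csv_demo FORMAT CSV' < "$csv_file"
  clickhouse-client -q 'SELECT count() FROM default.csv_demo'
else
  echo "clickhouse-client not found; skipping the load"
fi

rm -f "$csv_file"
```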

By default, ClickHouse stores its data under:

/var/lib/clickhouse/data

5. Use clickhouse-client

clickhouse-client   # enter the interactive ClickHouse client
exit   # leave the ClickHouse client

6. Data types

ClickHouse (as of the 20.8 release installed here) does not have a dedicated bool type;
you can use an Enum, or a UInt8 restricted to the values 0 and 1, instead.
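
As an illustration of the Enum workaround, the sketch below stores a true/false flag in an Enum8 column. The table default.flag_demo and its columns are hypothetical; the statements are only sent to the server when clickhouse-client is installed.

```shell
# Write the example statements to a temporary file.
sql_file=$(mktemp)
cat > "$sql_file" <<'SQL'
CREATE TABLE IF NOT EXISTS default.flag_demo
(
    id UInt32,
    is_active Enum8('false' = 0, 'true' = 1)
) ENGINE = TinyLog;
INSERT INTO default.flag_demo VALUES (1, 'true'), (2, 'false');
SELECT id, is_active, CAST(is_active, 'Int8') AS as_number FROM default.flag_demo;
DROP TABLE default.flag_demo;
SQL
statement_count=$(grep -c ';' "$sql_file")

if command -v clickhouse-client >/dev/null 2>&1; then
  # --multiquery allows several ';'-separated statements in one run.
  clickhouse-client --multiquery < "$sql_file"
else
  echo "clickhouse-client not found; ${statement_count} statements not executed"
fi

rm -f "$sql_file"
```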

See the official website for details

https://clickhouse.tech/docs/zh/sql-reference/data-types/float/

Table engine names are strictly case-sensitive (for example, TinyLog rather than tinylog).

7. TinyLog

TinyLog is the simplest table engine that stores data on disk. Each column is stored in a separate compressed file, and writes append data to the end of those files.

The engine places no restrictions on concurrent data access, which means it offers no protection either:

  • If you read from a table while another query is writing to it, the read throws an exception.
  • If several queries write to the same table at the same time, the data is corrupted.

The typical usage of this table engine is write-once: write the data a single time, then read it as many times as needed. Queries are executed in a single stream, so the engine is intended for relatively small tables (up to 1,000,000 rows is recommended). It makes sense when you have many small tables, because it is simpler than the Log engine (fewer files need to be opened). Having a large number of small tables generally leads to poor performance, but if you already work that way in another DBMS, you may find it easier to switch to TinyLog tables. Indexes are not supported.

In Yandex.Metrica, TinyLog tables are used for intermediate data that is processed in small batches.
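
The write-once, read-many pattern described above can be sketched as follows. The table default.tinylog_demo and its contents are hypothetical; the statements run only when clickhouse-client is installed.

```shell
# Write the example statements to a temporary file.
sql_file=$(mktemp)
cat > "$sql_file" <<'SQL'
CREATE TABLE IF NOT EXISTS default.tinylog_demo (id UInt32, msg String) ENGINE = TinyLog;
INSERT INTO default.tinylog_demo VALUES (1, 'first'), (2, 'second');
SELECT * FROM default.tinylog_demo ORDER BY id;
SELECT count() FROM default.tinylog_demo;
DROP TABLE default.tinylog_demo;
SQL
statement_count=$(grep -c ';' "$sql_file")

if command -v clickhouse-client >/dev/null 2>&1; then
  # One write, then several reads, all from a single client session.
  clickhouse-client --multiquery < "$sql_file"
else
  echo "clickhouse-client not found; ${statement_count} statements not executed"
fi

rm -f "$sql_file"
```

Running everything from one session avoids the concurrent read and write cases listed above.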

8. Memory table

The Memory engine stores data in RAM in uncompressed form. Data is kept in exactly the form in which it is returned when read, so reading from such a table costs almost nothing. Concurrent data access is synchronized, and locks are short: read and write operations do not block each other. Indexes are not supported. Reads are parallelized.

Maximum throughput (more than 10 GB/sec) is reached on simple queries, because there is no disk read and no need to decompress or deserialize data. (It is worth noting that in many cases the performance of the MergeTree engine is almost as high.)

When the server is restarted, the data disappears and the table becomes empty. Using this table engine is usually not justified. However, it can be used for testing and for tasks that require the highest speed on a relatively small number of rows (up to about 100,000,000).

The system itself uses the Memory engine for temporary tables that hold external query data (see the section «External data for request processing») and to implement GLOBAL IN.
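
A minimal sketch of a Memory table used for a quick test follows. The table default.memory_demo is hypothetical, and the rows are generated from system.numbers; the statements run only when clickhouse-client is installed.

```shell
# Write the example statements to a temporary file.
sql_file=$(mktemp)
cat > "$sql_file" <<'SQL'
CREATE TABLE IF NOT EXISTS default.memory_demo (n UInt64) ENGINE = Memory;
INSERT INTO default.memory_demo SELECT number FROM system.numbers LIMIT 1000;
SELECT count(), sum(n) FROM default.memory_demo;
DROP TABLE default.memory_demo;
SQL
statement_count=$(grep -c ';' "$sql_file")

if command -v clickhouse-client >/dev/null 2>&1; then
  clickhouse-client --multiquery < "$sql_file"
else
  echo "clickhouse-client not found; ${statement_count} statements not executed"
fi

rm -f "$sql_file"
```

Because the data lives only in RAM, anything left in such a table is gone after a server restart.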

9. MergeTree

See the official documentation:

https://clickhouse.tech/docs/zh/engines/table-engines/mergetree-family/mergetree/

10. ClickHouse and HDFS

See the official documentation:

https://clickhouse.tech/docs/zh/sql-reference/table-functions/hdfs/


Origin blog.csdn.net/qq_36382679/article/details/114629902