ClickHouse is an open-source OLAP database developed by Yandex, a Russian company, and is often abbreviated as CH or CK. It is frequently described as the fastest OLAP database currently on the market.
1. Applicable scenarios (OLAP)
- Most requests are reads
- Data is written in batches
- Data is not modified after it has been added
- Queries read many rows but only a few columns
- Tables are wide (many columns)
- The query rate is relatively low (around 100 queries per second per server)
- Simple queries can tolerate a latency of about 50 milliseconds
- Column values are small (for example, 60 bytes per URL)
- A single query processes a large number of rows
- No transactions are required, and data consistency requirements are low
- After filtering or aggregation, the result fits in the memory of a single server
Summary: ClickHouse suits massive data volumes where the storage consumed on any single node should not grow too large. It also suits wide tables, where many related columns are merged into one table for business convenience. Its SQL-based query interface improves the applicability and portability of programs.
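To make the "many rows, few columns, wide table" scenario concrete, here is a rough back-of-the-envelope sketch in Python. The table shape (100 columns of 8 bytes, one million rows) and the function names are assumptions chosen for illustration, and the model ignores compression and indexes.

```python
# Hypothetical wide table: 100 columns, 8 bytes per value, 1,000,000 rows.
NUM_COLUMNS = 100
BYTES_PER_VALUE = 8
NUM_ROWS = 1_000_000


def bytes_read_row_store(num_rows: int, num_columns: int, bytes_per_value: int) -> int:
    """A row-oriented layout reads whole rows, even if the query needs few columns."""
    return num_rows * num_columns * bytes_per_value


def bytes_read_column_store(num_rows: int, columns_needed: int, bytes_per_value: int) -> int:
    """A column-oriented layout reads only the columns the query references."""
    return num_rows * columns_needed * bytes_per_value


# A query that touches 3 of the 100 columns.
row_bytes = bytes_read_row_store(NUM_ROWS, NUM_COLUMNS, BYTES_PER_VALUE)
col_bytes = bytes_read_column_store(NUM_ROWS, 3, BYTES_PER_VALUE)

print(f"row layout:    {row_bytes:,} bytes")
print(f"column layout: {col_bytes:,} bytes")
print(f"ratio:         {row_bytes / col_bytes:.1f}x")
```

Under these assumed numbers the columnar layout reads roughly 33 times less data, which is why this query shape benefits most from a column store.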
2. Features
- Vectorized execution with multi-core parallelism; each SQL query tries to use as much of the CPU as it can
- Columnar storage with a high compression ratio
- Shared-nothing architecture with support for distributed deployment
- Compatible with most SQL syntax; its dialect is particularly close to MySQL's
- Supports a primary key
- Supports indexes
- Online queries
- Supports approximate calculation
- Supports data replication (asynchronous, multi-master)
- Real-time data updates
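The idea behind approximate calculation can be sketched as aggregating over a deterministic sample and scaling the result. The Python below is only an illustration of that idea under assumed data; the `in_sample` helper and the 1-in-10 sampling rate are hypothetical, and this is not ClickHouse's own implementation.

```python
import hashlib


def in_sample(key: int, one_in: int = 10) -> bool:
    """Deterministically keep about 1/one_in of the keys, chosen by hash."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0


# Hypothetical data: 100,000 rows of (key, value).
rows = [(key, key % 100) for key in range(100_000)]

# Exact aggregate over every row.
exact_sum = sum(value for _, value in rows)

# Approximate aggregate: sum only the sampled rows, then scale by the sampling factor.
sampled_sum = sum(value for key, value in rows if in_sample(key))
approx_sum = sampled_sum * 10

relative_error = abs(approx_sum - exact_sum) / exact_sum
print(f"exact={exact_sum} approx={approx_sum} error={relative_error:.2%}")
```

The approximate query touches about a tenth of the rows, trading a small error for much less work.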
3. Restrictions
- Transactions are not supported
- Not suited to high-frequency, low-latency updates and deletes; only batch deletes and modifications are supported
- The index is sparse, so it is not well suited to point queries
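Why a sparse index is a poor fit for point queries can be shown with a small Python sketch. It assumes rows sorted by key and an index that stores only the first key of each fixed-size group of rows; the group size of 4 and the name `point_lookup` are illustrative assumptions, and real groups are far larger.

```python
from bisect import bisect_right

GROUP_SIZE = 4  # assumption: tiny on purpose; real systems use much larger groups

# Rows sorted by primary key, which a sparse index relies on.
sorted_keys = [3, 7, 8, 12, 15, 21, 22, 30, 31, 40, 41, 55]

# The sparse index keeps only the first key of every group: [3, 15, 31].
sparse_index = sorted_keys[::GROUP_SIZE]


def point_lookup(key: int) -> tuple[bool, int]:
    """Return (found, rows_read) for a single-key lookup."""
    # Locate the last group whose first key is <= key.
    group = bisect_right(sparse_index, key) - 1
    if group < 0:
        return False, 0  # key is smaller than every stored key
    start = group * GROUP_SIZE
    block = sorted_keys[start:start + GROUP_SIZE]
    # The index cannot say where inside the group the key sits,
    # so the whole group counts as read.
    return key in block, len(block)


print(point_lookup(21))  # (True, 4)
print(point_lookup(13))  # (False, 4)
print(point_lookup(1))   # (False, 0)
```

Fetching one row costs a read of the entire group, which is cheap for large scans but wasteful for single-row lookups.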
4. Performance
- Single large query
  - Data in the page cache: complex queries process 2-10 GB/s (uncompressed); simple queries reach 30 GB/s
  - Data not in the page cache: processing speed equals disk read speed × compression ratio
  - In distributed setups, performance scales almost linearly
- Latency of short queries
  - Data in the page cache: a primary-key query over hundreds of thousands of rows takes less than 50 ms
  - Data not in the page cache (HDD): about 10 ms × number of columns queried × number of data blocks
- Throughput of short queries
  - About 100 queries per second (per server)
- Write performance
  - Write at most once per second, or at least 1000 rows per write; write speed is 50-200 MB/s
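The rules of thumb above can be written out as simple arithmetic. The Python sketch below is a minimal illustration; the function names, the example disk speed of 200 MB/s, and the compression ratio of 5 are assumptions, not measured values.

```python
def scan_throughput_mb_s(disk_read_mb_s: float, compression_ratio: float) -> float:
    """Uncompressed data processed per second when data is read from disk."""
    return disk_read_mb_s * compression_ratio


def hdd_short_query_latency_ms(columns: int, data_blocks: int, seek_ms: float = 10.0) -> float:
    """Cold-cache estimate: one seek of about 10 ms per column per data block."""
    return seek_ms * columns * data_blocks


def should_flush(rows_buffered: int, seconds_since_flush: float,
                 min_rows: int = 1000, min_interval_s: float = 1.0) -> bool:
    """Write once the buffer holds a large batch or at least a second has passed."""
    if rows_buffered == 0:
        return False  # nothing to write
    return rows_buffered >= min_rows or seconds_since_flush >= min_interval_s


# Hypothetical disk: 200 MB/s reads, data compressed 5x.
print(scan_throughput_mb_s(200.0, 5.0))    # 1000.0
# Hypothetical query: 5 columns spread over 4 data blocks.
print(hdd_short_query_latency_ms(5, 4))    # 200.0
print(should_flush(250, 0.2))              # False: small batch, written too recently
```

A writer that checks `should_flush` before each insert never writes more than once per second unless a batch already holds at least 1000 rows.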