ClickHouse column-oriented database management system (installation tutorial)

Table of contents

Download link

Install

Production deployment

From DEB packages

Set up Debian repository

Install ClickHouse server and client

Start the ClickHouse server

Migration method for installing deb-packages

Manually download the installation package address

Install Standalone ClickHouse Keeper

Enable and start ClickHouse Keeper

Packages

From RPM package

Set up RPM repository

Install ClickHouse server and client

Start the ClickHouse server

Install Standalone ClickHouse Keeper

Enable and start ClickHouse Keeper

Manually download the installation package address

Tgz Archives

Source code compilation

Install CI-generated binaries

For macOS only: Install using Homebrew

Start up

Recommendations for self-managed ClickHouse


Download link

https://github.com/ClickHouse/ClickHouse

Install

1. If you are just getting started and want to see what ClickHouse can do, the easiest way to download ClickHouse locally is to run the following command. It downloads a binary for your operating system that runs ClickHouse server, clickhouse-client, clickhouse-local, ClickHouse Keeper and other tools:

curl https://clickhouse.com/ | sh

2. Run the following command to start the ClickHouse server:

./clickhouse server

The first time you run this script, the necessary files and folders are created in the current directory and the server is started.

3. Open a new terminal and connect to your service using the clickhouse client:

./clickhouse client

ClickHouse client version 23.2.1.1501 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 23.2.1 revision 54461.

local-host :)

You can start sending DDL and SQL commands to ClickHouse!

Production deployment

For a production deployment of ClickHouse, choose from one of the following installation options.

From DEB packages

It is recommended to use the official precompiled deb packages for Debian or Ubuntu. Run the following commands to install the packages:

Set up Debian repository

sudo apt-get install -y apt-transport-https ca-certificates dirmngr
GNUPGHOME=$(mktemp -d)
sudo GNUPGHOME="$GNUPGHOME" gpg --no-default-keyring --keyring /usr/share/keyrings/clickhouse-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 8919F6BD2B48D754
sudo rm -r "$GNUPGHOME"
sudo chmod +r /usr/share/keyrings/clickhouse-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee \
    /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update

Install ClickHouse server and client

sudo apt-get install -y clickhouse-server clickhouse-client

Start the ClickHouse server

sudo service clickhouse-server start
clickhouse-client # or "clickhouse-client --password" if you've set up a password.

Migration method for installing deb-packages

sudo apt-key del E0C56BD4
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D754
echo "deb https://packages.clickhouse.com/deb stable main" | sudo tee \
    /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update

sudo apt-get install -y clickhouse-server clickhouse-client

sudo service clickhouse-server start
clickhouse-client # or "clickhouse-client --password" if you set up a password.

Manually download the installation package address

https://packages.clickhouse.com/deb/pool/main/c/

Install Standalone ClickHouse Keeper

In a production environment, it is highly recommended to run ClickHouse Keeper on dedicated nodes. In a test environment, if you decide to run ClickHouse Server and ClickHouse Keeper on the same server, you do not need to install ClickHouse Keeper as it is included with ClickHouse Server. This command is only required on standalone ClickHouse Keeper servers.

sudo apt-get install -y clickhouse-keeper

Enable and start ClickHouse Keeper

sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper

Packages

  • clickhouse-common-static - Installs the ClickHouse compiled binaries.
  • clickhouse-server - Creates a symbolic link for clickhouse-server and installs the default server configuration.
  • clickhouse-client - Creates symbolic links for clickhouse-client and other client-related tools, and installs the client configuration files.
  • clickhouse-common-static-dbg - Installs the ClickHouse compiled binaries with debugging information.
  • clickhouse-keeper - Used to install ClickHouse Keeper on dedicated ClickHouse Keeper nodes. If you are running ClickHouse Keeper on the same server as your ClickHouse server, you do not need to install this package. Installs ClickHouse Keeper and the default ClickHouse Keeper configuration files.

If you need to install a specific version of ClickHouse, you must install all packages of the same version: 

sudo apt-get install clickhouse-server=21.8.5.7 clickhouse-client=21.8.5.7 clickhouse-common-static=21.8.5.7
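
Because all three packages have to share one version, it can help to build the pinned package list from a single variable. The following is only a minimal sketch; the helper name pinned_packages is hypothetical, and the command is printed rather than executed so that it can be reviewed first:

```shell
# Hypothetical helper: print the three packages pinned to the same version,
# in the "package=version" form that apt-get accepts.
pinned_packages() {
  version="$1"
  out=""
  for pkg in clickhouse-server clickhouse-client clickhouse-common-static; do
    # Append with a single space between entries, no trailing space.
    out="${out:+$out }$pkg=$version"
  done
  printf '%s\n' "$out"
}

# Print the resulting install command for review.
echo "sudo apt-get install $(pinned_packages 21.8.5.7)"
```

Changing the version in one place then keeps the server, client and common-static packages in step.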

From RPM package

It is recommended to use the official precompiled rpm packages for CentOS, RedHat and all other rpm-based Linux distributions.

Set up RPM repository

First, you need to add the official repository:

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo

For systems with the zypper package manager (openSUSE, SLES):

sudo zypper addrepo -r https://packages.clickhouse.com/rpm/clickhouse.repo -g
sudo zypper --gpg-auto-import-keys refresh clickhouse-stable

After that, any yum install can be replaced by zypper install. To install a specific version, append -$VERSION to the package name, for example: clickhouse-client-22.2.2.22.

Install ClickHouse server and client

sudo yum install -y clickhouse-server clickhouse-client

Start the ClickHouse server

sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server
clickhouse-client # or "clickhouse-client --password" if you set up a password.

Install Standalone ClickHouse Keeper

In a production environment, it is highly recommended to run ClickHouse Keeper on dedicated nodes. In a test environment, if you decide to run ClickHouse Server and ClickHouse Keeper on the same server, you do not need to install ClickHouse Keeper as it is included with ClickHouse Server. This command is only required on standalone ClickHouse Keeper servers.

sudo yum install -y clickhouse-keeper

Enable and start ClickHouse Keeper

sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper

Manually download the installation package address

https://packages.clickhouse.com/rpm/stable/

Tgz Archives

It is recommended to use the official precompiled tgz archives for all Linux distributions where installing deb or rpm packages is not possible.

The required version can be downloaded with curl or wget from the repository https://packages.clickhouse.com/tgz/. After that, the downloaded archives should be unpacked and installed with the installation scripts. Example for the latest stable version:

LATEST_VERSION=$(curl -s https://packages.clickhouse.com/tgz/stable/ | \
    grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
export LATEST_VERSION

case $(uname -m) in
  x86_64) ARCH=amd64 ;;
  aarch64) ARCH=arm64 ;;
  *) echo "Unknown architecture $(uname -m)"; exit 1 ;;
esac

for PKG in clickhouse-common-static clickhouse-common-static-dbg clickhouse-server clickhouse-client clickhouse-keeper
do
  curl -fO "https://packages.clickhouse.com/tgz/stable/$PKG-$LATEST_VERSION-${ARCH}.tgz" \
    || curl -fO "https://packages.clickhouse.com/tgz/stable/$PKG-$LATEST_VERSION.tgz"
done

tar -xzvf "clickhouse-common-static-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-$LATEST_VERSION/install/doinst.sh"

tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh"

tar -xzvf "clickhouse-server-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-server-$LATEST_VERSION.tgz"
sudo "clickhouse-server-$LATEST_VERSION/install/doinst.sh" configure
sudo /etc/init.d/clickhouse-server start

tar -xzvf "clickhouse-client-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-client-$LATEST_VERSION.tgz"
sudo "clickhouse-client-$LATEST_VERSION/install/doinst.sh"
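
The LATEST_VERSION line at the top of the script relies on a version-aware sort, so that for example 1.10 ranks above 1.9. The following offline sketch shows the same pipeline on a made-up listing; the file names and version numbers are hypothetical, the helper name latest_version is an assumption, and sort -V requires GNU coreutils or a compatible sort:

```shell
# Hypothetical sample of a directory listing (the real one is an HTML page).
sample_listing='clickhouse-client-1.2.3.4-amd64.tgz
clickhouse-client-1.10.0.1-amd64.tgz
clickhouse-client-1.9.9.9-amd64.tgz'

# Same pipeline as in the install script: extract four-part version numbers,
# sort them as versions in descending order, and keep the first one.
latest_version() {
  grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1
}

printf '%s\n' "$sample_listing" | latest_version
```

A plain lexical sort would rank 1.9.9.9 above 1.10.0.1, which is why the script uses -V.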

Source code compilation

To compile ClickHouse manually, follow the build instructions for Linux or macOS.

You can compile packages and install them, or use the program without installing the packages.

  Client: <build_directory>/programs/clickhouse-client
  Server: <build_directory>/programs/clickhouse-server

You need to create the data and metadata folders manually and chown them to the desired user. Their paths can be changed in the server configuration (src/programs/server/config.xml); by default they are:

  /var/lib/clickhouse/data/default/
  /var/lib/clickhouse/metadata/default/
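
Creating these folders and assigning their owner could look roughly like the sketch below. The helper name prepare_clickhouse_dirs is hypothetical, and which user should own the folders is an assumption that depends on how you run the server:

```shell
# prepare_clickhouse_dirs ROOT OWNER
# Hypothetical helper: create the default data and metadata folders under ROOT
# and give ownership of the whole tree to OWNER (a user name or numeric uid).
prepare_clickhouse_dirs() {
  root="$1"
  owner="$2"
  mkdir -p "$root/data/default" "$root/metadata/default" || return 1
  chown -R "$owner" "$root"
}

# Example for the default paths (run as root; the user name is an assumption):
#   prepare_clickhouse_dirs /var/lib/clickhouse clickhouse
```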

On Gentoo, you can use emerge clickhouse to install ClickHouse from source.

Install CI-generated binaries

ClickHouse's continuous integration (CI) infrastructure produces specialized builds for each commit in the ClickHouse repository, e.g. sanitized builds, unoptimized (debug) builds, cross-compiled builds, etc. While such builds are usually only useful during development, they can also be interesting to users in some cases.

Because ClickHouse's CI continues to evolve over time, the exact steps for downloading CI-generated builds may vary. Additionally, CI may delete build artifacts that are too old, making them unavailable for download.

For example, to download the aarch64 binaries for ClickHouse v23.4, follow these steps:

  • Find the GitHub pull request for version v23.4: Release pull request for branch 23.4
  • Click "Commits" and then click a commit similar to "Update autogenerated version to 23.4.2.1 and contributors" for the specific version you want to install.
  • Click the green check / yellow dot / red cross to open the list of CI checks.
  • Click "Details" next to "ClickHouse Build Check" in the list; this opens a page with the build results.
  • Find the rows with compiler = "clang-*-aarch64" (there are multiple rows).
  • Download the artifacts for these builds.

To download binaries for very old x86-64 systems without SSE3 support, or for old ARM systems without ARMv8.1-A support, open a pull request and find the CI check "BuilderBinAmd64Compat" or "BuilderBinAarch64V80Compat", respectively. Then click "Details", open the "Build" fold, scroll to the end and find the message "Notice: Build URLs https://s3.amazonaws.com/clickhouse/builds/PRs/.../.../binary_aarch64_v80compat/clickhouse". You can then click the link to download the build.

For macOS only: Install using Homebrew

To install ClickHouse using the popular brew package manager, follow the instructions listed in the ClickHouse Homebrew tap.

Start up

To start the server as a daemon, run:

$ sudo clickhouse start

There are other ways to run ClickHouse:

$ sudo service clickhouse-server start

If you don't have the service command, run:

$ sudo /etc/init.d/clickhouse-server start

If you have the systemctl command, run:

$ sudo systemctl start clickhouse-server.service

See the logs in the /var/log/clickhouse-server/ directory.

If the server does not start, check the configuration in the file /etc/clickhouse-server/config.xml.
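
When the server refuses to start, the end of the error log is usually the first place to look. The following is a small sketch, assuming the error log is named clickhouse-server.err.log inside the log directory (check your configuration if it was changed); the helper name show_recent_errors is hypothetical:

```shell
# show_recent_errors [LOG_DIR]
# Hypothetical helper: print the last lines of the server error log, or a hint
# if the file cannot be read. The file name below is an assumed default.
show_recent_errors() {
  log_dir="${1:-/var/log/clickhouse-server}"
  err_log="$log_dir/clickhouse-server.err.log"
  if [ -r "$err_log" ]; then
    tail -n 20 "$err_log"
  else
    echo "no readable error log at $err_log"
  fi
}

show_recent_errors
```

Reading the log may require sudo, depending on the file permissions on your system.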

You can also start the server manually from the console:

$ clickhouse-server --config-file=/etc/clickhouse-server/config.xml

In this case, the log will be printed to the console, which is very convenient during development. If the configuration file is in the current directory, you don't need to specify the --config-file parameter. By default, ./config.xml is used.

ClickHouse supports access restriction settings. They are located in the users.xml file (next to config.xml). By default, the default user is allowed access from anywhere without a password. See user/default/networks. For more information, see the "Configuration Files" section.

After starting the server, you can connect to it using a command line client:

$ clickhouse-client

By default, it connects to localhost:9000 on behalf of the user default without a password. It can also be used to connect to a remote server using the --host parameter.

The terminal must use UTF-8 encoding. For more information, see the "Command Line Client" section.

Example:

$ ./clickhouse-client
ClickHouse client version 0.0.18749.
Connecting to localhost:9000.
Connected to ClickHouse server version 0.0.18749.

:) SELECT 1

SELECT 1

┌─1─┐
│ 1 │
└───┘

1 rows in set. Elapsed: 0.003 sec.

:)

Congratulations, the system works!

To continue experimenting, you can download one of the test datasets or go through the tutorial:

https://clickhouse.com/docs/en/tutorial

Recommendations for self-managed ClickHouse

ClickHouse can run on any Linux, FreeBSD or macOS system with an x86-64, ARM or PowerPC64LE CPU architecture.

ClickHouse uses all available hardware resources to process data.

ClickHouse tends to work more efficiently using a large number of cores at a lower clock rate than using fewer cores at a higher clock rate.

We recommend using at least 4GB of RAM to perform non-trivial queries. The ClickHouse server can run with less RAM, but queries will abort frequently.

The amount of RAM required usually depends on:

  • Query complexity.
  • The amount of data processed in the query.

To calculate the amount of RAM required, you can estimate the size of temporary data for GROUP BY, DISTINCT, JOIN, and other operations you use.

To reduce memory consumption, ClickHouse can swap temporary data to external storage. See GROUP BY in External Memory for details.

We recommend disabling the operating system's swap file in production environments.

ClickHouse binaries require at least 2.5 GB of disk space to install.

The storage capacity required for your data should be calculated separately, based on:

  • Estimate of data volume.

    You can sample the data and get the average size of a row from it. This value is then multiplied by the number of rows planned to be stored.

  • Data compression factor.

    To estimate the data compression factor, load a sample of the data into ClickHouse and compare the original size of the data with the size of the stored table. For example, clickstream data is typically compressed 6-10 times.

To calculate the final amount of data to store, apply the compression factor to the estimated amount of data. If you plan to store data in multiple replicas, multiply the estimated volume by the number of replicas.
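
The arithmetic above can be sketched as follows. All numbers are assumptions chosen for illustration (1 billion rows, 200 bytes per row on average, a compression factor of 8, 2 replicas), the function name estimate_storage_bytes is hypothetical, and shell integer division truncates, so treat the result as a rough estimate:

```shell
# estimate_storage_bytes ROWS AVG_ROW_BYTES COMPRESSION_FACTOR REPLICAS
# Hypothetical helper: raw volume = rows * average row size; divide by the
# compression factor and multiply by the number of replicas.
estimate_storage_bytes() {
  rows="$1"
  avg_row_bytes="$2"
  compression="$3"
  replicas="$4"
  echo $(( rows * avg_row_bytes / compression * replicas ))
}

# Assumed workload: 10^9 rows, 200 bytes per row, 8x compression, 2 replicas.
estimate_storage_bytes 1000000000 200 8 2   # prints 50000000000 (about 50 GB)
```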

For distributed ClickHouse deployments (clusters), we recommend at least a 10G-class network connection.

Network bandwidth is critical for processing distributed queries with large amounts of intermediate data. Additionally, network speed affects replication processes.

Origin blog.csdn.net/u012206617/article/details/133034245