Install and deploy ByConity on a Kubernetes cluster in the cloud

Understand rapid ByConity deployment in one article

1. Preface

ByConity is ByteDance's open-source data warehouse system for the modern data stack. It applies a large number of mature database technologies, such as a column storage engine, MPP execution, intelligent query optimization, vectorized execution, Codegen, indexing, and data compression. It is suitable for Online Analytical Processing (OLAP) scenarios and light-load data warehouse scenarios, including but not limited to interactive analysis, real-time APP monitoring, and stream data processing and analysis.

Below we walk through, with detailed screenshots and step-by-step instructions, how to deploy and run ByConity on a public cloud.

2. Configure the deployment

2.1 Resource preparation

According to the official recommendations for a test environment:

Operating system: CentOS 8.2, using the public network yum source.

In terms of hardware, the local disks of the Server and Worker are mainly used to store temporary data and log files during writes, and the Worker's local disk also holds the data cache. The disk size therefore needs to be sized according to the configured DiskCache size and the volume of data written.

My deployment, sized to meet the performance requirements of each component, is shown below:

Component        CPU  Memory  Disk   Network           Instances
TSO              1    1G      5G     Gigabit Ethernet  1
Server           8    32G     100G   Gigabit Ethernet  1
Worker           4    16G     110G   Gigabit Ethernet  1
DaemonManager    1    2G      40G    Gigabit Ethernet  1
ResourceManager  1    2G      40G    Gigabit Ethernet  1

All cloud resources are deployed on Huawei Cloud


2.2 Basic server configuration

2.2.1 Install and set up kubectl in the local environment

kubectl is the Kubernetes command-line tool. It communicates with a Kubernetes cluster from the command line or from scripts and performs various operations, including:

  1. Deploy and manage applications: kubectl can use YAML or JSON files to define and create Kubernetes resource objects such as Deployments, Services, Pods, ReplicaSets, ConfigMaps, etc. You can use kubectl to create, update, delete, and view these resources, as well as monitor their status and logs.
  2. Expand and manage the cluster: kubectl can manage the various components of a Kubernetes cluster from the command line, such as nodes, namespaces, storage volumes, service accounts, etc. You can use kubectl to grow or shrink the cluster, add or remove nodes, and perform other cluster-management operations.
  3. Debugging and troubleshooting: kubectl provides commands and options for diagnosing and debugging issues in a Kubernetes cluster. You can view Pod logs, execute commands inside containers, obtain cluster events, and so on.
  4. Resource monitoring and adjustment: kubectl can be used to view the status of the cluster and its resources, monitor resource usage, and scale the number of resource replicas up or down.

kubectl provides powerful functionality and flexibility to effectively manage and operate Kubernetes clusters.
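
For illustration, here are a few typical kubectl invocations covering the categories above (the resource and file names are placeholders):

# Deploy applications from a manifest file
kubectl apply -f deployment.yaml
# Inspect cluster nodes and namespaces
kubectl get nodes
kubectl get namespaces
# Debugging: view Pod logs and open a shell inside a container
kubectl logs my-pod
kubectl exec -it my-pod -- bash
# Scale a Deployment to 3 replicas
kubectl scale deployment my-deployment --replicas=3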

Download the latest release version with the following command:

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"

Download the kubectl checksum file:

curl -LO "https://dl.k8s.io/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"

Verify the kubectl executable file based on the checksum file:

echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check

When the verification passes, the output is:

kubectl: OK

When verification fails, sha256sum will exit with a non-zero value and print output similar to:

kubectl: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match

Install kubectl

sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Perform tests to ensure that the installed version is up to date:

kubectl version --client

For kubectl to discover and access a Kubernetes cluster, it needs a kubeconfig file, which is generated automatically when kube-up.sh creates a cluster or when a Minikube cluster is successfully deployed. By default, the kubectl configuration is stored in ~/.kube/config.
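
To see which configuration kubectl is currently using, or to point it at a different kubeconfig file (the path below is just a placeholder), you can do something like:

# Show the merged kubeconfig that kubectl is using
kubectl config view
# Point kubectl at an alternative kubeconfig file
export KUBECONFIG=$HOME/.kube/my-cluster-config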

Check whether kubectl is configured appropriately by getting the cluster status:

kubectl cluster-info

If a URL is returned, it means kubectl successfully accessed the cluster.

2.2.2 Install helm in the local environment

Helm uses a packaging format called chart. A chart is a collection of files that describe a related set of Kubernetes resources. A single chart might be used to deploy something simple, like a memcached pod, or something complex, like a complete web application stack with HTTP serving, database, cache, etc.

Charts are created by creating files in a specific directory tree, packaging them into versioned compressed packages, and then deploying them.
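
As a quick illustration, once Helm is installed (see below), helm create scaffolds a minimal chart whose directory tree looks roughly like this (trimmed):

helm create mychart
# mychart/
#   Chart.yaml        <- chart metadata: name, version, description
#   values.yaml       <- default configuration values
#   charts/           <- dependent sub-charts
#   templates/        <- Kubernetes manifests rendered with the values
#     deployment.yaml
#     service.yaml
#     _helpers.tpl
#     ...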

Helm has an installation script that automatically pulls the latest Helm version and installs it locally.

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

If you want to perform the installation directly, run

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
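
After installation, a quick sanity check:

helm version
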
2.2.3 Install kind and Docker
2.2.3.1 kind installation

Kind, short for Kubernetes IN Docker, is a tool that uses Docker containers as "nodes" to deploy a Kubernetes cluster environment. Kind is mainly used for testing Kubernetes itself; many projects that need a Kubernetes environment for testing use Kind in their CI process to quickly create a cluster, run the relevant test cases, and then tear it down.

Enter the following commands to install kind:

curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.9.0/kind-linux-amd64
chmod +x ./kind
mv ./kind /usr/local/bin/kind
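
Once installed, basic kind usage looks like this (creating and deleting a throwaway cluster):

# Create a default single-node cluster
kind create cluster
# List clusters managed by kind
kind get clusters
# Delete the cluster when finished
kind delete cluster
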
2.2.3.2 Docker installation

Install docker-ce
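
On a fresh CentOS host the Docker CE yum repository is usually not configured yet; if that is the case (an assumption about this environment), it can be added first:

# Add the upstream Docker CE repository (skip if it is already configured)
yum -y install yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo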

yum clean all
yum makecache fast
yum -y install docker-ce

Start the service through systemctl

systemctl start docker

3. Local Kubernetes cluster

3.1 Use Kind to configure a local Kubernetes cluster

Clone the deployment code locally:

git clone git@github.com:ByConity/byconity-deploy.git
cd byconity-deploy


Create a Kubernetes cluster with 1 control-plane node and 3 worker nodes:

kind create cluster --config examples/kind/kind-byconity.yaml
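
The topology comes from examples/kind/kind-byconity.yaml in the repository. For readers who want to adapt it, a minimal kind configuration with one control-plane node and three worker nodes generally looks like the following sketch (this is not the actual file from the repository):

cat <<EOF > my-kind-cluster.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker
EOF
kind create cluster --config my-kind-cluster.yaml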


Test to make sure your local kind cluster is ready:

kubectl cluster-info


3.2 Initialize the ByConity demo cluster

# Install with fdb CRD first
helm upgrade --install --create-namespace --namespace byconity -f ./examples/kind/values-kind.yaml byconity ./chart/byconity --set fdb.enabled=false

# Install with fdb cluster
helm upgrade --install --create-namespace --namespace byconity -f ./examples/kind/values-kind.yaml byconity ./chart/byconity
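
Once the command returns, the state of the release can be checked:

helm -n byconity list
helm -n byconity status byconity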


Wait until all Pods are ready.

kubectl -n byconity get po
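
The Pods may take several minutes to pull images and start. To block until they are all ready, something like the following can be used (the 600-second timeout is an arbitrary choice):

kubectl -n byconity wait --for=condition=Ready pod --all --timeout=600s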


Run a test:

kubectl -n byconity exec -it sts/byconity-server -- bash
root@byconity-server-0:/# clickhouse client

172.16.1.1 :)

Execute some SQL:

CREATE DATABASE IF NOT EXISTS test;
USE test;
DROP TABLE IF EXISTS test.lc;
CREATE TABLE test.lc (b LowCardinality(String)) engine=CnchMergeTree ORDER BY b;
INSERT INTO test.lc SELECT '0123456789' FROM numbers(100000000);
SELECT count(), b FROM test.lc group by b;
DROP TABLE IF EXISTS test.lc;
DROP DATABASE test;
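
To connect from outside the cluster instead of exec-ing into the server Pod, a port-forward can be used. The service name and port below are assumptions based on the chart defaults and may need to be adjusted (check kubectl -n byconity get svc):

# Forward the server's ClickHouse TCP port to localhost (service name and port 9000 are assumptions)
kubectl -n byconity port-forward svc/byconity-server 9000:9000
# In another terminal, assuming a local clickhouse client is installed:
clickhouse client --port 9000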

4. ByConity evaluation

ByConity is a data warehouse built on ClickHouse and designed around the shift to modern cloud architecture. It adopts a cloud-native architecture to meet data warehouse users' needs for elastic scaling, read-write separation, resource isolation, and strong data consistency, while providing excellent query and write performance. It applies a large number of mature OLAP technologies, such as a column storage engine, MPP execution, intelligent query optimization, vectorized execution, Codegen, indexing, and data compression, and also introduces technical innovations specific to cloud scenarios and the storage-compute separation architecture.

A variety of deployment methods are available to meet the needs of different users, and the deployment process is not particularly complicated, so ByConity is worth recommending.
