Doris cluster machine planning
The following is the server planning and configuration for the Doris cluster, which currently uses a 3-node mixed deployment: 3 FE instances and 3 BE instances together form the minimum-configuration cluster.
| Server name  | Server IP     | Role   |
| ------------ | ------------- | ------ |
| Doris-node01 | 10.19.162.103 | FE, BE |
| Doris-node02 | 10.19.162.104 | FE, BE |
| Doris-node03 | 10.19.162.106 | FE, BE |
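If hostname-based access is wanted, the three nodes can be mapped in /etc/hosts on every machine. A minimal sketch that prints the entries from the table above (appending to /etc/hosts is left to the operator and requires root; the `gen_hosts_entries` helper is my own):

```shell
#!/usr/bin/env bash
# Print /etc/hosts entries for the three Doris nodes in the table above.
gen_hosts_entries() {
  cat <<'EOF'
10.19.162.103 Doris-node01
10.19.162.104 Doris-node02
10.19.162.106 Doris-node03
EOF
}

gen_hosts_entries
# e.g. append with: gen_hosts_entries | sudo tee -a /etc/hosts
```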
Introduction to Doris
Official site: Homepage - Apache Doris
Introduction to core components
- Frontend (FE): Developed in Java. Stores and maintains cluster metadata; responsible for receiving and parsing user query requests, planning query execution, and scheduling queries and collecting their results.
- Backend (BE): Developed in C++. The compute and storage nodes of the Doris system. Based on the physical execution plan generated by FE, BE executes queries in a distributed, multi-node parallel fashion and the results are then aggregated; BE also stores the data in multiple replicas.
- Broker: An independent, stateless process. It encapsulates a file-system interface and gives Doris the ability to read files from remote storage systems, including HDFS, S3, BOS, etc.
Doris port list
Correspondence between Doris image version and compilation environment
Compiling Doris
The build image used here is apache/doris:build-env-for-0.15.0, and the Doris source is apache-doris-0.15.0-incubating-src.tar.gz.
# 1. Pull the image (run on the 10.19.162.103 host)
docker pull apache/doris:build-env-for-0.15.0
# 2. Mount host directories into the container: packages downloaded by Maven are cached on the host to avoid repeated downloads, and the compiled Doris files are saved to the host for easy deployment
docker run -it -v /root/dorisenv/.m2:/root/.m2 -v /root/dorisenv/incubator-doris-DORIS-0.13-release/:/root/incubator-doris-DORIS-0.13-release/ apache/doris:build-env-for-0.15.0
# /root/dorisenv/.m2 and /root/dorisenv/incubator-doris-DORIS-0.13-release/ are the directories mounted from the host, as shown in the figure below
Note: the following steps 3, 4, and 5 are all executed inside the Docker container.
# 3. Download the Doris source package inside the container; the download path can be found at the location shown in the figure below
wget https://mirrors.tuna.tsinghua.edu.cn/apache/doris/0.15.0-incubating/apache-d
The source archive above is about 15 MB. After downloading, proceed to step 4.
# 4. Extract the archive
tar -zxvf apache-doris-0.15.0-incubating-src.tar.gz
# 5. Switch to JDK 8 (this demo uses a JDK 8 environment, so switch to JDK 8; if you use JDK 11 or 16, switch to the corresponding version)
$ alternatives --set java java-1.8.0-openjdk.x86_64
$ alternatives --set javac java-1.8.0-openjdk.x86_64
$ export JAVA_HOME=/usr/lib/jvm/java-1.8.0
# 6. Enter the source directory, then build
sh build.sh
(Screenshots of part of the build process omitted; the build completed after about 20 minutes.)
The files generated after compilation are in the following directory:
Where be is the backend, fe is the frontend, and udf is the user-defined function directory.
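A quick way to sanity-check the compiled output is to verify that the three directories mentioned above exist. A small sketch (the `check_output` function name is my own, not part of Doris):

```shell
#!/usr/bin/env bash
# Verify that a Doris build output directory contains be, fe and udf.
check_output() {
  local dir="$1" sub missing=0
  for sub in be fe udf; do
    if [ -d "$dir/$sub" ]; then
      echo "found: $sub"
    else
      echo "missing: $sub"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `check_output /root/apache-doris-0.15.0-incubating-src/output` after a successful build.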
If the following error occurs:
Plugin net.sourceforge.czt.dev:cup-maven-plugin:1.6-cdh or one of its dependencies could not be resolved: Could not find artifact net.sourceforge.czt.dev:cup-maven-plugin:jar:1.6-cdh in spring-plugins (https://repo.spring.io/plugins-release/)
Modify fe/pom.xml. The pom.xml before adjustment is as follows:
<repositories>
    <repository>
        <id>central</id>
        <name>central maven repo https</name>
        <url>https://repo.maven.apache.org/maven</url>
    </repository>
    <!-- for java-cup -->
    <repository>
        <id>cloudera-thirdparty</id>
        <url>https://repository.cloudera.com/content/repositories/third-party/</url>
    </repository>
    <!-- for bdb je -->
    <repository>
        <id>oracleReleases</id>
        <url>http://download.oracle.com/maven</url>
    </repository>
</repositories>
<pluginRepositories>
    <!-- for cup-maven-plugin -->
    <pluginRepository>
        <id>spring-plugins</id>
        <url>https://repo.spring.io/plugins-release/</url>
    </pluginRepository>
    <pluginRepository>
        <id>cloudera-public</id>
        <url>https://repository.cloudera.com/artifactory/public/</url>
    </pluginRepository>
</pluginRepositories>
The adjusted pom.xml is as follows:
<repositories>
    <repository>
        <id>central</id>
        <name>central maven repo https</name>
        <url>https://repo.maven.apache.org/maven2</url>
    </repository>
    <!-- for java-cup -->
    <repository>
        <id>cloudera-public</id>
        <url>https://repository.cloudera.com/artifactory/public/</url>
    </repository>
    <!-- for bdb je -->
    <repository>
        <id>oracleReleases</id>
        <url>https://download.oracle.com/maven</url>
    </repository>
</repositories>
<pluginRepositories>
    <!-- for cup-maven-plugin -->
    <pluginRepository>
        <id>spring-plugins</id>
        <url>https://repository.cloudera.com/artifactory/ext-release-local</url>
    </pluginRepository>
    <pluginRepository>
        <id>cloudera-public</id>
        <url>https://repository.cloudera.com/artifactory/public/</url>
    </pluginRepository>
</pluginRepositories>
Continue the build by executing the following commands in the build environment:
cd /root/apache-doris-0.15.0-incubating-src
sh build.sh
FE installation
# Export the compiled artifacts (run on the host)
# List all containers, including stopped ones
docker ps -a
# Start the build container
docker start laughing_haslett
# Attach to the container
docker attach laughing_haslett
# Copy the compiled files out of the container
docker cp laughing_haslett:/root/apache-doris-0.15.0-incubating-src/output/ /root/dorisenv
# To keep the runtime environment consistent, export the container's JDK and configure environment variables
docker cp laughing_haslett:/usr/lib/jvm/java-11-openjdk-11.0.12.0.7-0.el7_9.x86_64/ /usr/lib/jvm/
# scp it to the other machines
scp -r /usr/lib/jvm/java-11-openjdk-11.0.12.0.7-0.el7_9.x86_64/ [email protected]:/usr/lib/jvm/
# Configure environment variables
vi /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.12.0.7-0.el7_9.x86_64
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
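After sourcing /etc/profile, it is worth confirming that JAVA_HOME really points at a JDK. A minimal sketch (the `check_java_home` function name and messages are illustrative, not part of Doris):

```shell
#!/usr/bin/env bash
# Check that a candidate JAVA_HOME directory contains an executable bin/java.
check_java_home() {
  local home="$1"
  if [ -n "$home" ] && [ -x "$home/bin/java" ]; then
    echo "valid: $home"
  else
    echo "invalid: $home (no executable bin/java)"
    return 1
  fi
}
```

Run `check_java_home "$JAVA_HOME"` on each machine after the scp step.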
Configure FE
- Copy the FE deployment files to the designated node: copy the fe folder from the compiled output to /root/apache-doris-0.15 on the FE node
cp -r fe /root/apache-doris-0.15
- Configure environment variables
vi /etc/profile
# Place these above JAVA_HOME
# Doris home
export DORIS_HOME=/root/apache-doris-0.15
export PATH=$PATH:$DORIS_HOME/bin
# Reload environment variables
source /etc/profile
Configure fe.conf
Modify the IP binding (priority_networks):
Distribute the installation directory to two other nodes
scp -r /root/apache-doris-0.15 [email protected]:/root/apache-doris-0.15
After distribution, configure environment variables and modify priority_networks
Start FE; 10.19.162.103 starts in the leader role:
sh /root/apache-doris-0.15/fe/bin/start_fe.sh --daemon
# If startup fails, configure JAVA_HOME and run: source /etc/profile
Logs are stored in the fe/log/ directory by default
After startup completes, the FE web UI can be accessed at http://10.19.162.103:8030/
Use DataGrip to change the password
- Remove the MariaDB libraries bundled with the operating system
rpm -qa | grep mariadb
rpm -e --nodeps mariadb-libs-5.5.68-1.el7.x86_64
Connect to Doris (the password is empty):
Set the password:
SET PASSWORD FOR 'root' = PASSWORD('root');
BE installation
Configure BE
- Copy the BE deployment files to the designated node: copy the be folder from the compiled output to /root/apache-doris-0.15 on the BE node
cp -r be /root/apache-doris-0.15
- Create the storage_root_path directories
mkdir -p /root/apache-doris-0.15/be/storage1
mkdir -p /root/apache-doris-0.15/be/storage2
After execution, the directory information is as follows:
- Modify conf/be.conf: add the IP and set the storage space
# Set the IP
priority_networks = 10.19.162.110/24
# The ",10" below means that directory's maximum storage space is 10 GB
storage_root_path = /root/apache-doris-0.15/be/storage1,10;/root/apache-doris-0.15/be/storage2
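The storage_root_path format separates directories with `;` and takes an optional `,<capacity in GB>` suffix per directory; without a suffix, the remaining disk space can be used. As an illustration, a sketch that splits such a value (`parse_storage_root_path` is a made-up helper, not part of Doris):

```shell
#!/usr/bin/env bash
# Split a Doris storage_root_path value into directory/capacity pairs.
# Format: dir[,capacity_gb];dir[,capacity_gb];...
parse_storage_root_path() {
  local value="$1" entry dir cap
  local -a entries
  IFS=';' read -ra entries <<< "$value"
  for entry in "${entries[@]}"; do
    dir="${entry%%,*}"
    if [ "$entry" = "$dir" ]; then
      cap="unlimited"          # no ",N" suffix: remaining disk space is usable
    else
      cap="${entry#*,} GB"
    fi
    echo "$dir -> $cap"
  done
}

parse_storage_root_path "/root/apache-doris-0.15/be/storage1,10;/root/apache-doris-0.15/be/storage2"
# prints:
#   /root/apache-doris-0.15/be/storage1 -> 10 GB
#   /root/apache-doris-0.15/be/storage2 -> unlimited
```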
- Add the BE nodes; the port is the heartbeat_service_port configured on each BE, which defaults to 9050
ALTER SYSTEM ADD BACKEND "10.19.162.111:9050";
ALTER SYSTEM ADD BACKEND "10.19.162.110:9050";
- Modify the open-file and process limits
vi /etc/security/limits.conf
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
These settings only take effect after rebooting the machine (all BE nodes must be configured); otherwise BE will fail to start.
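After the reboot, whether the limit is in effect can be checked with `ulimit -n`. A small helper sketch for that check (the `check_nofile` function and its messages are my own; the 65535 threshold matches the values above):

```shell
#!/usr/bin/env bash
# Compare an open-file limit against the 65535 minimum required above.
check_nofile() {
  local current="$1" required=65535
  if [ "$current" -ge "$required" ]; then
    echo "OK: nofile=$current"
  else
    echo "TOO LOW: nofile=$current (need >= $required)"
    return 1
  fi
}
```

Run `check_nofile "$(ulimit -n)"` on every BE node after rebooting.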
- Distribute the be directory to the other nodes (10.19.162.104 and 10.19.162.106) and modify priority_networks in be.conf on each node
scp -r /root/apache-doris-0.15/be/ [email protected]:/root/apache-doris-0.15/
- Log in to machines 103, 104, and 106 and start BE on each. Note: in this step, storage_root_path = /root/apache-doris-0.15/be/storage1,10;/root/apache-doris-0.15/be/storage2 in be.conf must be configured correctly, otherwise startup will fail; in that case, delete and recreate storage1 and storage2.
# It is safe to run the start script a second time: a pid-conflict error means the first start succeeded; otherwise the start failed. If so, try rebooting the machine and starting again; if it still fails, check the logs for the specific cause.
sh /root/apache-doris-0.15/be/bin/start_be.sh --daemon
Logs are stored in the be/log/ directory by default
[root@lab-hosta-vm02 log]# pwd
/root/apache-doris-0.15/be/log
[root@lab-hosta-vm02 log]# ls -l
total 4808
lrwxrwxrwx 1 root root      27 Oct 15 14:40 be.INFO -> be.INFO.log.20221015-144010
-rw-r--r-- 1 root root 4860906 Oct 16 23:32 be.INFO.log.20221015-144010
-rw-r--r-- 1 root root     165 Oct 16 23:26 be.out
lrwxrwxrwx 1 root root      30 Oct 16 09:03 be.WARNING -> be.WARNING.log.20221016-090328
-rw-r--r-- 1 root root   52587 Oct 16 23:23 be.WARNING.log.20221016-090328
[root@lab-hosta-vm02 log]# tail -f be.INFO.log.20221015-144010
- After BE is started on all three servers, use mysql-client or another client tool to log in to the FE on 103, and add the backend nodes via SQL commands
mysql -uroot -h 10.19.162.103 -P 9030 -p
# The password is root
# Or log in via a client tool
# host is the BE node's IP; port is the heartbeat_service_port from be.conf (default 9050)
ALTER SYSTEM ADD BACKEND "10.19.162.103:9050";
ALTER SYSTEM ADD BACKEND "10.19.162.104:9050";
ALTER SYSTEM ADD BACKEND "10.19.162.106:9050";
Check the status again after adding:
SHOW PROC '/backends';
If the Alive field is true, it means that the BE status is normal and has joined the cluster.
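For scripted health checks, the Alive column can be extracted from the tab-separated output of `mysql -e "SHOW PROC '/backends'"`. A sketch that locates the column by its header name (the `count_dead_backends` helper and the sample rows are my own; real output has many more columns):

```shell
#!/usr/bin/env bash
# Count backends whose Alive column is not "true" in tab-separated
# output of: mysql ... -e "SHOW PROC '/backends'"
count_dead_backends() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Alive") col = i; next }
    col && $col != "true" { dead++ }
    END { print dead + 0 }
  '
}

# Fabricated example input with only three of the real columns:
printf 'BackendId\tIP\tAlive\n1\t10.19.162.103\ttrue\n2\t10.19.162.104\tfalse\n' |
  count_dead_backends   # prints 1
```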
Broker node deployment
Broker is deployed as a plug-in, independently of the Doris deployment itself. It is recommended to deploy a Broker on every FE and BE node. Broker is the process used to access external data sources, HDFS by default. Build the broker and distribute the compiled apache_hdfs_broker as follows.
# Enter the container laughing_haslett
docker start laughing_haslett
docker attach laughing_haslett
cd apache-doris-0.15.0-incubating-src
cd fs_brokers/
cd apache_hdfs_broker/
sh build.sh
After compilation, there is an output directory, where the compiled files are stored.
# Export the compiled files (pwd shows the current directory)
docker cp laughing_haslett:/root/apache-doris-0.15.0-incubating-src/fs_brokers/apache_hdfs_broker/output/apache_hdfs_broker/ /root/dorisenv/output/
# Put apache_hdfs_broker into /root/apache-doris-0.15 and distribute it to the other machines
cp -r apache_hdfs_broker/ /root/apache-doris-0.15/
scp -r /root/apache-doris-0.15/apache_hdfs_broker/ [email protected]:/root/apache-doris-0.15/
# Start the Broker on each node
sh /root/apache-doris-0.15/apache_hdfs_broker/bin/start_broker.sh --daemon
The jps output is as follows:
- Add the Broker nodes via SQL commands
mysql -uroot -h 10.19.162.103 -P 9030 -p
# The password is root
# Or connect via a client (e.g. DBeaver) to the FE instance mentioned in the earlier steps
# host is the IP of the node where the Broker runs; port is broker_ipc_port from the Broker config file (apache_hdfs_broker.conf)
alter system add broker broker_030406 "10.19.162.103:8000","10.19.162.104:8000","10.19.162.106:8000";
# Check the Broker status
show proc '/brokers';
Note: in a production environment, all instances should be started by a daemon supervisor such as Supervisor, so that a process that exits is automatically restarted. To start via a daemon supervisor in versions 0.9.0 and earlier, you need to modify each start_xx.sh script to remove the trailing & symbol; starting from version 0.10.0, just call sh start_xx.sh to start.
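As a sketch of the Supervisor approach mentioned above, a program section for one BE might look like the following (the program name and log paths are illustrative assumptions, not from this document):

```ini
[program:doris_be]
command=/root/apache-doris-0.15/be/bin/start_be.sh
directory=/root/apache-doris-0.15/be
autostart=true
autorestart=true
stopsignal=TERM
user=root
stdout_logfile=/var/log/supervisor/doris_be.out.log
stderr_logfile=/var/log/supervisor/doris_be.err.log
```

Note that the script is invoked without --daemon here, so the BE process stays in the foreground for Supervisor to manage.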
FE expansion and contraction
Command introduction
- Add a Follower or Observer
First connect to a started FE and execute:
# An observer can only serve reads; followers provide high availability: if the leader goes down, the followers can re-elect a new leader to take over
alter system add follower "ip:port";
or
alter system add observer "ip:port";
- Delete an FE node
alter system drop follower[observer] "ip:port";
Operation
- Start the FE service on 104 and 106 (the FE on 10.19.162.103 was started earlier in the leader role). Special note: the --helper parameter is only required the first time a follower or observer is started.
# Log in to 104 and 106 and start FE on each; 103 must be specified as the helper (leader)
sh /root/apache-doris-0.15/fe/bin/start_fe.sh --helper 10.19.162.103:9010 --daemon
At this point, all three nodes have FE instances running. You also need to connect with mysql-client to the already-started FE on 103 and register the new FEs via SQL. Since we want to deploy 3 FEs, execute the following two commands in the SQL window:
# Add the FEs
ALTER SYSTEM ADD FOLLOWER "10.19.162.104:9010";
ALTER SYSTEM ADD FOLLOWER "10.19.162.106:9010";
View the FE instances again:
show proc '/frontends';
BE expansion and contraction
Expanding or shrinking BE nodes does not affect the currently running system or in-flight tasks, nor current system performance. Data rebalancing happens automatically: depending on how much data the cluster already holds, it will return to a load-balanced state within a few hours to a day.
- Connect to the started FE using mysql-client or a client tool (e.g. DBeaver)
- Add BE nodes by executing the following SQL command
alter system add backend "ip:port";
# Example: alter system add backend "10.19.162.103:9050";
- Delete a BE node
# There are two ways to remove a node: drop and decommission
alter system dropp backend "node01:9050"; # hard delete (the double "p" in dropp is intentional in Doris)
or
alter system decommission backend "node01:9050"; # soft delete; this is the recommended way
View all BEs:
show proc '/backends';
The BE expansion in this case was already completed in the steps above; BE is a 3-node cluster. For details, refer to the BE installation steps above: "add the backend nodes via SQL commands".
Broker expansion and contraction
There is no hard requirement on the number of Broker instances; usually it is enough to deploy one per physical machine. Brokers can be added and removed with the following commands.
alter system add broker broker_030406 "10.19.162.104:8000"
alter system add broker broker_030406 "10.19.162.106:8000"
alter system drop all broker broker_030406
Broker expansion has already been completed in the steps above. For the specific operation, refer to the Broker deployment steps above: "Add the Broker nodes via SQL commands".
Common commands
sh /root/apache-doris-0.15/fe/bin/start_fe.sh --daemon
sh /root/apache-doris-0.15/be/bin/start_be.sh --daemon
mysql -uroot -h 10.19.162.111 -P 9030 -p
show proc '/frontends';
show proc '/backends';
alter system add backend "10.19.162.111:9050";
alter system dropp backend "10.19.162.111:9050";
alter system decommission backend "10.19.162.110:9050";
SHOW PROC '/backends'\G