1 Introduction to DataSophon
1.1 Vision of DataSophon
DataSophon is committed to the rapid deployment, management, monitoring, and automated operation and maintenance of cloud-native big data platforms, helping you quickly build a stable, efficient, and elastically scalable big data platform.
1.2 What is DataSophon
"The Three-Body Problem", winner of the Hugo Award, the highest honor in science fiction literature, is well known for its stunning "hard science fiction" style. One pivotal creation in the trilogy is the Sophon: a nine-dimensional proton unfolded into two dimensions, turned into a supercomputer by etching circuitry onto it, and then folded back into the microscopic eleven dimensions, where it monitors humanity's every move and, via quantum-entanglement instant communication, reports back to the Trisolaran civilization four light-years away. Put bluntly, the Sophon is an AI real-time remote monitoring and management platform deployed on Earth by the Trisolaran civilization.
DataSophon is a similar management platform, but with the opposite purpose: whereas the Sophon exists to lock down humanity's basic science and prevent a technological explosion, DataSophon is dedicated to the automated monitoring, operation and maintenance, and management of big data components and nodes, helping you quickly build a stable and efficient big data cluster service.
1.3 Main features of DataSophon
① Rapid deployment, can quickly complete the deployment of 300-node big data clusters
② Compatible with complex environments, few dependencies make it easy to adapt to various complex environments
③ Comprehensive and rich monitoring indicators, based on production practice to display the monitoring indicators that users are most concerned about
④ Flexible and convenient alarm service, which can realize user-defined alarm groups and alarm indicators
⑤ Strong scalability, users can integrate or upgrade big data components through configuration
1.4 Overall Architecture
1.5 Integrated components
All integrated components have been compatibility-tested and run stably on a cluster of 300+ nodes with a daily processing volume of roughly 400 billion records. Even at this data volume, the tuning cost of the major components stays low, because the platform displays by default the configurations users care about and need to tune.
Serial number | Name | Version | Description |
---|---|---|---|
1 | HDFS | 3.3.3 | Distributed big data storage |
2 | YARN | 3.3.3 | Distributed resource scheduling and management platform |
3 | ZooKeeper | 3.5.10 | Distributed coordination system |
4 | Flink | 1.15.2 | Real-time computing engine |
5 | DolphinScheduler | 3.1.1 | Distributed, easily scalable visual workflow task scheduling platform |
6 | StreamPark | 1.2.3 | Extremely fast stream-processing development framework; a cloud-native platform integrating streaming, batch, and data lake storage |
7 | Spark | 3.1.3 | Distributed computing system |
8 | Hive | 3.1.0 | Offline data warehouse |
9 | Kafka | 2.4.1 | High-throughput distributed publish-subscribe messaging system |
10 | Trino | 367 | Distributed SQL interactive query engine |
11 | Doris | 1.1.5 | A new generation of extremely fast full-scenario MPP database |
12 | HBase | 2.4.16 | Distributed columnar storage database |
13 | Ranger | 2.1.0 | Access control framework |
14 | ElasticSearch | 7.16.2 | High-performance search engine |
15 | Prometheus | 2.17.2 | High-performance monitoring metric collection and alerting system |
16 | Grafana | 9.1.6 | Monitoring analysis and data visualization suite |
17 | AlertManager | 0.23.0 | Alert notification management system |
2 Environment preparation
2.0 DataSophon installation package
Link: https://pan.baidu.com/s/1QWTMadCGLiAL-XqeS6AygQ
Extraction code: 2gd2
Select the DataSophon version to install, along with the corresponding datasophon-manager version. This article takes the latest version, DataSophon 1.1.1, as an example.
2.1 Network requirements
For all components on each machine to operate normally, the following network ports must be available:
components | default port | illustrate |
---|---|---|
DDHApplicationServer | 8081, 2551, 8586 | 8081 is the http server port, 2551 is the rpc communication port, 8586 is the jmx port |
WorkerApplicationServer | 2552, 9100, 8585 | 2552 is the rpc communication port, 8585 is the jmx port, 9100 is the host data collector port |
nginx | 8888 | Provide UI communication port |
Note:
① DDHApplicationServer is the API interface layer (the web backend), mainly responsible for handling requests from the front-end UI layer. The service exposes a unified RESTful API to serve external requests.
② WorkerApplicationServer is responsible for executing the instructions sent by DDHApplicationServer, including service installation, start, stop, restart and other instructions.
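Before installation it can be useful to confirm that none of these ports is already occupied on the target hosts. A quick check, using the port list from the table above, might look like:

```shell
# List listeners on the DataSophon-related ports; no output from grep means the ports are free
ss -lnt | grep -E ':(8081|2551|8586|2552|9100|8585|8888)\b' || echo "ports are free"
```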
2.2 configure hosts
Hostnames and host mappings need to be configured on all machines in the big data cluster.
Configure the hostname: hostnamectl set-hostname hostname
Configure the /etc/hosts file
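For example, on a hypothetical three-node cluster (the hostnames and IP addresses below are placeholders — substitute your own), the two steps look like this:

```shell
# On each node, set that node's own hostname (ddp01 here is an example)
hostnamectl set-hostname ddp01

# Append the full cluster mapping to /etc/hosts on every node (example IPs)
cat >> /etc/hosts <<'EOF'
192.168.0.101 ddp01
192.168.0.102 ddp02
192.168.0.103 ddp03
EOF
```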
2.3 Turn off the firewall
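The guide does not list the commands; on a CentOS/RHEL-family host (which the later yum commands suggest), disabling the firewall typically looks like:

```shell
systemctl stop firewalld      # stop the firewall now
systemctl disable firewalld   # keep it off across reboots
```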
2.4 Cluster passwordless login
Passwordless SSH login is required from the deployment machine (the DataSophon node) to the master and slave nodes of the big data services.
Configure password-free
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
Passwordless login between cluster nodes
ssh-copy-id -i ~/.ssh/id_rsa.pub root@<hostname>
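To distribute the key to every node in one go (hostnames are placeholders), a simple loop can be used:

```shell
# Copy the public key to each cluster node; you will be prompted for each root password once
for host in ddp01 ddp02 ddp03; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@"$host"
done
```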
2.5 Environmental requirements
A JDK must be installed. MySQL 5.7.x is recommended, with SSL turned off.
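A quick way to verify both prerequisites on a node (output formats vary by distribution):

```shell
java -version     # should print an installed JDK version
mysql --version   # should report a 5.7.x release
```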
2.6 Create directory
mkdir -p /opt/datasophon/DDP/packages
Upload the downloaded deployment package to the /opt/datasophon/DDP/packages directory as the project deployment package warehouse address.
Note: this directory must exist before installation; otherwise an error is reported that the directory cannot be found, as follows:
3 deployment
3.1 Deploy mysql
Note that MySQL's SSL feature must be turned off. During deployment, some components execute SQL to create database tables; since MySQL configuration differs across environments, adjust it according to how the SQL executes.
3.1.1 Turn off ssl
#Use the following command to check whether SSL is off; if have_ssl is YES, SSL is enabled
SHOW VARIABLES LIKE '%ssl%';
Modify the MySQL configuration file my.cnf by adding the following content:
#disable_ssl
skip_ssl
This setting tells MySQL not to use the SSL protocol. Back up the configuration file before modifying it, so you can recover if something goes wrong.
Restart the mysql service
After modifying the my.cnf file, you need to restart MySQL to make the modification take effect. You can restart MySQL with the following command:
service mysqld restart
Check again; the value of have_ssl is now DISABLED.
3.1.2 Execute the initialization script
Execute the following database script:
CREATE DATABASE IF NOT EXISTS datasophon DEFAULT CHARACTER SET utf8;
grant all privileges on *.* to datasophon@"%" identified by 'datasophon' with grant option;
GRANT ALL PRIVILEGES ON *.* TO 'datasophon'@'%';
FLUSH PRIVILEGES;
Execute datasophon.sql under the sql directory of the datasophon-manager installation directory to create a data table.
source datasophon.sql
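Equivalently, the script can be loaded non-interactively from the shell; the manager installation path below is an assumption — substitute your own:

```shell
mysql -udatasophon -pdatasophon datasophon < /opt/datasophon/datasophon-manager-1.1.1/sql/datasophon.sql
```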
3.2 Decompression
Unzip datasophon-manager-{version}.tar.gz in the installation directory. After decompression, you will see the following directory layout:
tar -zxvf datasophon-manager-1.1.1.tar.gz
bin: startup scripts
conf: configuration files
lib: jar packages the project depends on
logs: project log directory
jmx: jmx plugin
3.3 Install and configure nginx
Note: versions earlier than 1.1.1 must perform this step and configure nginx; version 1.1.1 and later can skip it.
3.3.1 Install dependent packages
yum -y install gcc zlib zlib-devel pcre-devel openssl openssl-devel
3.3.2 Download and decompress
#create a directory
cd /usr/local
mkdir nginx
cd nginx
#download the tar package
wget http://nginx.org/download/nginx-1.13.7.tar.gz
tar -xvf nginx-1.13.7.tar.gz
3.3.3 install nginx
#enter the nginx source directory
cd /usr/local/nginx/nginx-1.13.7
#run configure; add two modules in view of installing an SSL certificate later
./configure --with-http_stub_status_module --with-http_ssl_module
#run make
make
#run make install
make install
3.3.4 Start nginx service
/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
If you see the nginx welcome page, nginx was installed successfully.
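Without a browser, the same check can be made from the shell (assuming nginx is still listening on the default port 80 before nginx.conf is changed):

```shell
# A 200 response (the "Welcome to nginx!" page) means the server is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost/
```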
3.3.5 Configure nginx.conf
# open the configuration file
vim /usr/local/nginx/conf/nginx.conf
Note: here you need to obtain the front-end package dist.zip and unzip it.
Add a new server block:
server {
    listen       8888;             # UI access port (change as needed)
    server_name  localhost;
    #charset koi8-r;
    #access_log  /var/log/nginx/host.access.log  main;
    location / {
        root   /usr/local/nginx/dist;   # path of the unpacked front-end dist directory (change as needed)
        index  index.html index.htm;
    }
    location /ddh {
        proxy_pass http://doris-1:8081; # backend API address (change as needed)
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header x_real_ip $remote_addr;
        proxy_set_header remote_addr $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_http_version 1.1;
        proxy_connect_timeout 4s;
        proxy_read_timeout 30s;
        proxy_send_timeout 12s;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    #error_page  404              /404.html;
    # redirect server error pages to the static page /50x.html
    #
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}
3.3.6 restart nginx
/usr/local/nginx/sbin/nginx -s reload
3.4 Modify configuration
Modify the database connection settings in the application.yml file in the conf directory:
spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource
    url: jdbc:mysql://192.168.5.189:3306/datasophon?useUnicode=true&characterEncoding=utf-8
    username: root
    password: root
    driver-class-name: com.mysql.jdbc.Driver
3.5 Start the service
#start
sh bin/datasophon-api.sh start api
#stop
sh bin/datasophon-api.sh stop api
#restart
sh bin/datasophon-api.sh restart api
After the deployment is successful, you can view the logs, which are stored in the logs folder:
logs/
├── ddh-api.log
├── ddh-api-error.log
└── api-{hostname}.out
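If the page does not come up after starting, tailing the API logs is usually the fastest diagnosis; errors here most often point at the database settings in conf/application.yml:

```shell
tail -n 100 logs/ddh-api-error.log   # recent errors, if any
tail -f logs/ddh-api.log             # follow the live log
```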
3.6 Visit the page
Access the front-end page at http://192.168.xx.xx:8081/ddh (replace the IP with your own). The default username and password are admin/admin123.
4 Create a cluster
After logging in to the system page, create a cluster on the cluster management page. DataSophon supports multi-cluster management and grants users cluster administrator privileges.
Click [Create Cluster], enter the cluster name, cluster code (cluster unique identifier), and cluster framework.
After the creation is successful, click [Configure Cluster]:
According to the prompts, enter the host list (note: the host names must match those set with hostnamectl set-hostname during environment preparation). The default SSH user is root and the default SSH port is 22.
After the configuration is complete, click [Next], the system starts to connect to the host and check the host environment.
After the host environment verification is successful, click [Next]. The host agent distribution step will automatically distribute the datasophon-worker component and start the WorkerApplicationServer.
After the distribution of the host management agent is complete, click [Next] to start deploying the service.
To initialize and configure the cluster, first choose to deploy three components: AlertManager, Grafana and Prometheus.
Click [Next] to assign the master service role deployment nodes of AlertManager, Grafana and Prometheus services. These three components need to be deployed on the same machine.
Click [Next] to assign the worker and client service roles of AlertManager, Grafana and Prometheus services to deploy nodes. If there are no worker and client service roles, you can skip and click [Next].
Modify the configuration of each service. The system has given the default configuration, and in most cases there is no need to modify it.
Click [Next] to start the service installation, and you can view the service installation progress in real time.
Click [Finish], and click [Enter] on the cluster management page to enter the cluster service component management page.
5 Add service
5.1 Install ZK
Select [Add Service] and choose ZooKeeper.
Assign the ZooKeeper master service role deployment nodes; ZooKeeper needs 3 or 5 nodes.
Zk has no worker and client service roles, just click [Next] to skip.
Modify the Zk service configuration according to the actual situation.
Click [Next] to install the zk service.
After the installation is successful, you can view the Zookeeper service overview page.
5.2 Install HDFS
Deploy HDFS: JournalNode needs three nodes, NameNode needs two, and ZKFC is deployed on the same machines as the NameNodes. Click [Next] as shown in the figure below,
then select the DataNode deployment nodes.
Modify the configuration according to the actual situation, such as modifying the DataNode data storage directory.
Click [Next] to start installing Hdfs.
After the installation is successful, you can view the HDFS service overview page.
5.3 Add Yarn service
Deploy YARN; ResourceManager needs two nodes for high availability, as shown in the figure below:
Click [Next] and select NodeManager to deploy nodes.
Modify the configuration according to the actual situation.
Wait for the installation to complete
After the installation is successful, you can view the YARN service overview page
5.4 Add hive service
5.4.1 Preparations
1) Create a Hive database in the database.
CREATE DATABASE IF NOT EXISTS hive DEFAULT CHARACTER SET utf8;
grant all privileges on *.* to hive@"%" identified by 'hive' with grant option;
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
FLUSH PRIVILEGES;
5.4.2 install hive
Select the nodes that need to install the hiveserver2 and metastore roles.
Select the nodes that need to install the hiveclient role.
Modify the configuration according to the actual situation, then wait for the installation to complete. After the installation succeeds, you can view the Hive service overview page.
5.5 Install dolphinscheduler service
5.5.1 Preparations
1) Initialize the DolphinScheduler database.
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'localhost' IDENTIFIED BY 'dolphinscheduler';
flush privileges;
2) Execute dolphinscheduler_mysql.sql in the /opt/datasophon/DDP/packages directory to create a dolphinscheduler database table
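Following the same pattern used later for StreamPark, the script can be loaded with the mysql client (credentials taken from the grants above):

```shell
mysql -udolphinscheduler -pdolphinscheduler dolphinscheduler < /opt/datasophon/DDP/packages/dolphinscheduler_mysql.sql
```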
5.5.2 install dolphinscheduler
Add dolphinscheduler service
Assign api-server/alert-server/master-server/worker-server roles
Modify the DolphinScheduler configuration according to the actual situation.
Start to install DolphinScheduler. After the installation is successful, you can see the DolphinScheduler overview page.
You can open the DolphinScheduler page through WebUi.
5.6 add trino
Click [Add Service] and select Trino.
Select TrinoCoordinator.
Select Trino Worker. Note: TrinoCoordinator and TrinoWorker should not be deployed on the same machine.
Pay attention to two configurations: "Trino maximum heap memory" and "maximum memory a single query can use on one node". The per-node query memory must not exceed 80% of the Trino maximum heap memory, and "the maximum available memory in total" equals the per-node query memory multiplied by the number of TrinoWorkers.
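As a worked example with hypothetical numbers — a 16 GB Trino heap and 3 TrinoWorkers — the constraints above translate to:

```shell
heap_gb=16                                 # "Trino maximum heap memory" (example value)
workers=3                                  # number of TrinoWorker nodes (example value)
per_node_limit=$(( heap_gb * 80 / 100 ))   # per-query memory on a single node must stay <= this
total=$(( per_node_limit * workers ))      # "maximum available memory in total" for a query
echo "per-node query memory limit: ${per_node_limit} GB"
echo "total query memory: ${total} GB"
```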
Click [Next] to start installing Trino. Wait for the installation to complete, after which you can see the Trino overview page.
Select the Trino WebUi link to open the Trino web page.
5.7 Add doris service
Click [Add Service] and select Doris.
Assign FE service roles to deploy nodes.
Assign BE service roles to deploy nodes.
Modify the Doris configuration as required; in particular, the FE priority network segment and BE priority network segment must be configured, e.g. 172.31.86.0/24.
Click [Next] to start installing Doris.
5.8 install kafka
Select the nodes where the Kafka broker role will be installed, and adjust the Kafka parameters according to the actual situation.
After Kafka is successfully installed, you can view Kafka details on the Kafka service overview page.
5.9 install ranger
Create a ranger database
CREATE DATABASE IF NOT EXISTS ranger DEFAULT CHARACTER SET utf8;
grant all privileges on *.* to ranger@"%" identified by 'ranger' with grant option;
GRANT ALL PRIVILEGES ON *.* TO 'ranger'@'%';
FLUSH PRIVILEGES;
Click [Add Service] and select Ranger.
Select the RangerAdmin deployment node.
Enter the database root user password, database address, Ranger data user password and other configuration information.
After the installation is successful, you can check the Ranger details on the Ranger service overview page.
5.10 Install StreamPark
Initialize the StreamPark database.
CREATE DATABASE streampark DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON streampark.* TO 'streampark'@'%' IDENTIFIED BY 'streampark';
GRANT ALL PRIVILEGES ON streampark.* TO 'streampark'@'localhost' IDENTIFIED BY 'streampark';
flush privileges;
Execute streampark.sql in the /opt/datasophon/DDP/packages directory to create a streampark database table.
use streampark;
source /opt/datasophon/DDP/packages/streampark.sql
Assign the StreamPark roles and modify the configuration according to the actual situation.
After successful installation, you can view the StreamPark overview page, and jump to the StreamPark user page through WebUi.
There are other components, such as Spark, Flink, HBase, Elasticsearch, and so on, which are not listed here one by one; interested readers can try them out themselves.
6 Multi-cluster deployment
DataSophon supports deploying and managing multiple clusters. Click to create a new cluster on the management page, and enter the cluster ID and other information.
Nodes of other clusters also require the environment preparation steps in sections 2.1–2.5 above.
6.1 Scanning nodes and distributing installation packages
Enter the node ip where the cluster needs to be installed
Distributing worker installation packages
6.2 Select the service to be installed
6.3 Assign roles
6.4 Confirm service configuration parameters
6.5 Installation service
6.6 Enter cluster 2
After the installation is successful, enter cluster 2 to view the overview page
6.7 Adding services in cluster 2
Add zk/kafka/doris service in cluster 2
zk page overview
kafka page overview
doris page overview
6.8 Inter-cluster switching
You can select a different cluster id to view the node information, component installation information, and alarm information corresponding to different clusters.
The services and nodes installed in different clusters may be inconsistent:
Cluster 1 host management
Cluster 2 host management