Blue Whale Automated Operation and Maintenance Platform

1. Introduction to Blue Whale

Official website: https://bk.tencent.com/docs/

Tencent Blue Whale Smart Cloud ("Blue Whale" for short) is a PaaS development framework developed in-house by Tencent Interactive Entertainment Group (IEG) for building an integrated enterprise R&D and operations system. It provides aPaaS (DevOps pipelines, hosted runtime environments, front-end and back-end frameworks), iPaaS (continuous integration, CMDB, job platform, container management, data platform, AI, and other atomic platforms), and other modules that help enterprise engineers quickly build a basic operations PaaS.

2. Blue Whale deployment

2.1. Environment preparation

Operating system   CPU      RAM    IP
CentOS 7.5         8-core   6.5G   192.168.81.240

2.2. Disable SELinux

[root@localhost ~]# setenforce 0
[root@localhost ~]# sed -ri '/^SELINUX=/c SELINUX=disabled' /etc/selinux/config 
[root@localhost ~]# sed -ri '/^SELINUX=/c SELINUX=disabled' /etc/sysconfig/selinux 
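
To confirm that the sed edit took effect, the configured mode can be read back from the file. The helper below is a minimal sketch; the function name selinux_config_mode is made up for illustration and is not part of any Blue Whale script.

```shell
# Hypothetical helper: print the value of the first uncommented SELINUX= line
# in the given config file (default: /etc/selinux/config).
selinux_config_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}" | head -n 1
}
```

After the commands above, selinux_config_mode should print disabled. The runtime mode is separate: getenforce reports Permissive after setenforce 0 until the next reboot.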

2.3. Stop and disable the firewall and NetworkManager

[root@localhost ~]# systemctl stop firewalld.service 
[root@localhost ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@localhost ~]# systemctl stop NetworkManager
[root@localhost ~]# systemctl disable NetworkManager

2.4. Adjust the maximum number of open files

[root@localhost ~]# echo 'root soft nofile 102400' >> /etc/security/limits.d/20-nproc.conf 
[root@localhost ~]# echo 'root hard nofile 102400' >> /etc/security/limits.d/20-nproc.conf
[root@localhost ~]# reboot
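
After the reboot it is worth checking that the new limit is actually in force for the root login shell. A minimal sketch follows; the function name nofile_at_least is an assumption made for this example.

```shell
# Hypothetical helper: succeed if the current soft limit on open files
# is at least the number given in $1.
nofile_at_least() {
    current=$(ulimit -n)
    # "unlimited" satisfies any threshold.
    [ "$current" = "unlimited" ] || [ "$current" -ge "$1" ]
}
```

For example, nofile_at_least 102400 && echo "nofile limit ok" can be run as root after logging back in.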

2.5. Configure yum repositories

[root@localhost ~]# curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo ;curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
[root@localhost ~]# ls /etc/yum.repos.d/
CentOS-Base.repo  epel.repo

2.6. Prepare the software packages

Software package
https://bk.tencent.com/download/
SSL certificate file
https://bk.tencent.com/download_ssl/

2.7. Non-standard IP processing method

The get_lan_ip function in the Blue Whale Community Edition deployment scripts is used to obtain the host IP. If the server uses a non-standard IP address, this function must be modified in each of the following files under the install directory before deployment:

[root@localhost install]# grep -l 'get_lan_ip *()' -r /data/install/

Modification method, assuming the server IP is 129.xxx:

Insert picture description here
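
The text does not define "non-standard". One plausible reading, which is an assumption here, is an address outside the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The sketch below is a hypothetical helper for checking a host address against those ranges before deployment. It is not the actual get_lan_ip implementation.

```shell
# Hypothetical helper (not part of the Blue Whale scripts): succeed if the
# IPv4 address in $1 lies in an RFC 1918 private range.
is_rfc1918() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.*)
            # Extract the second octet and test for 16..31.
            second=${1#172.}
            second=${second%%.*}
            [ "$second" -ge 16 ] 2>/dev/null && [ "$second" -le 31 ]
            return
            ;;
        *) return 1 ;;
    esac
}
```

Under that reading, the host in section 2.1 (192.168.81.240) passes the check, while an address beginning with 129 does not and would need the modification described above.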

2.8. Install paas, cmdb, job

1) Prepare the installation directory
[root@localhost soft]# mkdir /data

2) Unpack the software package
[root@localhost soft]# tar xf bkce_src-5.1.28.tar.gz -C /data
[root@localhost ~]# ls /data/
install  src

3) Unpack the SSL certificates
[root@localhost ~]# tar xf /soft/ssl_certificates.tar.gz  -C /data/src/cert/

4) Check the environment
[root@localhost install]# ./precheck.sh 
If an error is reported, re-run the check with -r
[root@localhost install]# ./precheck.sh -r

5) Deploy the components
To deploy all components:
[root@localhost install]# ./install_minibk -y
To deploy on demand:
[root@localhost install]# ./install_minibk 
[root@localhost install]# ./install_minibk paas && ./install_minibk cmdb && ./install_minibk job
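
Steps 1) to 3) should leave install, src, and src/cert under /data. Before running precheck.sh, a quick check of that layout can catch a package unpacked into the wrong place. This is a minimal sketch; the function name check_bk_layout is made up for the example, and it checks only the directories named in the steps above.

```shell
# Hypothetical helper: succeed if the unpacked tree under $1 (default /data)
# contains the directories used in the steps above; print any that are missing.
check_bk_layout() {
    root=${1:-/data}
    missing=0
    for dir in install src src/cert; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing: $root/$dir"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For example: check_bk_layout /data && cd /data/install && ./precheck.sh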

The paas component is installed successfully.

The cmdb component is installed successfully.

The job component is installed successfully.

2.9. Install app_mgr

[root@rbtnode1 install]# ./bk_install app_mgr

2.10. Install bkdata

[root@rbtnode1 install]# ./bk_install bkdata

2.11. Install fta

[root@rbtnode1 install]# ./bk_install fta

2.12. Install gse_agent

[root@rbtnode1 install]# ./bk_install gse_agent

2.13. Install saas-o

[root@rbtnode1 install]# ./bkcec install saas-o

2.14. Install node management (bk_nodeman) after all of the above are installed

[root@rbtnode1 install]# ./bk_install saas-o bk_nodeman

3. Troubleshooting

3.1. Error when installing app_mgr

Cause: paas_agent failed to start, and paas was not resolved

Solution:

Resolve paas
/data/bkce/bin/health_check/check_proc_exists -m paas

Check the status of appt
[root@rbtnode1 install]# ./bkcec status appt
[192.168.81.240] paas_agent()    paas_agent                       FATAL     Exited too quickly (process log may have details)
[192.168.81.240] nginx: RUNNING
appt is not running, so start it
[root@rbtnode1 install]# ./bkcec start appt
[192.168.81.240]20200616-104319 98   starting appt(ALL) on host: 192.168.81.240
paas_agent: started
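
The status check above can be scripted by filtering the status output for lines that do not report RUNNING. The sketch below assumes the output format shown in the transcript; the function name not_running is made up for illustration.

```shell
# Hypothetical filter: read `./bkcec status <module>` output on stdin, print
# every non-blank line that does not contain RUNNING, and succeed only if
# there are no such lines.
not_running() {
    ! grep -v -e 'RUNNING' -e '^[[:space:]]*$'
}
```

For example: ./bkcec status appt | not_running || ./bkcec start appt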


3.2. Error when installing bkdata

Solution

[root@rbtnode1 install]# /data/bkce/service/zk/bin/zkCli.sh -server zk.service.consul:2181 ls /common_kafka/brokers/ids
Connecting to zk.service.consul:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[1]

[root@rbtnode1 ~]# pip install kazoo
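
The zkCli.sh call above lists the Kafka broker IDs registered in ZooKeeper under /common_kafka/brokers/ids. In the output shown, [1] means that one broker with ID 1 is registered. As a rough sketch, the ID list can be pulled out of that output for use in a script. The function name broker_ids is an assumption for this example, and the parsing relies on the bracketed list format shown in the transcript.

```shell
# Hypothetical filter: read zkCli.sh "ls" output on stdin and print the
# child list from the last bracketed line as comma-separated IDs.
broker_ids() {
    grep -E '^\[[0-9, ]*\]$' | tail -n 1 | sed 's/[][ ]//g'
}
```

An empty result would suggest that no Kafka broker is currently registered.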

3.3. Tips

Insert picture description here

The symptom in the picture above generally means that the corresponding plug-in is not installed; installing it resolves the problem.

4. Notes

4.1. Host restart

After the host restarts, modules such as paas, cmdb, and job must be started manually.

First check whether a module is running. If it is, stop it with ./bkcec stop <module name>, then start it again with ./bkcec start <module name>.

Start paas

[root@rbtnode1 install]# ./bkcec start paas
[192.168.81.240]20200616-205049 98   starting paas(ALL) on host: 192.168.81.240
Unlinking stale socket /data/bkce/logs/open_paas/supervisor.sock

Start cmdb

[root@rbtnode1 install]# ./bkcec stop cmdb
[192.168.81.240]20200616-205617 135   stopping cmdb(ALL) on host: 192.168.81.240
cmdb_hostcontroller: stopped
cmdb_hostserver: stopped
cmdb_toposerver: stopped
cmdb_objectcontroller: stopped
cmdb_webserver: stopped
cmdb_procserver: stopped
cmdb_auditcontoller: stopped
cmdb_apiserver: stopped
cmdb_eventserver: stopped
cmdb_datacollection: stopped
cmdb_adminserver: stopped
cmdb_proccontroller: stopped
Shut down
[root@rbtnode1 install]# ./bkcec start cmdb
[192.168.81.240]20200616-205626 98   starting cmdb(ALL) on host: 192.168.81.240

Start job

[root@rbtnode1 install]# ./bkcec start job
[192.168.81.240]20200616-205129 98   starting job(ALL) on host: 192.168.81.240

Start app_mgr

[root@rbtnode1 install]# ./bkcec status appo
[192.168.81.240] paas_agent()    paas_agent                       RUNNING   pid 19074, uptime 1:40:13
[192.168.81.240] nginx: RUNNING


[root@rbtnode1 install]# ./bkcec status appt
[192.168.81.240] paas_agent()    paas_agent                       RUNNING   pid 19074, uptime 1:41:51
[192.168.81.240] nginx: RUNNING

Start bkdata

[root@rbtnode1 install]# ./bkcec status bkdata
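
The stop-then-start sequence in this section can be wrapped in a small loop. This is a sketch only: the function name restart_modules and the BKCEC variable are made up for the example, and it assumes that running ./bkcec stop on a module that is not running is harmless, which the steps above do not confirm.

```shell
# Hypothetical wrapper: stop and then start each named module using the
# control script named in $BKCEC (default ./bkcec, run from /data/install).
restart_modules() {
    for module in "$@"; do
        # Assumption: stopping an already-stopped module does no harm.
        "${BKCEC:-./bkcec}" stop "$module"
        "${BKCEC:-./bkcec}" start "$module" || return 1
    done
}
```

For example: cd /data/install && restart_modules paas cmdb job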

Origin blog.csdn.net/weixin_44953658/article/details/114666622