Blue Whale Automated Operation and Maintenance Platform
1. Introduction to Blue Whale
Official website: https://bk.tencent.com/docs/
Tencent Blue Whale Smart Cloud, referred to as Blue Whale, is a PaaS development framework developed in-house by Tencent Interactive Entertainment Group (IEG) for building an integrated enterprise R&D and operations system. It provides aPaaS (DevOps pipeline, runtime environment hosting, front-end and back-end frameworks), iPaaS (continuous integration, CMDB, job platform, container management, data platform, AI and other atomic platforms) and other modules that help enterprise technicians quickly build a basic operations PaaS.
2. Blue Whale deployment
2.1. Environment preparation
Operating system | CPU | RAM | IP |
---|---|---|---|
CentOS 7.5 | 8 cores | 6.5 GB | 192.168.81.240 |
2.2. Disable SELinux
[root@localhost ~]# setenforce 0
[root@localhost ~]# sed -ri '/^SELINUX=/c SELINUX=disabled' /etc/selinux/config
[root@localhost ~]# sed -ri '/^SELINUX=/c SELINUX=disabled' /etc/sysconfig/selinux
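Note that setenforce 0 only changes the running mode, while the edited config file takes effect from the next boot. To confirm that the edit landed, a minimal sketch like the following can read the configured mode back. The function name selinux_config_mode is hypothetical and not part of the Blue Whale scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SELINUX= mode configured in a config file.
# Defaults to the file edited in the step above.
selinux_config_mode() {
    local file="${1:-/etc/selinux/config}"
    # Print the value of the last uncommented SELINUX= line.
    # SELINUXTYPE= lines do not match because of the "=" in the pattern.
    sed -n 's/^SELINUX=//p' "$file" | tail -n 1
}
```

Calling selinux_config_mode with no argument on the host should print disabled once the sed commands above have run.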
2.3. Turn off the firewall and NetworkManager
[root@localhost ~]# systemctl stop firewalld.service
[root@localhost ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@localhost ~]# systemctl stop NetworkManager
[root@localhost ~]# systemctl disable NetworkManager
2.4. Adjust the maximum number of open files
[root@localhost ~]# echo 'root soft nofile 102400' >> /etc/security/limits.d/20-nproc.conf
[root@localhost ~]# echo 'root hard nofile 102400' >> /etc/security/limits.d/20-nproc.conf
[root@localhost ~]# reboot
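After the reboot, the new limit can be verified before continuing. The sketch below compares the soft open-file limit with the value written above; the function name nofile_ok and its optional second argument (used to pass a value explicitly instead of reading ulimit) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the soft open-file limit meets the requirement.
#   $1  required limit (defaults to 102400, the value configured above)
#   $2  limit to test (defaults to the current shell's soft limit)
nofile_ok() {
    local required="${1:-102400}"
    local current="${2:-$(ulimit -Sn)}"
    # "unlimited" always satisfies the requirement.
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}
```

For example, running nofile_ok || echo "nofile limit too low" in a fresh root login shell gives a quick pass/fail check.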
2.5. Configure the yum repositories
[root@localhost ~]# curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo ;curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
[root@localhost ~]# ls /etc/yum.repos.d/
CentOS-Base.repo epel.repo
2.6. Prepare the software packages
Software package:
https://bk.tencent.com/download/
SSL certificate files:
https://bk.tencent.com/download_ssl/
2.7. Non-standard IP processing method
In the Blue Whale Community Edition deployment scripts, the function get_lan_ip, which obtains the host's IP address, is defined in the files listed by the command below (under the install directory). If the server uses a non-standard address, these files must be modified before deployment:
[root@localhost install]# grep -l 'get_lan_ip *()' -r /data/install/
Modification method:
Assuming the server IP is 129.xxx
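As a rough illustration of what a "standard" address probably means here, the sketch below checks whether an IPv4 address falls in one of the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). This is an assumption about the intent of get_lan_ip, not the code from the deployment scripts, and the function name is_rfc1918 is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the IPv4 address is in an RFC 1918 private range.
# This is NOT the get_lan_ip implementation; it only illustrates the address check.
is_rfc1918() {
    local ip="$1" o1 o2 rest
    # Split the address on dots; only the first two octets matter here.
    IFS=. read -r o1 o2 rest <<< "$ip"
    case "$o1" in
        10)  return 0 ;;
        172) [ "$o2" -ge 16 ] 2>/dev/null && [ "$o2" -le 31 ] ;;
        192) [ "$o2" -eq 168 ] 2>/dev/null ;;
        *)   return 1 ;;
    esac
}
```

The host used in this guide, 192.168.81.240, passes this check, whereas an address beginning with 129 would not, which is the case the modification above is meant to handle.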
2.8. Install paas, cmdb, job
1) Prepare the installation directory
[root@localhost soft]# mkdir /data
2) Extract the package
[root@localhost soft]# tar xf bkce_src-5.1.28.tar.gz -C /data
[root@localhost ~]# ls /data/
install src
3) Extract the SSL certificates
[root@localhost ~]# tar xf /soft/ssl_certificates.tar.gz -C /data/src/cert/
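Both extraction steps rely on the -C option of tar, which switches to the target directory before unpacking. A minimal sketch of the pattern follows; the helper name extract_to is hypothetical, and it additionally creates the destination if it is missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract an archive into a destination directory.
#   $1  path to the tar archive
#   $2  directory to extract into (created if it does not exist)
extract_to() {
    local archive="$1" dest="$2"
    mkdir -p "$dest" && tar xf "$archive" -C "$dest"
}
```

With this helper, step 2 would read extract_to bkce_src-5.1.28.tar.gz /data, and step 3 would read extract_to /soft/ssl_certificates.tar.gz /data/src/cert/.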
4) Check the environment
[root@localhost install]# ./precheck.sh
If it reports an error, rerun the check with -r
[root@localhost install]# ./precheck.sh -r
5) Deploy the components
To deploy all components:
[root@localhost install]# ./install_minibk -y
To deploy components on demand:
[root@localhost install]# ./install_minibk
[root@localhost install]# ./install_minibk paas && ./install_minibk cmdb && ./install_minibk job
The paas component is installed successfully
The cmdb component is installed successfully
The job component is installed successfully
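The chained on-demand command above stops at the first module that fails. The same behaviour can be sketched as a small loop that also reports which module failed; install_in_order is a hypothetical name, and the installer is passed in as an argument so the loop can be tried against a stub first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "<installer> <module>" for each module in order,
# stopping at the first failure.
#   $1     installer command (for example ./install_minibk)
#   $2...  module names, in installation order
install_in_order() {
    local installer="$1"; shift
    local module
    for module in "$@"; do
        echo "installing ${module}"
        "$installer" "$module" || { echo "failed: ${module}" >&2; return 1; }
    done
}

# Usage from /data/install (not executed here):
#   install_in_order ./install_minibk paas cmdb job
```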
2.9. Install app_mgr
[root@rbtnode1 install]# ./bk_install app_mgr
2.10. Install bkdata
[root@rbtnode1 install]# ./bk_install bkdata
2.11. Install fta
[root@rbtnode1 install]# ./bk_install fta
2.12. Install gse_agent
[root@rbtnode1 install]# ./bk_install gse_agent
2.13. Install saas-o
[root@rbtnode1 install]# ./bkcec install saas-o
2.14. Install Node Management (after all of the above are installed)
[root@rbtnode1 install]# ./bk_install saas-o bk_nodeman
3. Troubleshooting
3.1. Error when installing app_mgr
Reason: paas_agent failed to start, and paas could not be resolved
Solution:
Check paas:
/data/bkce/bin/health_check/check_proc_exists -m paas
Check the status of appt:
[root@rbtnode1 install]# ./bkcec status appt
[192.168.81.240] paas_agent() paas_agent FATAL Exited too quickly (process log may have details)
[192.168.81.240] nginx: RUNNING
paas_agent is not running, so start appt:
[root@rbtnode1 install]# ./bkcec start appt
[192.168.81.240]20200616-104319 98 starting appt(ALL) on host: 192.168.81.240
paas_agent: started
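The status check above can be automated by scanning the status output for any line that does not report RUNNING. The sketch below assumes the output format shown in this section (one process per line with the state in the text); the function name all_running is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "./bkcec status <module>" output on stdin and
# succeed only if there is at least one line and every line reports RUNNING.
all_running() {
    local line seen=0
    while IFS= read -r line; do
        [ -n "$line" ] || continue      # skip blank lines
        seen=1
        case "$line" in
            *RUNNING*) ;;               # healthy process, keep scanning
            *) return 1 ;;              # FATAL or any other state
        esac
    done
    [ "$seen" -eq 1 ]
}

# Usage from /data/install (not executed here):
#   ./bkcec status appt | all_running || ./bkcec start appt
```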
3.2. Error when installing bkdata
Solution
[root@rbtnode1 install]# /data/bkce/service/zk/bin/zkCli.sh -server zk.service.consul:2181 ls /common_kafka/brokers/ids
Connecting to zk.service.consul:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[1]
[root@rbtnode1 ~]# pip install kazoo
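The zkCli.sh call lists the Kafka broker IDs registered in ZooKeeper as the last bracketed line of its output ([1] above, meaning one broker with ID 1). As a sketch, that line can be extracted from the surrounding log noise as follows; the function name broker_ids is hypothetical, and the parsing assumes the output format shown in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read zkCli.sh "ls .../brokers/ids" output on stdin and
# print the broker IDs from the last "[...]" line, one per line.
# Returns non-zero if no broker ID is found.
broker_ids() {
    grep -E '^\[[0-9, ]*\]$' | tail -n 1 | tr -d '[] ' | tr ',' '\n' | grep -v '^$'
}

# Usage (not executed here):
#   /data/bkce/service/zk/bin/zkCli.sh -server zk.service.consul:2181 \
#       ls /common_kafka/brokers/ids 2>/dev/null | broker_ids
```

An empty result suggests that no Kafka broker has registered, which would be worth checking before retrying the bkdata installation.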
3.3. Tips
Generally, an error like the one shown above means that the corresponding plug-in is not installed; installing it solves the problem.
4. Notes
4.1. Host restart
After the host restarts, you need to manually start the modules such as paas, cmdb and job.
First check whether a module is already running. If it is, stop it with ./bkcec stop <module name>, then start it with ./bkcec start <module name>.
Start paas
[root@rbtnode1 install]# ./bkcec start paas
[192.168.81.240]20200616-205049 98 starting paas(ALL) on host: 192.168.81.240
Unlinking stale socket /data/bkce/logs/open_paas/supervisor.sock
Start cmdb
[root@rbtnode1 install]# ./bkcec stop cmdb
[192.168.81.240]20200616-205617 135 stopping cmdb(ALL) on host: 192.168.81.240
cmdb_hostcontroller: stopped
cmdb_hostserver: stopped
cmdb_toposerver: stopped
cmdb_objectcontroller: stopped
cmdb_webserver: stopped
cmdb_procserver: stopped
cmdb_auditcontoller: stopped
cmdb_apiserver: stopped
cmdb_eventserver: stopped
cmdb_datacollection: stopped
cmdb_adminserver: stopped
cmdb_proccontroller: stopped
Once it has shut down, start it again:
[root@rbtnode1 install]# ./bkcec start cmdb
[192.168.81.240]20200616-205626 98 starting cmdb(ALL) on host: 192.168.81.240
Start job
[root@rbtnode1 install]# ./bkcec start job
[192.168.81.240]20200616-205129 98 starting job(ALL) on host: 192.168.81.240
Start app_mgr
[root@rbtnode1 install]# ./bkcec status appo
[192.168.81.240] paas_agent() paas_agent RUNNING pid 19074, uptime 1:40:13
[192.168.81.240] nginx: RUNNING
[root@rbtnode1 install]# ./bkcec status appt
[192.168.81.240] paas_agent() paas_agent RUNNING pid 19074, uptime 1:41:51
[192.168.81.240] nginx: RUNNING
Start bkdata
[root@rbtnode1 install]# ./bkcec status bkdata
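The stop-then-start sequence in this section can be sketched as one loop over the modules. The helper name restart_modules is hypothetical; the control script is passed as the first argument so that the loop can be tried with a stub before pointing it at ./bkcec.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stop and then start each module with the given control script.
#   $1     control script (for example ./bkcec)
#   $2...  module names, in start order
restart_modules() {
    local ctl="$1"; shift
    local module
    for module in "$@"; do
        # A failed stop (module already down) is not treated as an error.
        "$ctl" stop "$module" || true
        "$ctl" start "$module" || { echo "failed to start ${module}" >&2; return 1; }
    done
}

# Usage from /data/install after a reboot (not executed here):
#   restart_modules ./bkcec paas cmdb job
```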