After an unexpected power outage in the office server room, the OpenStack experimental cluster could no longer be used normally. This note records an attempt to bring the cluster back up with the kolla-ansible tool.
1. Environment
[root@kolla-ansible-master ~]# cat /etc/centos-release
CentOS Linux release 7.8.2003 (Core)
[root@kolla-ansible-master ~]# ansible --version
ansible 2.7.18
[root@kolla-ansible-master ~]# pip list | grep kolla-ansible
kolla-ansible 7.2.2.dev9
[root@kolla-ansible-master ~]# openstack --version
openstack 5.2.1
2. Records
1. Status
After power was restored and the cluster machines were rebooted, some OpenStack containers kept restarting abnormally, and the cluster could not work normally:
kolla-ansible-master:4000/kolla/centos-source-heat-engine:rocky "dumb-init --single-…" 15 months ago Up About a minute heat_engine
c07e5d01adce kolla-ansible-master:4000/kolla/centos-source-heat-api-cfn:rocky "dumb-init --single-…" 15 months ago Restarting (1) 1 second ago heat_api_cfn
88b7a106dcd8 kolla-ansible-master:4000/kolla/centos-source-heat-api:rocky "dumb-init --single-…" 15 months ago Restarting (1) Less than a second ago heat_api
82b5983614e0 kolla-ansible-master:4000/kolla/centos-source-neutron-server:rocky "dumb-init --single-…" 15 months ago Up About a minute neutron_server
feaf96f16403 kolla-ansible-master:4000/kolla/centos-source-nova-compute-ironic:rocky "dumb-init --single-…" 15 months ago Up About a minute nova_compute_ironic
cb9184ff5506 kolla-ansible-master:4000/kolla/centos-source-nova-novncproxy:rocky "dumb-init --single-…" 15 months ago Up About a minute nova_novncproxy
17bf7758070d kolla-ansible-master:4000/kolla/centos-source-nova-consoleauth:rocky "dumb-init --single-…" 15 months ago Up About a minute nova_consoleauth
619d66b56612 kolla-ansible-master:4000/kolla/centos-source-nova-conductor:rocky "dumb-init --single-…" 15 months ago Up About a minute nova_conductor
249b423c2728 kolla-ansible-master:4000/kolla/centos-source-nova-scheduler:rocky "dumb-init --single-…" 15 months ago Up About a minute nova_scheduler
beace5f229e2 kolla-ansible-master:4000/kolla/centos-source-nova-api:rocky "dumb-init --single-…" 15 months ago Restarting (1) 5 seconds ago nova_api
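In a long listing like the one above, the failing containers are easy to miss. The following is a minimal sketch of a filter, not part of the original record: the function name `restarting_containers` is hypothetical, and it assumes the container name is the last column, as in the default `docker ps -a` output.

```shell
# Hypothetical helper: read `docker ps -a` style output on stdin and print
# the names of containers whose status is "Restarting (<exit code>)".
# Assumes the container name is the last whitespace-separated field.
restarting_containers() {
    awk '/Restarting \([0-9]+\)/ { print $NF }'
}

# Intended usage on a node (not executed here):
#   docker ps -a | restarting_containers
# Docker can also filter by itself:
#   docker ps -a --filter status=restarting --format '{{.Names}}'
```

Applied to the listing above, this would report heat_api_cfn, heat_api and nova_api.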
2. Check the problem
Checking the logs and the container list shows that nova-api is abnormal and keeps restarting:
"Restarting (1) 5 seconds ago nova_api"
The other Nova services listed alongside it are running normally.
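The record does not show how the logs were checked. One possible way, sketched here under assumptions, is to read the container's output with `docker logs` and keep only the error lines. The helper name `error_lines` is hypothetical, and the filter assumes the usual OpenStack log layout in which the level (ERROR, CRITICAL) appears as a separate word.

```shell
# Hypothetical helper: keep only log lines whose level is ERROR or CRITICAL.
# Assumes the level appears as a space-delimited word, as in typical
# OpenStack service logs.
error_lines() {
    grep -E ' (ERROR|CRITICAL) '
}

# Intended usage on the controller (not executed here):
#   docker logs --tail 200 nova_api 2>&1 | error_lines
```

If the filter prints nothing, the failure may be logged at a different level, and the unfiltered `docker logs` output is the place to look.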
3. Try to fix
3.1 Stop the virtual machine (server)
[root@kolla-ansible-master ~]# openstack server list
+--------------------------------------+-------+--------+-----------------------+--------+---------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------+--------+-----------------------+--------+---------+
| b4634124-a315-4fd8-aa4a-3df8cade2335 | demo1 | ACTIVE | demo-net=192.168.19.8 | cirros | m1.tiny |
+--------------------------------------+-------+--------+-----------------------+--------+---------+
[root@kolla-ansible-master ~]# openstack server stop demo1
3.2 Stop the Nova service
[root@kolla-ansible-master ~]# kolla-ansible -i ./multinode05 stop --tags nova
Stop Kolla containers : ansible-playbook -i ./multinode05 -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla --tags nova /usr/share/kolla-ansible/ansible/stop.yml
PLAY [all] ******************************************************************************************************************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************************************************************************************
ok: [localhost]
ok: [compute01]
ok: [compute03]
ok: [compute02]
ok: [network01]
ok: [controller01]
PLAY RECAP ******************************************************************************************************************************************************************************************************
compute01 : ok=1 changed=0 unreachable=0 failed=0
compute02 : ok=1 changed=0 unreachable=0 failed=0
compute03 : ok=1 changed=0 unreachable=0 failed=0
controller01 : ok=1 changed=0 unreachable=0 failed=0
localhost : ok=1 changed=0 unreachable=0 failed=0
network01 : ok=1 changed=0 unreachable=0 failed=0
3.3 Restart Nova
[root@kolla-ansible-master ~]# kolla-ansible -i ./multinode05 deploy --tags nova
PLAY RECAP ******************************************************************************************************************************************************************************************************
compute01 : ok=42 changed=0 unreachable=0 failed=0
compute02 : ok=42 changed=0 unreachable=0 failed=0
compute03 : ok=42 changed=0 unreachable=0 failed=0
controller01 : ok=56 changed=2 unreachable=0 failed=0
localhost : ok=2 changed=0 unreachable=0 failed=0
network01 : ok=2 changed=0 unreachable=0 failed=0
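The recap above has to be read host by host to confirm that nothing failed. A small check can do this automatically. The sketch below assumes recap lines in the form shown above; the function name `recap_problems` and the file name `deploy.log` are hypothetical.

```shell
# Hypothetical helper: read Ansible output on stdin and print every host
# whose PLAY RECAP line has unreachable or failed above zero.
# Returns 1 if at least one such host is found, 0 otherwise.
recap_problems() {
    awk '
        / : ok=[0-9]+/ {
            bad = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(unreachable|failed)=[0-9]+$/) {
                    split($i, kv, "=")
                    if (kv[2] + 0 > 0) bad = 1
                }
            }
            if (bad) { print $1; found = 1 }
        }
        END { exit (found ? 1 : 0) }
    '
}

# Intended usage (not executed here):
#   kolla-ansible -i ./multinode05 deploy --tags nova | tee deploy.log
#   recap_problems < deploy.log || echo "some hosts need attention"
```

For the recap above the function prints nothing and returns 0, since every host reports unreachable=0 and failed=0.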
3.4 Restart the virtual machine
[root@kolla-ansible-master ~]# openstack server list
+--------------------------------------+-------+---------+-----------------------+--------+---------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------+---------+-----------------------+--------+---------+
| b4634124-a315-4fd8-aa4a-3df8cade2335 | demo1 | SHUTOFF | demo-net=192.168.19.8 | cirros | m1.tiny |
+--------------------------------------+-------+---------+-----------------------+--------+---------+
[root@kolla-ansible-master ~]# openstack server start demo1
[root@kolla-ansible-master ~]#
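The four steps in section 3 can be collected into one function so that the order is not forgotten next time. This is only a sketch under the assumptions of this record (one inventory file, one instance to stop and start); `recover_nova`, `run` and the `DRY_RUN` variable are hypothetical names, not kolla-ansible features.

```shell
#!/usr/bin/env bash
# Sketch of the recovery sequence from section 3 as a single function.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: kolla-ansible inventory file, $2: name of the server (instance)
recover_nova() {
    local inventory=$1 server=$2
    run openstack server stop "$server" || return 1
    run kolla-ansible -i "$inventory" stop --tags nova || return 1
    run kolla-ansible -i "$inventory" deploy --tags nova || return 1
    run openstack server start "$server"
}

# Preview the commands without touching the cluster (not executed here):
#   DRY_RUN=1 recover_nova ./multinode05 demo1
```

Running it with `DRY_RUN=1` first prints the four commands in order, which makes it possible to review them before anything is stopped.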
Closing note: in a production environment, an unexpected power failure is very unlikely, and routine maintenance usually means replacing a single device or host. An experimental environment can also simply be redeployed. This record only shows one way to repair the cluster.