https://www.cnblogs.com/chaofan-/p/11715402.html
An error is reported during deployment:
TASK [mariadb : Check MariaDB service port liveness] **********************************************************************************************************************
fatal: [ALLInOne-Kolla]: FAILED! => {"changed": false, "elapsed": 10, "msg": "Timeout when waiting for search string MariaDB in 192.168.23.102:3306"}
RUNNING HANDLER [mariadb : Wait for first MariaDB service port liveness]
FAILED - RETRYING: Wait for first MariaDB service port liveness (10 retries left).
Check and clean up the Docker volumes
First, list the volumes:
docker volume list
Then delete the mariadb volume. The deletion may fail with:
Error response from daemon: remove mariadb: volume is in use
Fix:
docker container prune
docker volume prune
For more approaches, see:
https://stackoverflow.com/questions/34658836/docker-is-in-volume-in-use-but-there-arent-any-docker-containers
Error during deployment
Problem: the output has been hidden because 'no_log: true' was specified
fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}
Destroy everything and redeploy.
When destroying and redeploying, an error occurred: MariaDB cluster exists but is stopped
Check whether the Docker volumes on the failing node were left behind (not cleaned up).
Error 1: When checking the environment, no module named docker
pip install docker
Error 2: During environment check, TASK error is reported during nova task [nova: Ensuring config directories exist]
The conditional check 'inventory_hostname in groups[item.value.group]' failed. The error appears to be in '/usr/share/kolla-ansible/ansible/roles/nova/tasks/config.yml': line 14
Try to clear the cache and execute again
kolla-ansible destroy -i ./all-in-one --yes-i-really-really-mean-it
The error persists. Inspecting kolla-ansible/ansible/roles/nova/tasks/config.yml, line 14 creates the config directories for the nova services; each component auto-started under /etc/kolla should have a corresponding directory and config file, so create them manually:
mkdir /etc/kolla/nova
touch /etc/kolla/nova/nova.conf
The root cause is that the pip-installed kolla-ansible does not match the all-in-one inventory from the cloned source tree.
Install kolla-ansible manually from the source:
cd kolla-ansible && python setup.py install
Error 3: An error is reported when creating an instance on the dashboard, but the command line can be created
Error:Failed to perform requested operation on instance "test3", the instance has an error status: Please try again later [Error: Build of instance 5f875c14-2050-42c2-a749-09ec2162e68c aborted: Volume dfc5e50e-9034-4467-ab67-091362f18309 did not finish being created even after we waited 0 seconds or 1 attempts. And its status is error.].
View the nova log:
cat /var/log/kolla/nova/nova-compute.log
Instance failed block device setup: VolumeNotCreated
The error shows that the storage volume was not created, suggesting a Cinder problem.
View the cinder log:
tail -f /var/log/kolla/cinder/cinder-volume.log
ERROR cinder.cmd.volume [-] Configuration for cinder-volume does not specify "enabled_backends". Using DEFAULT section to configure drivers is not supported since Ocata.
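In kolla-ansible, an `enabled_backends` section is rendered into cinder.conf only when a backend is enabled in globals.yml; a sketch of the relevant switches (variable names per kolla-ansible; verify against your version):

```yaml
# /etc/kolla/globals.yml -- sketch; enables the LVM backend so kolla
# generates an enabled_backends setting in cinder.conf
enable_cinder: "yes"
enable_cinder_backend_lvm: "yes"
```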
View volume status
openstack volume service list
cinder-backup, cinder-volume: down
The cinder-volume service is down; this is likely the root cause.
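Down services can also be picked out of `openstack volume service list -f value` output (columns: Binary Host Zone Status State ...) with a small pipeline; the sample lines below are illustrative, not captured from this deployment:

```shell
# Sketch: extract the binaries whose State column (field 5) is "down".
# In practice, pipe the real `openstack volume service list -f value` in.
sample='cinder-scheduler controller nova enabled up
cinder-volume controller@lvm-1 nova enabled down
cinder-backup controller nova enabled down'
down=$(printf '%s\n' "$sample" | awk '$5 == "down" {print $1}')
printf '%s\n' "$down"
```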
Solution: create the cinder-volumes volume group:
1. dd if=/dev/zero of=./disk.img count=200 bs=512MB
2. losetup -f
3. losetup /dev/loop0 disk.img
4. pvcreate /dev/loop0
5. vgcreate cinder-volumes /dev/loop0
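For reference, dd's `512MB` suffix means 512 * 10^6 bytes, so the command in step 1 writes a ~100 GB backing file for the loop device:

```shell
# dd writes count * bs bytes: 200 blocks of 512MB (512 * 10^6 bytes each)
count=200
bs=$((512 * 1000 * 1000))
total=$((count * bs))
echo "$total"   # total bytes written to disk.img
```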
After creating the volume group and redeploying, cinder-volume's status is up.
Mistake 4: Login permission issues
When first using the command line, the following error appears:
Missing value auth-url required for auth plugin password
Source the credentials file:
. /etc/kolla/admin-openrc.sh
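admin-openrc.sh exports the OS_* environment variables that the client reads; a minimal sketch of its contents (the URL and password below are placeholders, not values from this deployment):

```shell
# Sketch of what sourcing admin-openrc.sh provides; values are placeholders.
export OS_AUTH_URL=http://192.168.23.102:35357/v3
export OS_PROJECT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=replace-with-real-password
export OS_IDENTITY_API_VERSION=3
echo "$OS_USERNAME"
```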
Mistake 5: After connecting glance to ceph, created images stay in status: queued. Check the glance log:
ERROR glance.common.wsgi ObjectNotFound: [errno 2] error calling conf_read_file
An image stuck in queued means the data cannot be written; the log indicates glance hit an error while reading its configuration file.
After fixing the configuration file, other errors appear in the log, but at least the configuration file is now being read:
1. glance_store._drivers.rbd ObjectNotFound: [errno 2] error connecting to the cluster
2. Failed to upload image data due to internal error: BackendException
The error shows that glance cannot connect to the cluster, likely because ceph authentication failed.
Solution:
Copy /etc/ceph/ceph.client.admin.keyring into /etc/ceph inside the glance_api container (e.g. with docker cp).
Mistake 6: After connecting to ceph and creating a virtual machine, the instance fails to start, displaying: no bootable device
The uploaded image is in the wrong format; convert it to raw (e.g. with qemu-img convert).
Mistake 7: Problems after changing the mount disk
After changing the Docker data directory, some containers kept restarting. After clearing all containers and redeploying, some still restarted, which suggests an image problem. Clear all image files:
docker rmi $(docker images -q)
Reload the images:
docker load -i super.tar
Re-execute the deployment
Mistake 8: Mariadb timed out waiting for VIP
Check /etc/kolla/globals.yml and comment out the kolla_internal_vip_address setting.
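The line in question looks like this (sketch; the address shown is the internal IP from the earlier error):

```yaml
# /etc/kolla/globals.yml -- comment the VIP out as described above
#kolla_internal_vip_address: "192.168.23.102"
```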
Mistake 9: no module named decorate
Upgrade the package:
pip install -U decorate
Mistake 10: When installing the openstack CLI client
ERROR: Cannot uninstall 'pyOpenSSL'. Install with:
pip install python-openstackclient --ignore-installed pyOpenSSL
Mistake 11: Gather facts freezes for a long time during environmental inspection
Resolved itself after rebooting the machine.
Error 12: Failed to delete an image; the glance log shows a permission error
Cause: the image has a snapshot, and ceph protects snapshots; the snapshot must be deleted before the image can be deleted.
Error 13: When generating passwords, an error is reported: 'module' object has no attribute '_ssl write_string'
As hinted, check whether two versions of cryptography are installed; uninstall completely and reinstall.
Error 14: Check the environment, check libvirt is not running
libvirt must be stopped on the host:
systemctl stop libvirtd
Error 15: Timeout for instance creation, check log: Stashed volume connector host is localhost.: BuildAbortException
Likely the volume group has insufficient space.
# check the volume group's free space
vgs
1. dd if=/dev/zero of=./disk.img count=200 bs=512MB
2. losetup -f
3. losetup /dev/loop1 disk.img
4. pvcreate /dev/loop1
# extend the new PV into the existing volume group
5. vgextend cinder-volumes /dev/loop1
# restart the containers
docker ps -a | grep kolla | awk '{print $1}' | xargs docker restart
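The restart one-liner filters `docker ps -a` output for kolla images and takes the first column (the container ID); the same pipeline on illustrative sample output (container IDs and names below are made up):

```shell
# Demo of the grep/awk pipeline on sample `docker ps -a` output.
sample='CONTAINER ID   IMAGE                              NAMES
1a2b3c4d5e6f   kolla/centos-binary-cinder-volume  cinder_volume
9f8e7d6c5b4a   kolla/centos-binary-mariadb        mariadb
0000aaaa1111   nginx:latest                       web'
ids=$(printf '%s\n' "$sample" | grep kolla | awk '{print $1}')
printf '%s\n' "$ids"   # only the two kolla container IDs
```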
Error 16: Error during environmental check
ERROR! An unhandled exception occurred while templating '{{ neutron_tenant_network_types.replace(' ', '').split(',') | reject('equalto', '') | list }}'. Error was a <class 'jinja2.exceptions.TemplateRuntimeError'>, original message: no test named 'equalto'
Solution: upgrade Jinja2 (the equalto test requires Jinja2 >= 2.8):
pip install -U jinja2
Error 17: Error response from daemon: No such container: mariadb
A certain node cannot start the mariadb container
After destroying, check whether Docker still has the mariadb volume:
docker volume ls
docker volume rm [id]
Error 18: ERROR: Cannot uninstall 'requests'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall
Due to a library version conflict, other packages depend on this library and pip refuses to uninstall it; delete it directly and reinstall:
rm -r /usr/lib/python2.7/site-packages/requests
pip install requests
Error 19: Ceph-related errors occurred in glance or cinder during environment check
Check whether /etc/kolla/config contains the corresponding configuration files, and whether the ceph configuration has been added to the multinode inventory.
Error 20: After linking ceph, an error occurred when creating a virtual machine, and a secret related error appeared in nova-compute.log
Check whether /etc/kolla/nova-libvirt/ contains secret.xml and whether the uuid matches:
<secret ephemeral='no' private='no'>
<uuid>6000a67b-0060-4b92-a6b1-4763fbeb04a7</uuid>
<usage type='ceph'>
<name>client.cinder secret</name>
</usage>
</secret>
After creating secret.xml, enter the nova-libvirt container and set the secret's value:
virsh secret-define --file secret.xml
virsh secret-set-value --secret 6000a67b-0060-4b92-a6b1-4763fbeb04a7 --base64 $(cat client.cinder.key)
Error 21: Error when rally creates an environment: there is no platform plugin with name existing@openstack
Check the versions of rally-openstack and its dependency packages.
Error 22: live_migration failed
vi /etc/kolla/nova-libvirt/libvirtd.conf
listen_addr="0.0.0.0"
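For live migration, libvirtd must accept remote TCP connections; a sketch of the relevant /etc/kolla/nova-libvirt/libvirtd.conf settings (values other than listen_addr are assumptions based on common kolla defaults; check your deployment):

```ini
# /etc/kolla/nova-libvirt/libvirtd.conf -- sketch
listen_tcp = 1
listen_addr = "0.0.0.0"
auth_tcp = "none"    ; kolla may use sasl instead; verify before changing
```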
Restart the container.
Alternatively, the secret may have become invalid; re-run the virsh secret-define ... commands above.