Kolla-Ansible OpenStack Ocata deployment: a failed ceph-osd config copy causes nova-compute startup to time out

Environment:

OpenStack version: Ocata

Number of nodes: 4

Host operating system on each node: CentOS 7.7

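While deploying OpenStack (Ocata) across multiple nodes with Kolla-Ansible, the task that waits for nova-compute to come up on the compute nodes timed out, and the deployment aborted. The error output was: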

TASK [nova : Waiting for nova-compute service up] *****************************************************************
FAILED - RETRYING: Waiting for nova-compute service up (20 retries left).
...
FAILED - RETRYING: Waiting for nova-compute service up (1 retries left).
fatal: [172.30.220.3 -> 172.30.220.3]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["docker", "exec", "kolla_toolbox", "openstack", "--os-interface", "internal", "--os-auth-url", "http://172.30.230.3:35357", "--os-identity-api-version", "3", "--os-project-domain-name", "default", "--os-tenant-name", "admin", "--os-username", "admin", "--os-password", "3PxtKnvjKDTbPg2QT3llwig08efLoAgkdEY5VVoY", "--os-user-domain-name", "default", "compute", "service", "list", "-f", "json", "--service", "nova-compute"], "delta": "0:00:02.405043", "end": "2017-10-04 16:21:35.742909", "failed": true, "rc": 0, "start": "2017-10-04 16:21:33.337866", "stderr": "", "stderr_lines": [], "stdout": "[]", "stdout_lines": ["[]"]}

Scrolling back through the log, I found an earlier error that had not stopped the deployment:

TASK [ceph : Copying over config.json files for services] ******************************************************************************************************
ok: [Controller01] => (item=ceph-mon)
ok: [Compute01] => (item=ceph-mon)
ok: [Compute02] => (item=ceph-mon)
ok: [Compute03] => (item=ceph-mon)
failed: [Compute02] (item=ceph-osd) => {"failed": true, "item": "ceph-osd", "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}
failed: [Compute01] (item=ceph-osd) => {"failed": true, "item": "ceph-osd", "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}
failed: [Compute03] (item=ceph-osd) => {"failed": true, "item": "ceph-osd", "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'ipv4'"}
ok: [Controller01] => (item=ceph-osd)
ok: [Compute02] => (item=ceph-rgw)
ok: [Compute01] => (item=ceph-rgw)
ok: [Compute03] => (item=ceph-rgw)
ok: [Controller01] => (item=ceph-rgw)
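
Per-item failures like the ceph-osd ones above are easy to miss in a long deployment log, because the run keeps going. As a minimal sketch, assuming the Ansible output has been saved as text, a scan along these lines could list them. The function name and the sample text are hypothetical and not part of Kolla-Ansible:

```python
import re

# Matches Ansible per-item failure lines such as:
#   failed: [Compute02] (item=ceph-osd) => {...
FAILED_ITEM = re.compile(r"^failed: \[(?P<host>[^\]]+)\] \(item=(?P<item>[^)]+)\)")


def find_item_failures(log_text):
    """Return (host, item) pairs for every per-item failure in the log text."""
    failures = []
    for line in log_text.splitlines():
        match = FAILED_ITEM.match(line.strip())
        if match:
            failures.append((match.group("host"), match.group("item")))
    return failures


# Hypothetical excerpt in the same shape as the log above.
sample = """\
ok: [Compute03] => (item=ceph-mon)
failed: [Compute02] (item=ceph-osd) => {"failed": true, "item": "ceph-osd"}
ok: [Controller01] => (item=ceph-osd)
"""
print(find_item_failures(sample))
```

Running the scan before investigating the final fatal task may help surface the earliest failure, which is often closer to the root cause.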

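This error is odd, and after a long search online I found no solution aimed specifically at it. The error message does suggest, however, that the cause may be network-related. After trying various approaches, I noticed that the NetworkManager service was still running on every node. I disabled NetworkManager on all nodes and reran the deployment; the error above no longer appeared, and nova-compute on the compute nodes started normally.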
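
The message "'dict object' has no attribute 'ipv4'" is consistent with a template reading the ipv4 entry of an interface fact that has no IPv4 address, although the post does not confirm that this was the exact cause. As a rough sketch, assuming the facts for a node have been gathered (for example with Ansible's setup module) and loaded into a dictionary, a check like this could show which interfaces lack an ipv4 entry. The function name and the sample facts are hypothetical:

```python
def interfaces_without_ipv4(facts):
    """List interfaces whose fact entry has no 'ipv4' key."""
    missing = []
    for name in facts.get("ansible_interfaces", []):
        # Assumption: fact keys use underscores where the interface name has dashes.
        entry = facts.get("ansible_" + name.replace("-", "_"), {})
        if "ipv4" not in entry:
            missing.append(name)
    return missing


# Hypothetical facts for one node; eth1 exists but has no IPv4 address.
sample_facts = {
    "ansible_interfaces": ["eth0", "eth1"],
    "ansible_eth0": {"device": "eth0", "ipv4": {"address": "172.30.220.3"}},
    "ansible_eth1": {"device": "eth1", "active": True},
}
print(interfaces_without_ipv4(sample_facts))
```

If an interface named in the deployment configuration shows up in this list on a failing node, that would point toward the address configuration on that interface.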

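I later read that OpenStack officially recommends against using firewalld and NetworkManager: OpenStack uses iptables, which conflicts with firewalld, and a running NetworkManager configures the network automatically without neutron being aware of it, which leads to strange behavior.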
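
Before rerunning a deployment, it can be useful to confirm on each node that neither service is still running. The following is a minimal sketch, assuming a systemd-based host such as CentOS 7 where `systemctl is-active` is available. The helper name and the injectable `run` parameter are illustrative choices, not part of any OpenStack tooling:

```python
import subprocess

SERVICES = ["NetworkManager", "firewalld"]


def service_is_active(name, run=subprocess.run):
    """Return True if `systemctl is-active <name>` reports 'active'."""
    try:
        result = run(["systemctl", "is-active", name], capture_output=True, text=True)
    except FileNotFoundError:
        # systemctl is not available on this machine; report as not active.
        return False
    return result.stdout.strip() == "active"


if __name__ == "__main__":
    for service in SERVICES:
        state = "active" if service_is_active(service) else "not active"
        print(f"{service}: {state}")
```

The `run` parameter exists only so the check can be exercised without a real systemd host. On the nodes themselves the default is enough.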

Reposted from blog.csdn.net/stpice/article/details/104954238