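ZStack troubleshooting walkthrough

Fault description

  After coming back from the Dragon Boat Festival holiday, I created an image from a cloud host and then tried to create a new cloud host from that image template, which failed with an error. I then rebuilt the image server and recreated the image, but neither helped: the error was the same every time.

Error details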

Request: {
    "type": "post",
    "path": "longjobs",
    "body": {
        "params": {
            "jobName": "APICreateRootVolumeTemplateFromRootVolumeMsg",
            "jobData": "{\"name\":\"Oracle11gClientDemo\",\"description\":\"\",\"platform\":\"Linux\",\"backupStorageUuids\":[\"7b76ec960040427eb2d4673e9dace169\"],\"resourceUuid\":\"9ec8b2dad9623fe0b50d7bd458549ce7\",\"rootVolumeUuid\":\"7ae63e52f3af45c3909d225387e99a6a\"}"
        }
    },
    "sessionUuid": "99c3730f5e7c47789b6c1d85debb1f74",
    "jobUuid": "9ec8b2dad9623fe0b50d7bd458549ce7"
}
Response: {
    "inventory": {
        "uuid": "5d31f73c55964a61a09f02cc13557553",
        "name": "APICreateRootVolumeTemplateFromRootVolumeMsg",
        "apiId": "9ec8b2dad9623fe0b50d7bd458549ce7",
        "jobName": "APICreateRootVolumeTemplateFromRootVolumeMsg",
        "jobData": "{\"name\":\"Oracle11gClientDemo\",\"description\":\"\",\"backupStorageUuids\":[\"7b76ec960040427eb2d4673e9dace169\"],\"rootVolumeUuid\":\"7ae63e52f3af45c3909d225387e99a6a\",\"platform\":\"Linux\",\"system\":false,\"resourceUuid\":\"9ec8b2dad9623fe0b50d7bd458549ce7\",\"session\":{\"uuid\":\"99c3730f5e7c47789b6c1d85debb1f74\",\"accountUuid\":\"36c27e8ff05c4780bf6d2fa65700f22e\",\"userUuid\":\"36c27e8ff05c4780bf6d2fa65700f22e\",\"expiredDate\":\"Jun 20, 2018 12:27:06 PM\",\"createDate\":\"Jun 20, 2018 10:27:06 AM\"},\"timeout\":-1,\"headers\":{},\"id\":\"5c7036c2bbd948e9b64c33c1950a013b\",\"serviceId\":\"storage.backup.imagestore.774bbff9b1504cbcad616ca4282519cf\",\"createdTime\":1529465672420}",
        "jobResult": "Failed : {\"causes\":[],\"code\":\"SYS.1006\",\"description\":\"An operation failed\",\"details\":\"在所有镜像服务器上从根云盘[uuid:7ae63e52f3af45c3909d225387e99a6a]创建镜像失败,查看错误原因\",\"cause\":{\"causes\":[],\"code\":\"SYS.1006\",\"description\":\"An operation failed\",\"details\":\"failed to create template from root volume[uuid:7ae63e52f3af45c3909d225387e99a6a] on primary storage[uuid:f703c13570354025875bb9c07dbd4471]\",\"cause\":{\"causes\":[],\"code\":\"SYS.1006\",\"description\":\"An operation failed\",\"details\":\"无法从本地存储[uuid:f703c13570354025875bb9c07dbd4471, path:/zstack_ps2/templateWorkspace/image-9ec8b2dad9623fe0b50d7bd458549ce7/9ec8b2dad9623fe0b50d7bd458549ce7.qcow2]上传数据到镜像仓库[主机名:172.16.23.5],因为failed to execute shell command: /usr/local/zstack/imagestore/bin/zstcli -rootca /var/lib/zstack/imagestorebackupstorage/package/certs/ca.pem -json  -callbackurl http://172.16.23.6:8080/zstack/asyncrest/callback -taskid b91e06fdcb884fdaa666a27e16520309 -imageUuid 9ec8b2dad9623fe0b50d7bd458549ce7 add -desc '{\\\"name\\\":\\\"Oracle11gClientDemo\\\",\\\"desc\\\":\\\"\\\",\\\"mediaType\\\":\\\"RootVolumeTemplate\\\",\\\"platform\\\":\\\"Linux\\\",\\\"format\\\":\\\"qcow2\\\",\\\"actualSize\\\":19362873344}' -file /zstack_ps2/templateWorkspace/image-9ec8b2dad9623fe0b50d7bd458549ce7/9ec8b2dad9623fe0b50d7bd458549ce7.qcow2\\nreturn code: 127\\nstdout: \\nstderr: /bin/bash: /usr/local/zstack/imagestore/bin/zstcli: No such file or directory\\n\"}}}",
        "state": "Failed",
        "managementNodeUuid": "774bbff9b1504cbcad616ca4282519cf",
        "createDate": "Jun 20, 2018 11:34:32 AM",
        "lastOpDate": "Jun 20, 2018 11:34:32 AM"
    },
    "jobUuid": "9ec8b2dad9623fe0b50d7bd458549ce7",
    "success": false,
    "sessionUuid": "99c3730f5e7c47789b6c1d85debb1f74"
}
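
The useful part of the response is buried at the end of the nested jobResult string. As a rough sketch, assuming the response has been saved to a file, the return code and stderr text can be pulled out with grep. extract_failure is a hypothetical helper name, and the patterns rely only on the "return code:" and "stderr:" markers visible in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the shell return code and the stderr text
# found in a saved API response. Each match ends at the first backslash,
# which is where the escaped "\n" begins in the jobResult string.
extract_failure() {
    grep -o -e 'return code: [0-9]*' -e 'stderr: [^\\]*' "$1"
}
```

For the response above, this should surface the two facts that matter: return code 127 and the "No such file or directory" message for /usr/local/zstack/imagestore/bin/zstcli.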

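Checking on management node .6 (172.16.23.6), the file is indeed missing. Only the image server .5 (172.16.23.5) has it; none of the other 6 nodes do.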
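
A check like this is easy to repeat on each node. The sketch below is an illustration, not part of the original procedure: check_zstcli is a hypothetical helper, and the default path is the one taken from the error message.

```shell
#!/usr/bin/env bash
# Default path taken from the error message; override via ZSTCLI_PATH.
ZSTCLI_PATH="${ZSTCLI_PATH:-/usr/local/zstack/imagestore/bin/zstcli}"

# Hypothetical helper: report whether zstcli exists and is executable.
# Returns non-zero when the file is missing.
check_zstcli() {
    local path="${1:-$ZSTCLI_PATH}"
    if [ -x "$path" ]; then
        echo "present: $path"
    else
        echo "missing: $path"
        return 1
    fi
}

# Example (placeholder host list, run manually):
#   for h in 172.16.23.5 172.16.23.6; do
#       ssh "$h" "$(declare -f check_zstcli); check_zstcli /usr/local/zstack/imagestore/bin/zstcli"
#   done
```

The path is passed explicitly in the ssh example because the ZSTCLI_PATH variable is not defined on the remote side.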

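Further investigation showed that all 7 nodes had become management nodes. At install time, only two nodes had the management node option selected in the installation image; all the others were installed as compute nodes. After asking around, it turned out that a colleague had run the following command on every node:

bash zstack-installer.bin

This manually turned every node into a management node.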

As for why every node being a management node caused the file

/usr/local/zstack/imagestore/bin/zstcli

to go missing, I have not worked out the logic.

Solution:

1. Clean up the management node files:

zstack-ctl stop

rm -rf /usr/local/zstack

Run the two commands above on every node except .6 (.6 is the node I plan to keep as the management node).

2. Then run the command below to generate the /usr/local/zstack/imagestore directory and the files inside it.

bash /var/lib/zstack/imagestorebackupstorage/package/zstack-store.bin

After that, everything worked.
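
Because step 1 deletes /usr/local/zstack on several machines, it may be worth printing the commands before running anything. The following is only a sketch: plan_cleanup is a hypothetical helper, the node names are placeholders, and chaining with && (so the directory is removed only if zstack-ctl stop succeeds) is my own assumption rather than part of the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the step 1 cleanup command for
# every node except the one that should stay a management node.
# Usage: plan_cleanup <node-to-keep> <node> [<node> ...]
plan_cleanup() {
    local keep="$1"
    shift
    local node
    for node in "$@"; do
        if [ "$node" = "$keep" ]; then
            continue  # leave the planned management node untouched
        fi
        echo "ssh $node 'zstack-ctl stop && rm -rf /usr/local/zstack'"
    done
}
```

Calling plan_cleanup with the node to keep followed by the full node list prints one ssh line per remaining node, which can be reviewed and then executed by hand.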


Reposted from blog.csdn.net/kadwf123/article/details/80744013