Kolla-Ansible deployment of OpenStack Ocata succeeds, but the Ceph OSDs do not start

Environment:

OpenStack release: Ocata

Number of nodes: 4

Host OS on each node: CentOS 7.7

While deploying OpenStack Ocata with Kolla-Ansible, using Ceph as the storage backend, the deployment itself completed successfully, yet none of the Ceph OSDs on the storage nodes were running, which seemed very strange.
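Before digging into the logs, the symptom can be confirmed on a storage node by looking at the Ceph containers. A minimal check; the container names ceph_mon and ceph_osd_<n> are Kolla's usual defaults and are my assumption here, not taken from the original log:

# List the Ceph containers Kolla started on this node; a healthy storage node
# shows one ceph_osd_<n> container per OSD disk, but here none appear.
docker ps --format '{{.Names}}\t{{.Status}}' | grep ceph

# Ask the cluster itself; the OSD count stays at 0 when bootstrap found no disks.
docker exec ceph_mon ceph -s

The deployment log contained the following: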

TASK [ceph : Looking up disks to bootstrap for Ceph OSDs] *********************************************************************************************
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_lookup.stdout.find('localhost | SUCCESS => ') != -1 and (osd_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute01]
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_lookup.stdout.find('localhost | SUCCESS => ') != -1 and (osd_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute03]
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_lookup.stdout.find('localhost | SUCCESS => ') != -1 and (osd_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute02]

TASK [ceph : Parsing disk info for Ceph OSDs] *********************************************************************************************************
ok: [Compute01]
ok: [Compute02]
ok: [Compute03]

TASK [ceph : Looking up disks to bootstrap for Ceph Cache OSDs] ***************************************************************************************
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_cache_lookup.stdout.find('localhost | SUCCESS => ') != -1 and (osd_cache_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_cache_lookup.stdout.find('localhost | SUCCESS => ') != -1 and (osd_cache_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute01]
ok: [Compute03]
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ osd_cache_lookup.stdout.find('localhost | SUCCESS => ') != -1 and (osd_cache_lookup.stdout.split('localhost | SUCCESS => ')[1]|from_json).changed }}

ok: [Compute02]

...

The task that looks up the disks to bootstrap for the Ceph OSDs completed successfully; there were no errors in it, only some warnings. At first I assumed the warnings were the cause and spent half a day chasing them, without result.

Finally I decided to change tack and re-ran the deploy, this time adding the -vvv flag to print detailed logs.
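For reference, a sketch of the re-run; the multinode inventory name and the log path are my own placeholders, not taken from the original post:

# Re-run the deployment with verbose Ansible output and keep a copy for grepping.
kolla-ansible -i multinode deploy -vvv 2>&1 | tee /tmp/kolla-deploy.log

Combing through that verbose output turned up the following: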

TASK [ceph : Looking up disks to bootstrap for Ceph OSDs] *********************************************************************************************
...
中间省略几十行
...
ok: [Compute01] => {
    "changed": false,
    "cmd": [
        "docker",
        "exec",
        "-t",
        "kolla_toolbox",
        "sudo",
        "-E",
        "/usr/bin/ansible",
        "localhost",
        "-m",
        "find_disks",
        "-a",
        "partition_name='KOLLA_CEPH_OSD_BOOTSTRAP' match_mode='prefix' use_udev=True"
    ],
    "delta": "0:00:01.029759",
    "end": "2020-03-11 14:37:57.020245",
    "failed": false,
    "failed_when_result": false,
    "invocation": {
    
    
        "module_args": {
    
    
            "_raw_params": "docker exec -t kolla_toolbox sudo -E /usr/bin/ansible localhost -m find_disks -a \"partition_name='KOLLA_CEPH_OSD_BOOTSTRAP' match_mode='prefix' use_udev=True\"",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        }
    },
    "rc": 0,
    "start": "2020-03-11 14:37:55.990486",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "localhost | SUCCESS => {\r\n    \"changed\": false, \r\n    \"disks\": \"[]\"\r\n}",
    "stdout_lines": [
            "localhost | SUCCESS => {",
        "    \"changed\": false, ",
        "    \"disks\": \"[]\"",
        "}"
    ]
}
...

The detailed JSON result shows that the returned disks value is an empty array []. In other words, Kolla did not find any disks usable as OSDs. But a disk had in fact been prepared as an OSD on every compute (storage) node, so the problem had to lie in how those OSD disks were prepared.
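At this point it is worth checking what partition labels actually exist on a storage node, since those labels are exactly what the lookup matches against. A quick check, assuming the OSD disk is /dev/sdb (the device name is only an example):

# Show the GPT partition labels udev reports; Kolla's disk lookup keys off these.
lsblk -o NAME,SIZE,TYPE,PARTLABEL

# Or inspect a single partition directly (partition 1 of /dev/sdb here).
sgdisk -i 1 /dev/sdb | grep 'Partition name'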

According to the official documentation, there are two ways to prepare the Ceph disks: without a dedicated journal device, or with one. I chose to let Ceph create the journal itself rather than pointing it at a separate drive. The commands that should be used are:

parted /dev/sdb -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdb print
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start   End     Size    File system  Name                      Flags
 1      1049kB  10.7GB  10.7GB               KOLLA_CEPH_OSD_BOOTSTRAP

The key point is the partition name used when labeling the disk: KOLLA_CEPH_OSD_BOOTSTRAP. The File system column may look different depending on the system, but that does not matter; once the OSD is up and Ceph has taken the disk over, it will show as the Ceph OSD type. My mistake was to append a suffix to this label when preparing the disks, setting it to something like KOLLA_CEPH_OSD_BOOTSTRAP1. When no external journal drive is used, a suffixed label makes Kolla ignore those disks while scanning for Ceph OSD disks, and that is exactly why no OSD disks were found.
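To make the mistake concrete, here is roughly the difference between what I had effectively done and what the co-located-journal scheme expects (the /dev/sdb device is the same example as above):

# What I had done: the label carries an extra suffix, yet no matching *_J journal
# partition exists anywhere, so the disk is silently dropped later.
parted /dev/sdb -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP1 1 -1

# What it should be when the journal lives on the same disk: the exact label, no suffix.
parted /dev/sdb -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1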

The official documentation, by contrast, requires a suffix on the label when a dedicated journal drive is configured, so that each storage drive can be matched to its journal drive. It reads:

Prepare the storage drive in the same way as documented above:

# <WARNING ALL DATA ON $DISK will be LOST!>
# where $DISK is /dev/sdb or something similar
parted $DISK -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_FOO 1 -1

To prepare the journal external drive execute the following command:

# <WARNING ALL DATA ON $DISK will be LOST!>
# where $DISK is /dev/sdc or something similar
parted $DISK -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_FOO_J 1 -1
Note: Use different suffixes (_42, _FOO, _FOO42, ..) to use different external journal drives for different storage drives. One external journal drive can only be used for one storage drive.

Note: The partition labels KOLLA_CEPH_OSD_BOOTSTRAP and KOLLA_CEPH_OSD_BOOTSTRAP_J are not working when using external journal drives. It is required to use suffixes (_42, _FOO, _FOO42, ..). If you want to setup only one storage drive with one external journal drive it is also necessary to use a suffix.

So whichever of the two schemes you use, the disk labels must match it exactly; mixing them up will leave the OSDs unable to start.

As for why a suffixed label keeps the OSDs from coming up when no dedicated journal is used, the log above already hints at the answer. Look at this command:

docker exec -t kolla_toolbox sudo -E /usr/bin/ansible localhost -m find_disks -a "partition_name='KOLLA_CEPH_OSD_BOOTSTRAP' match_mode='prefix' use_udev=True"

During deployment, Kolla actually performs these operations by executing commands inside the kolla_toolbox container on each node. Here it invokes the find_disks module, which corresponds to a Python file inside kolla_toolbox with the following content:

#!/usr/bin/python

# Copyright 2015 Sam Yaple
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This module has been relicensed from the source below:
# https://github.com/SamYaple/yaodu/blob/master/ansible/library/ceph_osd_list

DOCUMENTATION = '''
---
module: find_disks
short_description: Return list of devices containing a specfied name or label
description:
     - This will return a list of all devices with either GPT partition name
       or filesystem label of the name specified.
options:
  match_mode:
    description:
      - Label match mode, either strict or prefix
    default: 'strict'
    required: False
    choices: [ "strict", "prefix" ]
    type: str
  name:
    description:
      - Partition name or filesystem label
    required: True
    type: str
    aliases: [ 'partition_name' ]
  use_udev:
    description:
      - When True, use Linux udev to read disk info such as partition labels,
        uuid, etc.  Some older host operating systems have issues using udev to
        get the info this module needs. Set to False to fall back to more low
        level commands such as blkid to retrieve this information. Most users
        should not need to change this.
    default: True
    required: False
    type: bool
author: Sam Yaple
'''

EXAMPLES = '''
- hosts: ceph-osd
  tasks:
    - name: Return all valid formated devices with the name KOLLA_CEPH_OSD
      find_disks:
          name: 'KOLLA_CEPH_OSD'
      register: osds

- hosts: swift-object-server
  tasks:
    - name: Return all valid devices with the name KOLLA_SWIFT
      find_disks:
          name: 'KOLLA_SWIFT'
      register: swift_disks

- hosts: swift-object-server
  tasks:
    - name: Return all valid devices with wildcard name 'swift_d*'
      find_disks:
          name: 'swift_d' match_mode: 'prefix'
      register: swift_disks
'''

import json
import pyudev
import re
import subprocess  # nosec


def get_id_part_entry_name(dev, use_udev):
    if use_udev:
        dev_name = dev.get('ID_PART_ENTRY_NAME', '')
    else:
        part = re.sub(r'.*[^\d]', '', dev.device_node)
        parent = dev.find_parent('block').device_node
        # NOTE(Mech422): Need to use -i as -p truncates the partition name
        out = subprocess.Popen(['/usr/sbin/sgdisk', '-i', part,  # nosec
                                parent],
                               stdout=subprocess.PIPE).communicate()
        match = re.search(r'Partition name: \'(\w+)\'', out[0])
        if match:
            dev_name = match.group(1)
        else:
            dev_name = ''
    return dev_name


def get_id_fs_uuid(dev, use_udev):
    if use_udev:
        id_fs_uuid = dev.get('ID_FS_UUID', '')
    else:
        out = subprocess.Popen(['/usr/sbin/blkid', '-o', 'export',  # nosec
                                dev.device_node],
                               stdout=subprocess.PIPE).communicate()
        match = re.search(r'\nUUID=([\w-]+)', out[0])
        if match:
            id_fs_uuid = match.group(1)
        else:
            id_fs_uuid = ''
    return id_fs_uuid


def is_dev_matched_by_name(dev, name, mode, use_udev):
    if dev.get('DEVTYPE', '') == 'partition':
        dev_name = get_id_part_entry_name(dev, use_udev)
    else:
        dev_name = dev.get('ID_FS_LABEL', '')

    if mode == 'strict':
        return dev_name == name  # must be exactly equal; name is the argument passed in, here KOLLA_CEPH_OSD_BOOTSTRAP
    elif mode == 'prefix':
        return dev_name.startswith(name)
    else:
        return False


def find_disk(ct, name, match_mode, use_udev):
    for dev in ct.list_devices(subsystem='block'):
        if is_dev_matched_by_name(dev, name, match_mode, use_udev):
            yield dev


def extract_disk_info(ct, dev, name, use_udev):
    if not dev:
        return
    kwargs = dict()
    kwargs['fs_uuid'] = get_id_fs_uuid(dev, use_udev)
    kwargs['fs_label'] = dev.get('ID_FS_LABEL', '')
    if dev.get('DEVTYPE', '') == 'partition':
        kwargs['device'] = dev.find_parent('block').device_node
        kwargs['partition'] = dev.device_node
        kwargs['partition_num'] = \
            re.sub(r'.*[^\d]', '', dev.device_node)
        if is_dev_matched_by_name(dev, name, 'strict', use_udev):  # strict check: the label must equal name (KOLLA_CEPH_OSD_BOOTSTRAP) exactly, see is_dev_matched_by_name above
            kwargs['external_journal'] = False  # in this case no external journal is used
            kwargs['journal'] = dev.device_node[:-1] + '2'
            kwargs['journal_device'] = kwargs['device']
            kwargs['journal_num'] = 2
        else:  # the label merely starts with KOLLA_CEPH_OSD_BOOTSTRAP, i.e. it carries a suffix
            kwargs['external_journal'] = True  # an external journal is expected
            journal_name = get_id_part_entry_name(dev, use_udev) + '_J'  # the journal partition label is the OSD label plus '_J', matched strictly
            for journal in find_disk(ct, journal_name, 'strict', use_udev):
                kwargs['journal'] = journal.device_node
                kwargs['journal_device'] = \
                    journal.find_parent('block').device_node
                kwargs['journal_num'] = \
                    re.sub(r'.*[^\d]', '', journal.device_node)
                break
            if 'journal' not in kwargs:  # with my labels this branch was taken and the function simply returned
                # NOTE(SamYaple): Journal not found, not returning info
                return
    else:
        kwargs['device'] = dev.device_node
    yield kwargs


def main():
    argument_spec = dict(
        match_mode=dict(required=False, choices=['strict', 'prefix'],
                        default='strict'),  # the playbook passes match_mode='prefix'
        name=dict(aliases=['partition_name'], required=True, type='str'), # partition_name='KOLLA_CEPH_OSD_BOOTSTRAP'
        use_udev=dict(required=False, default=True, type='bool')
    )
    module = AnsibleModule(argument_spec)
    match_mode = module.params.get('match_mode')  # in this run: 'prefix'
    name = module.params.get('name')  # in this run: 'KOLLA_CEPH_OSD_BOOTSTRAP'
    use_udev = module.params.get('use_udev')

    try:
        ret = list()
        ct = pyudev.Context()
        for dev in find_disk(ct, name, match_mode, use_udev):  # first pass in prefix mode: every partition whose label starts with KOLLA_CEPH_OSD_BOOTSTRAP is a hit
            for info in extract_disk_info(ct, dev, name, use_udev):  # this function then uses the exact label form to decide whether an external journal disk is expected
                if info:
                    ret.append(info)

        module.exit_json(disks=json.dumps(ret))
    except Exception as e:
        module.exit_json(failed=True, msg=repr(e))

# import module snippets
from ansible.module_utils.basic import *  # noqa
if __name__ == '__main__':
    main()

The code above makes it clear that find_disks.py distinguishes purely by partition label whether an external journal drive is used or part of the OSD disk itself becomes the journal partition. Under that scheme, the suffix I added while labeling the OSD disks made the module assume an external journal drive existed; since no matching *_J partition could be found, it simply returned, which is why the final result was an empty array. After I changed the OSD disk labels on every node back to KOLLA_CEPH_OSD_BOOTSTRAP and re-ran the deploy, all the OSDs were created without problems.
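If the partitions already exist, the fix does not require repartitioning: the GPT partition name can be changed in place on each storage node before deploying again. A sketch, assuming the OSD partition is partition 1 of /dev/sdb and a multinode inventory file; adjust both to your environment:

# Rename the existing GPT partition label so the strict match in find_disks succeeds.
sgdisk -c 1:KOLLA_CEPH_OSD_BOOTSTRAP /dev/sdb
# (parted /dev/sdb -s name 1 KOLLA_CEPH_OSD_BOOTSTRAP does the same thing)

# Let the kernel and udev re-read the partition table, then deploy again.
partprobe /dev/sdb
kolla-ansible -i multinode deploy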

Reposted from blog.csdn.net/stpice/article/details/104955726