Ceph from Beginner to Proficient: fixing "stderr raise RuntimeError('Unable to create a new OSD id')"

/bin/podman: stderr raise RuntimeError('Unable to create a new OSD id')

First find and stop the failed osd.0 container:

podman ps | grep osd.0
podman stop <osd.0 container ID>

Then clean up the leftover osd.0 data and try to re-add it:

# change into this cluster's data directory (named after the cluster fsid)
cd /var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/
ls
# remove the leftover osd.0 directory
rm -rf osd.0
ceph orch daemon add osd --method raw ceph02:/dev/sdb
ls
# the add still fails, so wipe the start of the device and retry
dd if=/dev/zero of=/dev/sdb count=2048 bs=1M
ceph orch daemon add osd --method raw ceph02:/dev/sdb
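As an alternative to partially wiping the device with dd, the orchestrator can zap it completely. This is only a sketch using the host and device names from this example; --force is needed because the device still carries data from the old OSD:

# wipe /dev/sdb on ceph02 via cephadm (removes partition/LVM data and Ceph labels)
ceph orch device zap ceph02 /dev/sdb --force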

/bin/podman: stderr RuntimeError: Unable to create a new OSD id

Related bug report: 1932490 – [cephadm] 5.0 - New OSD device to the cluster is not getting the OSD IDs - Unable to allocate new IDs to the new OSD device

[root@ceph02 e8cde810-e4b8-11ed-9ba8-98039b976596]# ceph orch daemon add osd --method raw ceph02:/dev/sdb
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1446, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 414, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs) # noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 841, in _daemon_add_osd
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 228, in raise_if_exception
    raise e
RuntimeError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/mon.ceph02/config
Non-zero exit code 1 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:0109babaa3be5e63fa5cd9dd9a6bbc13e9651278b037d22261a25a70abb27a94 -e NODE_NAME=ceph02 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596:/var/run/ceph:z -v /var/log/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596:/var/log/ceph:z -v /var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp879n_fbx:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp16g0ry_3:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:0109babaa3be5e63fa5cd9dd9a6bbc13e9651278b037d22261a25a70abb27a94 raw prepare --bluestore --data /dev/sdb
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 1ef9800e-ff66-4bb1-bc11-a33dff9c0a59
/bin/podman: stderr  stderr: Error EEXIST: entity osd.0 exists but key does not match
/bin/podman: stderr Traceback (most recent call last):
/bin/podman: stderr   File "/usr/sbin/ceph-volume", line 11, in <module>
/bin/podman: stderr     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/bin/podman: stderr self.main(self.argv);
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
/bin/podman: stderr return f(*a, **kw)
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/bin/podman: stderr     terminal.dispatch(self.mapper, subcommand_args)
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/bin/podman: stderr instance.main()
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
/bin/podman: stderr     terminal.dispatch(self.mapper, self.argv)
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/bin/podman: stderr instance.main()
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 169, in main
/bin/podman: stderr     self.safe_prepare(self.args)
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 91, in safe_prepare
/bin/podman: stderr self.prepare()
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/bin/podman: stderr return func(*a, **kw)
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/prepare.py", line 125, in prepare
/bin/podman: stderr     osd_fsid, json.dumps(secrets))
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 176, in create_id
/bin/podman: stderr     raise RuntimeError('Unable to create a new OSD id')
/bin/podman: stderr RuntimeError: Unable to create a new OSD id
Traceback (most recent call last):
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 9309, in <module>
    main()
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 9297, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1941, in _infer_config
    return func(ctx)
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1872, in _infer_fsid
    return func(ctx)
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1969, in _infer_image
    return func(ctx)
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1859, in _validate_fsid
    return func(ctx)
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 5366, in command_ceph_volume
    out, err, code = call_throws(ctx, c.run_cmd())
  File "/var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1661, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:0109babaa3be5e63fa5cd9dd9a6bbc13e9651278b037d22261a25a70abb27a94 -e NODE_NAME=ceph02 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596:/var/run/ceph:z -v /var/log/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596:/var/log/ceph:z -v /var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp879n_fbx:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp16g0ry_3:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:0109babaa3be5e63fa5cd9dd9a6bbc13e9651278b037d22261a25a70abb27a94 raw prepare --bluestore --data /dev/sdb
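The important line in all of this output is the ceph-volume stderr above: "Error EEXIST: entity osd.0 exists but key does not match". The cluster still holds an auth entry for the old osd.0, so "osd new" cannot allocate the id again. A quick way to confirm this (a sketch; osd.0 is the id from this example):

# the stale auth entry is still present even though the daemon and its data are gone
ceph auth get osd.0
# check whether the id still appears in the OSD/CRUSH tree
ceph osd tree | grep osd.0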

Reference: "Notes on the pitfalls of creating OSDs on multiple nodes at the same time" | Struggling Squirrel blog (strugglesquirrel.com)

# remove the stale osd.0 auth entry
ceph auth list | grep osd.0
ceph auth del osd.0
# re-add the OSD
ceph orch daemon add osd --method raw ceph02:/dev/sdb
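Once the stale key is gone the add goes through. A quick sanity check (a sketch, assuming the daemon comes back as osd.0 on ceph02):

# the OSD should reappear in the tree and come up/in shortly
ceph osd tree
ceph orch ps | grep osd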
The full procedure for completely deleting an OSD is as follows.
Remove OSD (manually) 
You can remove an OSD at runtime when you want to reduce the size of the cluster or replace hardware. With Ceph, an OSD is generally one ceph-osd daemon for one storage drive within a host machine. If your host has multiple storage drives, you may need to remove one ceph-osd daemon for each drive. Generally, it is a good idea to check the capacity of your cluster to see whether you are reaching the upper end of its capacity. Ensure that when you remove an OSD your cluster is not at its near full ratio.
Warning
Do not let your cluster reach its full ratio when removing an OSD. Removing OSDs could cause the cluster to reach or exceed its full ratio.
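To see how close the cluster is to those thresholds before taking an OSD out, something like the following can be used (a sketch; the values reported will of course differ per cluster):

# overall and per-pool utilisation
ceph df
# per-OSD utilisation and weights
ceph osd df
# the configured nearfull/backfillfull/full ratios
ceph osd dump | grep -i ratio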
Take the OSD out of the cluster
Before you remove an OSD, it is usually up and in. You need to take it out of the cluster so that Ceph can begin rebalancing and copying its data to other OSDs:
ceph osd out {osd-num}
Watch Data Migration
Once you have taken the OSD out of the cluster, Ceph begins rebalancing by migrating placement groups out of the OSD you removed. You can observe this process with the ceph tool:
ceph -w
You should see the placement group states change from active+clean to active, some degraded objects, and finally back to active+clean when the migration completes. (Press Control-C to exit.)
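If you prefer a one-shot view instead of the streaming ceph -w output, the same information is visible in the status commands (a sketch):

# summary of PG states and any degraded/misplaced objects
ceph -s
ceph pg stat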
Note
Sometimes, usually in a "small" cluster with few hosts (for instance a small test cluster), taking the OSD out can create a CRUSH corner case where some PGs remain stuck in the active+remapped state. If this happens, you should mark the OSD back in with:
ceph osd in {osd-num}
to return to the initial state. Then, instead of marking the OSD out, set its CRUSH weight to 0:
ceph osd crush reweight osd.{osd-num} 0
After that, you can observe the data migration, which should come to an end. The difference between marking an OSD out and reweighting it to 0 is that in the first case the weight of the bucket containing the OSD is not changed, whereas in the second case the weight of the bucket is updated (and decreased by the OSD weight). The reweight command may sometimes be preferred in the case of "small" clusters.
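As a concrete illustration of the two approaches (osd.1 is used purely as an example id):

# approach 1: mark the OSD out (the weight of its bucket is unchanged)
ceph osd out 1
# approach 2: bring it back in and drain it by zeroing its CRUSH weight (the bucket weight is updated)
ceph osd in 1
ceph osd crush reweight osd.1 0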
Stop OSD
After you take an OSD out of the cluster, it may still be running; that is, the OSD may be up and out. You must stop the OSD before removing it from the configuration:
ssh {osd-host}
sudo systemctl stop ceph-osd@{osd-num}
Once you stop your OSD, it's down.
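Note that the systemctl unit above applies to a package-based deployment; on a cephadm/containerized cluster like the one in this post the unit is scoped by the cluster fsid, or the orchestrator can stop the daemon for you (a sketch, using the fsid from this example and osd.1 as a placeholder id):

# stop the daemon via the orchestrator
ceph orch daemon stop osd.1
# or stop the fsid-scoped systemd unit directly on the host
sudo systemctl stop ceph-e8cde810-e4b8-11ed-9ba8-98039b976596@osd.1.service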
Remove OSD
This procedure removes the OSD from the cluster map, removes its authentication key, removes the OSD from the OSD map, and removes the OSD from the ceph.conf file. If your host has multiple drives, you may need to remove an OSD for each drive by repeating this process.
First let the cluster forget the OSD. This step removes the OSD from the CRUSH map, removes its authentication key, and removes it from the OSD map. Note that the purge subcommand was introduced in Luminous; for older releases, see the manual steps further below:
ceph osd purge {id} --yes-i-really-mean-it
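For example, to purge the osd.0 from the scenario at the top of this post (illustrative only; double-check the id before running it):

# removes osd.0 from the CRUSH map, deletes its auth key and removes it from the OSD map in one step
ceph osd purge 0 --yes-i-really-mean-it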
Navigate to the host where you keep the master copy of the cluster's ceph.conf file:
ssh {admin-host}
cd /etc/ceph
vim ceph.conf
Remove the OSD entry from the file ceph.conf (if present):
[osd.1]
        host = {hostname}
From the host that holds the master copy of the cluster's ceph.conf, copy the updated ceph.conf to the /etc/ceph directory of the other hosts in your cluster.
If your Ceph cluster is older than Luminous, instead of using ceph osd purge you will need to perform the following steps manually:
Remove the OSD from the CRUSH map so that it no longer receives data. You may also decompile the CRUSH map, remove the OSD from the device list, remove the device as an item in the host bucket or remove the host bucket (if it is in the CRUSH map and you intend to remove the host), recompile the map and set it. See Remove an OSD for details:
ceph osd crush remove {name}
Delete the OSD authentication key:
ceph auth del osd.{osd-num}
The ceph in the ceph-{osd-num} path comes from the $cluster-$id convention; if your cluster name differs from ceph, use your cluster name instead.
Finally, remove the OSD:
ceph osd rm {osd-num}
For example:
ceph osd rm 1
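Putting the pre-Luminous manual steps together for the same example id, the sequence below is roughly equivalent to ceph osd purge 1:

# remove it from the CRUSH map so it stops receiving data
ceph osd crush remove osd.1
# delete its authentication key
ceph auth del osd.1
# remove it from the OSD map
ceph osd rm 1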

Adding/Removing OSDs — Ceph Documentation

Original post: blog.csdn.net/wxb880114/article/details/130500346