Symptom: adding an OSD fails with:

/bin/podman: stderr raise RuntimeError('Unable to create a new OSD id')

First, find and stop the old osd.0 container:

podman ps | grep osd.0
podman stop <container-id>

Then try to re-add osd.0.
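Pulling the container ID out of the `podman ps` output can be sketched as follows. The sample line and ID below are illustrative, not taken from this cluster; cephadm names OSD containers `ceph-<fsid>-osd-<id>`.

```shell
# Illustrative `podman ps` output line for an OSD container (assumed format).
# On a real host you would feed this from: podman ps | grep osd.0
sample='1a2b3c4d5e6f  quay.io/ceph/ceph:v17  ...  ceph-e8cde810-e4b8-11ed-9ba8-98039b976596-osd-0'
cid=$(echo "$sample" | awk '{print $1}')   # first column is the container ID
echo "podman stop $cid"                    # the stop command you would then run
```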
Change to the cluster directory, remove the leftover osd.0 directory, wipe the disk, and retry (shell history shown with the history numbers stripped):

cd /var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/
ls
rm -rf osd.0
ceph orch daemon add osd --method raw ceph02:/dev/sdb
ls
dd if=/dev/zero of=/dev/sdb count=2048 bs=1M
ceph orch daemon add osd --method raw ceph02:/dev/sdb
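A quick sanity check on the dd step above: count=2048 blocks at bs=1M zeroes the first 2 GiB of the device, which is enough to destroy the old BlueStore label and partition metadata so ceph-volume sees a clean disk.

```shell
# How much of the device the wipe covers: count=2048 blocks of bs=1M each.
count=2048
bs=$((1024 * 1024))        # 1M in bytes
total=$((count * bs))
echo "$total bytes ($((total / 1024 / 1024 / 1024)) GiB) are zeroed"
```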
The re-add fails again with the same error. The full trace (line wrapping repaired, repeated frames and the long podman invocation abbreviated):

[root@ceph02 e8cde810-e4b8-11ed-9ba8-98039b976596]# ceph orch daemon add osd --method raw ceph02:/dev/sdb
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1446, in _handle_command
    return self.handle_command(inbuf, cmd)
  ...
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 841, in _daemon_add_osd
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 228, in raise_if_exception
    raise e
RuntimeError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/e8cde810-e4b8-11ed-9ba8-98039b976596/mon.ceph02/config
Non-zero exit code 1 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init ... quay.io/ceph/ceph@sha256:0109babaa3be5e63fa5cd9dd9a6bbc13e9651278b037d22261a25a70abb27a94 raw prepare --bluestore --data /dev/sdb
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 1ef9800e-ff66-4bb1-bc11-a33dff9c0a59
/bin/podman: stderr  stderr: Error EEXIST: entity osd.0 exists but key does not match
/bin/podman: stderr Traceback (most recent call last):
/bin/podman: stderr   File "/usr/sbin/ceph-volume", line 11, in <module>
/bin/podman: stderr     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
  ...
/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 176, in create_id
/bin/podman: stderr     raise RuntimeError('Unable to create a new OSD id')
/bin/podman: stderr RuntimeError: Unable to create a new OSD id
RuntimeError: Failed command: /bin/podman run --rm ... raw prepare --bluestore --data /dev/sdb

The telling line is "Error EEXIST: entity osd.0 exists but key does not match": wiping the disk did not remove the old osd.0 entries from the cluster itself, so ceph-volume cannot allocate the OSD id.
The fix is to remove the stale osd.0 authentication entry and then re-add the OSD:

# confirm the stale entry is still present
ceph auth list | grep osd.0
# remove it
ceph auth del osd.0
# re-add the OSD
ceph orch daemon add osd --method raw ceph02:/dev/sdb

To delete an OSD completely, use the manual removal procedure below.
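The cleanup above can be wrapped in a small dry-run script. The `run` helper is hypothetical and only echoes each command (swap its body for `eval "$@"` to execute for real); host and device names are the ones from this log.

```shell
# Dry-run: run() prints each command instead of executing it.
run() { printf '+ %s\n' "$*"; }

osd=osd.0
run ceph auth del "$osd"                                   # drop the stale key
run ceph orch daemon add osd --method raw ceph02:/dev/sdb  # re-add the disk
```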
Remove OSD (manually)

You can remove an OSD at runtime when you want to shrink the cluster or replace hardware. In Ceph, an OSD is generally one ceph-osd daemon backing one storage drive on a host; if a host has multiple storage drives, you may need to remove one ceph-osd daemon per drive. Before removing OSDs, check the cluster's capacity to see whether it is approaching its limit, and make sure the removal will not push it past its near full ratio.

Warning: do not let your cluster reach its full ratio when removing an OSD. Removing OSDs can cause the cluster to reach or exceed its full ratio.

Take the OSD out of the cluster

Before you remove an OSD, it is usually up and in. You need to take it out of the cluster so Ceph can begin rebalancing and copying its data to other OSDs:

ceph osd out {osd-num}

Observe the data migration

Once the OSD is out of the cluster, Ceph begins rebalancing by migrating placement groups off the removed OSD. You can observe this process with the ceph tool:

ceph -w

You should see the placement group states change from active+clean to active, some degraded objects, and finally back to active+clean when the migration completes. (Ctrl-C to exit.)

Note: sometimes, typically in a "small" cluster with few hosts (for instance a small test cluster), taking the OSD out can hit a CRUSH corner case where some PGs remain stuck in the active+remapped state. If this happens, mark the OSD in again:

ceph osd in {osd-num}

to return to the initial state, and then, instead of marking the OSD out, set its weight to 0:

ceph osd crush reweight osd.{osd-num} 0

After that, you can observe the data migration, which should run to completion. The difference between marking an OSD out and reweighting it to 0 is that in the first case the weight of the bucket containing the OSD is not changed, whereas in the second case the bucket's weight is updated (decreased by the OSD's weight).
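The two alternatives above (marking the OSD out vs. reweighting it to 0) can be sketched side by side as a dry-run, with a hypothetical `run` echo helper and a placeholder OSD number:

```shell
# Dry-run helper: prints commands instead of executing them.
run() { printf '+ %s\n' "$*"; }
OSD_NUM=3   # placeholder

# Option 1: mark it out (the containing bucket's weight is unchanged)
run ceph osd out "$OSD_NUM"

# Option 2, for small clusters: leave it in, drop its CRUSH weight to 0
# (the bucket's weight is decreased by the OSD's weight)
run ceph osd crush reweight "osd.$OSD_NUM" 0
```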
In a "small" cluster, the reweight command may therefore sometimes be preferred.

Stop the OSD

After you take an OSD out of the cluster, it may still be running; that is, the OSD may be up and out. You must stop the OSD before removing it from the configuration:

ssh {osd-host}
sudo systemctl stop ceph-osd@{osd-num}

Once you stop the OSD, it is down.

Remove the OSD

This procedure removes the OSD from the cluster map, removes its authentication key, removes the OSD from the OSD map, and removes the OSD from the ceph.conf file. If your host has multiple drives, you may need to repeat this procedure for each drive.

1. Let the cluster forget the OSD first. This step removes the OSD from the CRUSH map, removes its authentication key, and removes it from the OSD map. (The purge subcommand was introduced in Luminous; for older releases, see below.)

ceph osd purge {id} --yes-i-really-mean-it

2. Navigate to the host where you keep the master copy of the cluster's ceph.conf file:

ssh {admin-host}
cd /etc/ceph
vim ceph.conf

3. Remove the OSD entry from ceph.conf (if present):

[osd.1]
host = {hostname}

4. From the host holding the master copy, copy the updated ceph.conf to the /etc/ceph directory of the other hosts in the cluster.

If your Ceph cluster is older than Luminous, perform the steps of ceph osd purge manually instead:

1. Remove the OSD from the CRUSH map so that it no longer receives data. You may also decompile the CRUSH map, remove the OSD from the device list, remove the device as an item in the host bucket, or remove the host bucket (if it is in the CRUSH map and you intend to remove the host), then recompile the map and set it. See Removing an OSD for details:

ceph osd crush remove {name}

2. Remove the OSD authentication key:

ceph auth del osd.{osd-num}

The "ceph" in ceph-{osd-num} is $cluster-$id; if your cluster name differs from "ceph", use your cluster name instead.

3. Remove the OSD:

ceph osd rm {osd-num}

For example:

ceph osd rm 1
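The whole removal sequence, on both the Luminous+ path and the pre-Luminous path, can be summarized as a dry-run script (hypothetical `run` echo helper, placeholder OSD number 1):

```shell
# Dry-run: run() prints each command rather than executing it.
run() { printf '+ %s\n' "$*"; }
OSD_NUM=1   # placeholder

# Luminous and later: purge does CRUSH removal, auth del, and osd rm in one go
run ceph osd purge "$OSD_NUM" --yes-i-really-mean-it

# Pre-Luminous equivalent, step by step:
run ceph osd crush remove "osd.$OSD_NUM"
run ceph auth del "osd.$OSD_NUM"
run ceph osd rm "$OSD_NUM"
```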