Ceph OSD disk deletion operation
Extension: deleting an OSD disk (here we take the deletion of the osd.0 disk on node1 as an example)
1. Check osd disk status
[root@node1 ceph]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.00298 root default
-3       0.00099     host node1
 0   hdd 0.00099         osd.0      up  1.00000 1.00000
-5       0.00099     host node2
 1   hdd 0.00099         osd.1      up  1.00000 1.00000
-7       0.00099     host node3
 2   hdd 0.00099         osd.2      up  1.00000 1.00000
2. First mark it as out
[root@node1 ceph]# ceph osd out osd.0
marked out osd.0.
[root@node1 ceph]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.00298 root default
-3       0.00099     host node1
 0   hdd 0.00099         osd.0      up        0 1.00000     you can see the reweight is now 0, but the status is still up
-5       0.00099     host node2
 1   hdd 0.00099         osd.1      up  1.00000 1.00000
-7       0.00099     host node3
 2   hdd 0.00099         osd.2      up  1.00000 1.00000
3. Then delete it with rm, but first stop the ceph-osd service on the node that hosts ==osd.0==; otherwise rm will not work
[root@node1 ceph]# systemctl stop ceph-osd@0
[root@node1 ceph]# ceph osd rm osd.0
removed osd.0
[root@node1 ceph]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.00298 root default
-3       0.00099     host node1
 0   hdd 0.00099         osd.0     DNE        0             the status is no longer up
-5       0.00099     host node2
 1   hdd 0.00099         osd.1      up  1.00000 1.00000
-7       0.00099     host node3
 2   hdd 0.00099         osd.2      up  1.00000 1.00000
4. Check the cluster status
[root@node1 ceph]# ceph -s
  cluster:
    id:     6788206c-c4ea-4465-b5d7-ef7ca3f74552
    health: HEALTH_WARN
            1 osds exist in the crush map but not in the osdmap     there is a warning: it has not yet been removed from the crush map
  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node2, node3
    osd: 2 osds: 2 up, 2 in                                         only two OSDs remain, indicating that osd.0 was deleted successfully
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   28 MiB used, 2.0 GiB / 2.0 GiB avail                   3G has changed to 2G, indicating successful deletion
    pgs:
5. Remove it from the crush map and delete its auth entry
[root@node1 ceph]# ceph osd crush remove osd.0
removed item id 0 name 'osd.0' from crush map
[root@node1 ceph]# ceph auth del osd.0
updated
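Side note (not part of the original steps): on Luminous and later releases, a single purge command is documented to combine the rm, crush remove, and auth del steps; a hedged sketch, assuming the OSD has already been stopped:
[root@node1 ceph]# ceph osd purge 0 --yes-i-really-mean-it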
6. You also need to unmount it on the node that hosts ==osd.0==
[root@node1 ceph]# df -h |grep osd
tmpfs           488M   48K  488M   1% /var/lib/ceph/osd/ceph-0
[root@node1 ceph]# umount /var/lib/ceph/osd/ceph-0
7. On the node that hosts osd.0, delete the logical volume created for the OSD disk
[root@node1 ceph]# pvs
  PV         VG                                         Fmt  Attr PSize    PFree
  /dev/sdb   ceph-56e0d335-80ba-40d8-b076-fc63a766dcac  lvm2 a--  1020.00m    0
[root@node1 ceph]# vgs
  VG                                         #PV #LV #SN Attr   VSize    VFree
  ceph-56e0d335-80ba-40d8-b076-fc63a766dcac    1   1   0 wz--n- 1020.00m    0
[root@node1 ceph]# lvremove ceph-56e0d335-80ba-40d8-b076-fc63a766dcac
Do you really want to remove active logical volume ceph-56e0d335-80ba-40d8-b076-fc63a766dcac/osd-block-ef26149d-5d7d-4cc7-8251-684fbddc2da5? [y/n]: y
  Logical volume "osd-block-ef26149d-5d7d-4cc7-8251-684fbddc2da5" successfully removed
At this point, it has been completely deleted
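Optional extra cleanup (not in the original steps): if the disk is not going straight back into Ceph, the now-empty volume group and the PV label can also be removed; a hedged sketch using the VG name from the output above (the disk zap in step 8 would wipe these anyway):
[root@node1 ceph]# vgremove ceph-56e0d335-80ba-40d8-b076-fc63a766dcac
[root@node1 ceph]# pvremove /dev/sdb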
8. If you want to add it back, run the following commands on the deployment node again.
[root@node1 ceph]# ceph-deploy disk zap node1 /dev/sdb
[root@node1 ceph]# ceph-deploy osd create --data /dev/sdb node1
SAN
Classification of SANs
Two types of SAN:
- FC-SAN: the early form of SAN, in which data between servers and switches travels over optical fiber and the servers send SCSI commands to the storage devices; it cannot use an ordinary LAN or the IP protocol.
- IP-SAN: a SAN encapsulated in the IP protocol, which can run entirely over an ordinary network, hence the name IP-SAN. The most typical example is iSCSI.
FC-SAN advantages and disadvantages: fast (2Gb, 8Gb, 16Gb), high cost, and certain limitations on transmission distance.
IP-SAN advantages and disadvantages: slower (although 10-Gigabit Ethernet is now available), low cost, and unlimited transmission distance.
IP-SAN iSCSI implementation
Experiment: implementing IP-SAN with iSCSI on Linux
Experiment preparation: two virtual machines (CentOS 7) on the same network segment (such as vmnet8). There is no need to simulate a switch, because virtual machines on the same network segment are effectively connected to the same switch.
- Static IPs (the two IPs can reach each other; gateway and DNS are not required)
- Both machines have a hostname configured and resolve each other's hostname (e.g. via /etc/hosts)
- Turn off the firewall and SELinux
- Time synchronization
- Configure yum (the epel source needs to be added)
- Simulate storage on the export side (the simulated storage can take various forms, such as a hard disk: /dev/sdb, a partition: /dev/sdb1, a soft RAID: /dev/md0, a logical volume: /dev/vg/lv01, large files created with dd, etc.)
For convenience, I will use large files created with dd to simulate the storage on the export side:
export# mkdir /data/
export# dd if=/dev/zero of=/data/storage1 bs=1M count=500
export# dd if=/dev/zero of=/data/storage2 bs=1M count=1000
export# dd if=/dev/zero of=/data/storage3 bs=1M count=1500
export# dd if=/dev/zero of=/data/storage4 bs=1M count=2000
This simulates 4 storage files for export (with different sizes, for the discussion that follows).
Experimental steps:
- On the export side: install the software, configure the exported storage, and start the service.
- On the import side: install the software, import the storage, and start the service.
Experiment procedure:
Step 1: Install the scsi-target-utils package on the export side
export# yum install epel-release -y           (install the epel source first if it is not installed yet)
export# yum install scsi-target-utils -y
Step 2: Configure the export of storage on the export side
export# cat /etc/tgt/targets.conf |grep -v "#"
default-driver iscsi

<target iscsi:data1>
    backing-store /data/storage1
</target>
<target iscsi:data2>
    backing-store /data/storage2
</target>
<target iscsi:data3>
    backing-store /data/storage3
</target>
<target iscsi:data4>
    backing-store /data/storage4
</target>
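Side note: tgt accepts simple target names like iscsi:data1, but the conventional iSCSI naming format is an IQN. A hedged sketch of the same stanza with an IQN-style name, where the date and domain are placeholders you would replace with your own:
<target iqn.2024-01.com.example:data1>
    backing-store /data/storage1
</target>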
Step 3: Start the service on the export side and verify
export# systemctl start tgtd
export# systemctl enable tgtd
Verify that the port and the shared resources are OK:
export# lsof -i:3260
export# tgt-admin --show
Step 4: Install the iscsi-initiator-utils package on the import side
import# yum install iscsi-initiator-utils
Step 5: Import storage on the import side
Before logging in, you must first connect and discover resources (discovery)
import# iscsiadm -m discovery -t sendtargets -p 10.1.1.11
10.1.1.11:3260,1 iscsi:data1
10.1.1.11:3260,1 iscsi:data2
10.1.1.11:3260,1 iscsi:data3
10.1.1.11:3260,1 iscsi:data4
After successfully discovering the resource, you can log in to the resource.
Log in to only one of the storages:
import# iscsiadm -m node -T iscsi:data1 -p 10.1.1.11 -l
Log in directly to all discovered storage:
import# iscsiadm -m node -l
After logging in successfully, view the new disks directly with fdisk -l
import# fdisk -l |grep sd[b-z]
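As an additional check (not in the original), lsblk can also list the new disks; with the --scsi option it shows the transport column, which should read iscsi for the imported devices:
import# lsblk --scsi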
Step 6: Start the service on the import side
Start the services and enable them at boot:
import# systemctl start iscsi
import# systemctl enable iscsi
import# systemctl start iscsid
import# systemctl enable iscsid
Supplement: logging out and removing the connection information
Log out of a specific target (change -l to -u):
import# iscsiadm -m node -T iscsi:data1 -p 10.1.1.11 -u
Log out of all targets:
import# iscsiadm -m node -u
If you also want to delete the discovery information, use --op delete:
import# iscsiadm -m node -T iscsi:data1 -p 10.1.1.11 --op delete
Delete the information of all discovered targets:
import# iscsiadm -m node --op delete
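Before and after logging out, it can be useful to check which sessions are currently active (this check is an addition, not part of the original steps):
import# iscsiadm -m session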
Question 1: What will you find if you log in again a few times?
import# iscsiadm -m node -u &> /dev/null
import# iscsiadm -m node -l &> /dev/null
import# fdisk -l |grep sd[b-z]
Answer: You will find that the device names (sdb, sdc, ...) can get mixed up between logins. Solutions include udev rules and storage multipathing.
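As a rough illustration of the udev approach (a minimal sketch, not from the original; the wwid and the symlink name below are placeholders you would take from your own scsi_id output), a rule file could create a stable symlink for one of the disks:
import# vim /etc/udev/rules.d/99-iscsi-names.rules
KERNEL=="sd*", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="360000000000000000e00000000010001", SYMLINK+="iscsi-data1"
import# udevadm control --reload
import# udevadm trigger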
Question 2: If you add another import server and the two import servers import the same storage, then format and mount it, can they read and write it at the same time?
Answer: No. An ordinary local filesystem (xfs, ext4, etc.) does not coordinate caches between hosts, so writing from both sides would corrupt the data; simultaneous access would require a cluster filesystem.
Further exploration: you can configure authentication for the exported storage, so that the import side must supply the correct user name and password to log in
There are only two differences:
- When configuring the export side, add the user name and password for authentication:
<target iscsi:data1>
    backing-store /data/storage1
    incominguser daniel daniel123        authentication; this user can be anything you like and has nothing to do with system users
</target>
- When configuring the import side, add the following step, using the user name and password that match the export side.
If the export side has authentication configured, the import side must be configured with the correct user name and password. CHAP stands for Challenge-Handshake Authentication Protocol.
import# vim /etc/iscsi/iscsid.conf
57 node.session.auth.authmethod = CHAP
61 node.session.auth.username = daniel
62 node.session.auth.password = daniel123
71 discovery.sendtargets.auth.authmethod = CHAP
75 discovery.sendtargets.auth.username = daniel
76 discovery.sendtargets.auth.password = daniel123
After completing this step, you can discover the resources and log in
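One assumption worth stating explicitly: after adding incominguser to targets.conf, the export side has to re-apply its configuration before the import side can authenticate. A hedged sketch; tgt-admin --update re-reads targets.conf (restarting tgtd also works but drops existing sessions):
export# tgt-admin --update ALL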
Storage multipath
Storage multipath (device-mapper-multipath): equivalent to binding two storage paths together, for HA or LB.
Functions:
- HA across two storage paths
- LB across two storage paths
- You can customize the name of the bound device, which also fixes the iscsi device name.
Experiment preparation
- Building on the previous experiment, add a network card to both the export side and the import side, connected to a new network (note: the new network segment must use static IPs). My second network segment here is 10.2.2.0/24.
  vmnet8 10.1.1.0/24
  vmnet1 10.2.2.0/24
- Then, on the storage import side, log out of the four storages and delete the related information.
import# iscsiadm -m node -u
import# iscsiadm -m node --op delete
Experiment procedure
Step 1: On the storage import side, discover the export side's storage using both of the export side's IPs, then log in
import# iscsiadm -m discovery -t sendtargets -p 10.1.1.11
10.1.1.11:3260,1 iscsi:data1
10.1.1.11:3260,1 iscsi:data2
10.1.1.11:3260,1 iscsi:data3
10.1.1.11:3260,1 iscsi:data4
import# iscsiadm -m discovery -t sendtargets -p 10.2.2.11
10.2.2.11:3260,1 iscsi:data1
10.2.2.11:3260,1 iscsi:data2
10.2.2.11:3260,1 iscsi:data3
10.2.2.11:3260,1 iscsi:data4
Log in to all discovered targets:
import# iscsiadm -m node -l
Use fdisk -l |grep sd[b-z] to view the disks; you will see 8 of them (but there are really only 4 storages, each reached over two network paths).
Step 2: Install the device-mapper-multipath package on the storage import side
import# yum install device-mapper\*
Step 3: Use multipath to bind the 8 devices above into 4 (the two paths to the same storage are bound into one device)
Run this command first to generate the configuration file /etc/multipath.conf
import# mpathconf --enable
Configure the /etc/multipath.conf configuration file
import# cat /etc/multipath.conf |grep -v ^# |grep -v ^$
defaults {                         global configuration parameters
        user_friendly_names yes    use friendly names (the default name is the wwid, which is long and hard to recognize; a friendly name can be customized)
        find_multipaths yes
}
blacklist {                        devices in the blacklist will not be bound as multipath devices
}
import# vim /etc/multipath.conf
blacklist {
        devnode "^sda"             except for devices starting with sda, everything else (the 8 discovered devices) is multipathed
}
Start the service:
import# systemctl start multipathd.service
import# systemctl enable multipathd.service
Step 4: Check the current binding status
Use the multipath -ll command to see the four newly bound devices (mpatha, mpathb, mpathc, mpathd); each of them is a dual-path device.
The wwid of a device can be viewed with:
/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/sda
import# multipath -ll
mpathd (360000000000000000e00000000040001) dm-3 IET     ,VIRTUAL-DISK        the long numeric string is the wwid
size=2.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active        primary path
| `- 8:0:0:1 sdf 8:80  active ready running
`-+- policy='service-time 0' prio=1 status=enabled       standby path (that is, the default is active/standby HA mode)
  `- 9:0:0:1 sdh 8:112 active ready running
mpathc (360000000000000000e00000000030001) dm-2 IET     ,VIRTUAL-DISK
size=1.5G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 6:0:0:1 sde 8:64  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:0:1 sdg 8:96  active ready running
mpathb (360000000000000000e00000000020001) dm-1 IET     ,VIRTUAL-DISK
size=1000M features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 4:0:0:1 sdc 8:32  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 5:0:0:1 sdd 8:48  active ready running
mpatha (360000000000000000e00000000010001) dm-0 IET     ,VIRTUAL-DISK
size=500M features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 2:0:0:1 sda 8:0   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 3:0:0:1 sdb 8:16  active ready running
Step 5: Next, customize the binding of these 8 storages: name the bound devices data1 through data4, with data1 and data2 in HA (failover) mode and data3 and data4 in LB (load-balancing) mode
import# cat /etc/multipath.conf |grep -v ^# |grep -v ^$
defaults {
        user_friendly_names yes
        find_multipaths yes
}
multipaths {
        multipath {
                wwid  360000000000000000e00000000010001        the wwid value
                alias data1                                     the custom bound name
                path_grouping_policy  failover                  HA mode
                failback              immediate                 when the primary path comes back up, fail back to it immediately
        }
        multipath {
                wwid  360000000000000000e00000000020001
                alias data2
                path_grouping_policy  failover
                failback              immediate
        }
        multipath {
                wwid  360000000000000000e00000000030001
                alias data3
                path_grouping_policy  multibus                  LB mode
                path_selector         "round-robin 0"           the LB algorithm is rr (round robin)
        }
        multipath {
                wwid  360000000000000000e00000000040001
                alias data4
                path_grouping_policy  multibus
                path_selector         "round-robin 0"
        }
}
blacklist {
}
Step 6: Restart the service to make the custom configuration take effect
import# systemctl restart multipathd.service
Check and verify that the devices are now bound under the names data1, data2, data3 and data4:
import# multipath -ll
import# ls /dev/mapper/data*
/dev/mapper/data1  /dev/mapper/data2  /dev/mapper/data3  /dev/mapper/data4
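To confirm the two policies took effect, multipath -ll can also be run against a single map name (an extra check, not part of the original): data1 should show two path groups with only one active (failover), while data3 should show a single group containing both paths (multibus):
import# multipath -ll data1
import# multipath -ll data3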
Step 7: Test (the test process is omitted, see the teaching video)
Pick one storage from the failover group and one from the multibus group, format them (you can format directly, or partition first and then format), and mount them for testing.
If /dev/mapper/data4 is split into two partitions, the corresponding names are /dev/mapper/data4p1 and /dev/mapper/data4p2 (if you cannot see them after partitioning, refresh with the partprobe command).
On CentOS 7 you need one extra option (_netdev) in /etc/fstab for the automatic mount to succeed; it is written like this:
/dev/mapper/data4p1   /mnt   xfs   defaults,_netdev   0 0
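Since the test itself is omitted above, here is a minimal sketch of what it might look like, assuming /dev/mapper/data1 (failover mode) is formatted directly without partitioning and mounted on a hypothetical mount point /mnt/data1:
import# mkfs.xfs /dev/mapper/data1
import# mkdir -p /mnt/data1
import# mount /dev/mapper/data1 /mnt/data1
import# df -h /mnt/data1
import# cp /etc/hosts /mnt/data1/ && ls /mnt/data1
For the failover part of the test, you could then take down one of the export side's network interfaces and check that the mount keeps working while multipath -ll shows the corresponding path as failed.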