The Most Detailed Ceph 14.2.5 Cluster Deployment and Configuration Guide on the Web (Part 2)

Ceph cluster deployment

Ceph version selection

Introduction to Ceph versions

The latest Ceph community release is version 14, while Ceph 12 is the most widely used stable version in production.
The first Ceph release, 0.1, dates back to January 2008. For years the version numbering scheme did not change, but after 0.94.1 (the first Hammer point release) shipped in April 2015, a new scheme was devised in order to avoid 0.99 (or would it be 1.00, or 0.100?):

x.0.z - development releases (for early testers and the brave)

x.1.z - release candidates (for test clusters and power users)

x.2.z - stable point releases (for end users)

x starts counting from 9, which stands for Infernalis (I is the ninth letter of the alphabet), so the first development release of the ninth release cycle is 9.0.0, followed by the subsequent development releases 9.0.1, 9.0.2, and so on.
| Release name | Version | Release date |
| ------ | ------ | ------ |
| Argonaut | 0.48 (LTS) | June 3, 2012 |
| Bobtail | 0.56 (LTS) | May 7, 2013 |
| CuttleFish | 0.61 | January 1, 2013 |
| Dumpling | 0.67 (LTS) | August 14, 2013 |
| Emperor | 0.72 | November 9, 2013 |
| Firefly | 0.80 (LTS) | May 2014 |
| Giant | 0.87 | October 2014 - April 2015 |
| Hammer | 0.94 (LTS) | April 2015 - November 2016 |
| Infernalis | 9.2.1 | November 2015 - June 2016 |
| Jewel | 10.2.9 | April 2016 |
| Kraken | 11.2.1 | October 2017 |
| Luminous | 12.2.12 | October 2017 |
| Mimic | 13.2.7 | May 2018 |
| Nautilus | 14.2.5 | February 2019 |

New Features in Luminous

  • BlueStore
    • The new ceph-osd back-end storage, BlueStore, has stabilized and is now the default for newly created OSDs.
      BlueStore manages physical HDDs or SSDs directly, without an intermediate file system such as XFS, to manage the data stored on each OSD, which provides greater performance and more features.
    • BlueStore supports integrity checking of all data and metadata stored by Ceph.
    • BlueStore has built-in support for compression using zlib, snappy, or LZ4. (Ceph also supports zstd for RGW compression, but zstd is not recommended for BlueStore for performance reasons.)
  • Overall cluster scalability has improved. Clusters with up to 10,000 OSDs have been tested successfully.
  • ceph-mgr
    • ceph-mgr is a new background process that must be part of any Ceph deployment. Although IO continues when ceph-mgr is stopped, metrics will not refresh and some metrics-related requests (for example, ceph df) may block. We recommend deploying several ceph-mgr instances for reliability.
    • The ceph-mgr daemon includes a REST-based management API. Note: the API is still experimental and somewhat limited, but it will become the basis for API-driven management in the future.
    • ceph-mgr also includes a Prometheus plugin (see the example after this list).
    • ceph-mgr now has a Zabbix plugin. Using zabbix_sender, it can send cluster failure events to a Zabbix server host, making it easy to monitor the status of a Ceph cluster and send notifications when failures occur.
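As a quick illustration of the plugin mechanism (not a required step of the deployment below; the commands assume a running cluster reachable with the admin keyring):

ceph mgr module ls                   # list enabled and available mgr modules
ceph mgr module enable prometheus    # expose cluster metrics for Prometheus (the module listens on port 9283 by default)
ceph mgr module enable zabbix        # enable the zabbix_sender-based Zabbix plugin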

Preparation Before Installation

  1. Installation requirements
  • At least three CentOS 7 virtual machines for deploying the Ceph cluster. Hardware configuration: 2 CPUs / 4 GB RAM, with at least three additional disks (5 GB each) attached to every machine

    cephnode01 192.168.25.224  
    cephnode02 192.168.25.227  
    cephnode03 192.168.25.228  
  • A yum repository server on the internal network, also with 2 CPUs / 4 GB RAM

    cephyumresource01 192.168.25.224
  2. Preparing the environment (run on all three Ceph machines)
(1) Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
(2) Disable SELinux:
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
(3) Disable NetworkManager:
systemctl disable NetworkManager && systemctl stop NetworkManager
(4) Add hostname-to-IP mappings:
vim /etc/hosts
192.168.25.224 cephnode01
192.168.25.227 cephnode02
192.168.25.228 cephnode03
(5) Set the hostname on each node:
hostnamectl set-hostname cephnode01
hostnamectl set-hostname cephnode02
hostnamectl set-hostname cephnode03
(6) Synchronize network time and set the time zone:
systemctl restart chronyd.service && systemctl enable chronyd.service
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
(7) Raise the file descriptor limits:
echo "ulimit -SHn 102400" >> /etc/rc.local
cat >> /etc/security/limits.conf << EOF
* soft nofile 65535
* hard nofile 65535
EOF
(8) Tune kernel parameters:
cat >> /etc/sysctl.conf << EOF
kernel.pid_max = 4194303
vm.swappiness = 0 
EOF
sysctl -p
(9) On cephnode01, configure passwordless SSH login to cephnode02 and cephnode03:
ssh-keygen                     # generate a key pair first if one does not already exist
ssh-copy-id root@cephnode02
ssh-copy-id root@cephnode03
(10) read_ahead: improve disk read performance by prefetching data into RAM
echo "8192" > /sys/block/sda/queue/read_ahead_kb
(11) I/O scheduler: use noop for SSDs and deadline for SATA/SAS disks
echo "deadline" >/sys/block/sd[x]/queue/scheduler    # SATA/SAS
echo "noop" >/sys/block/sd[x]/queue/scheduler        # SSD

Setting up the internal yum repository

1. Install httpd, createrepo, and epel-release

yum install httpd createrepo epel-release -y

2. Edit the yum repository file

[root@cephyumresource01 ~]# cat << EOF | tee /etc/yum.repos.d/ceph.repo 
[Ceph]
name=Ceph packages for \$basearch
baseurl=http://mirrors.163.com/ceph/rpm-nautilus/el7/\$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.163.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
EOF

3. Download the Ceph installation packages
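The directory tree under the httpd document root can be created up front; the paths below are the same ones used by the download commands in this step and the next:

mkdir -p /var/www/html/ceph/rpm-nautilus/el7/{x86_64,noarch,srpms}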

yum --downloadonly --downloaddir=/var/www/html/ceph/rpm-nautilus/el7/x86_64/ install ceph ceph-radosgw 

4. Download the Ceph dependency and repodata files

wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/ mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/ceph-14.2.4-0.el7.src.rpm 
wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/ mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/ceph-deploy-2.0.1-0.src.rpm
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-deploy-2.0.1-0.noarch.rpm
 wget  -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-grafana-dashboards-14.2.4-0.el7.noarch.rpm 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-mgr-dashboard-14.2.4-0.el7.noarch.rpm
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-mgr-diskprediction-cloud-14.2.4-0.el7.noarch.rpm
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-mgr-diskprediction-local-14.2.4-0.el7.noarch.rpm
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-mgr-rook-14.2.4-0.el7.noarch.rpm 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/  mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-mgr-ssh-14.2.4-0.el7.noarch.rpm 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/ceph-release-1-1.el7.noarch.rpm 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/   mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/ceph-release-1-1.el7.src.rpm 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/   mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/ceph-medic-1.0.4-16.g60cf7e9.el7.src.rpm
 wget  -P /var/www/html/ceph/rpm-nautilus/el7/noarch/  mirrors.163.com/ceph/rpm-nautilus/el7/noarch/repodata/repomd.xml 
 wget  -P /var/www/html/ceph/rpm-nautilus/el7/noarch/  mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/repodata/repomd.xml
 wget  -P /var/www/html/ceph/rpm-nautilus/el7/noarch/  mirrors.163.com/ceph/rpm-nautilus/el7/noarch/repodata/a4bf0ee38cd4e64fae2d2c493e5b5eeeab6cf758beb7af4eec0bc4046b595faf-filelists.sqlite
 wget  -P /var/www/html/ceph/rpm-nautilus/el7/noarch/repodata/  mirrors.163.com/ceph/rpm-nautilus/el7/noarch/repodata/a4bf0ee38cd4e64fae2d2c493e5b5eeeab6cf758beb7af4eec0bc4046b595faf-filelists.sqlite.bz2
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/repodata/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/repodata/183278bb826f5b8853656a306258643384a1547c497dd8b601ed6af73907bb22-other.sqlite.bz2 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/repodata/ mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/repodata/52bf459e39c76b2ea2cff2c5340ac1d7b5e17a105270f5f01b454d5a058adbd2-filelists.sqlite.bz2
 wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/repodata/ mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/repodata/4f3141aec1132a9187ff5d1b4a017685e2f83a761880884d451a288fcedb154e-primary.sqlite.bz2
 wget -P /var/www/html/ceph/rpm-nautilus/el7/srpms/repodata/  mirrors.163.com/ceph/rpm-nautilus/el7/SRPMS/repodata/0c554884aa5600b1311cd8f616aa40d036c1dfc0922e36bcce7fd84e297c5357-other.sqlite.bz2 
 wget -P /var/www/html/ceph/rpm-nautilus/el7/noarch/repodata/ mirrors.163.com/ceph/rpm-nautilus/el7/noarch/repodata/597468b64cddfc386937869f88c2930c8e5fda3dd54977c052bab068d7438fcb-primary.sqlite.bz2

5. Update the yum repository metadata

createrepo --update  /var/www/html/ceph/rpm-nautilus
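httpd was installed in step 1 but never started; before pointing the cluster nodes at this server, it is worth serving the tree and doing a quick sanity check (IP address and paths as used throughout this guide):

systemctl enable --now httpd
curl http://192.168.25.224/ceph/rpm-nautilus/el7/    # should return a directory listing containing x86_64, noarch and srpms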

Ceph cluster installation

1. Edit the internal yum repository file, synchronize it to the other nodes, and run yum makecache on each node in advance (see the sketch after this file)

# vim /etc/yum.repos.d/ceph.repo 
[Ceph]
name=Ceph packages for $basearch
baseurl=http://192.168.25.224/ceph/rpm-nautilus/el7/$basearch
gpgcheck=0
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://192.168.25.224/ceph/rpm-nautilus/el7/noarch
gpgcheck=0
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://192.168.25.224/ceph/rpm-nautilus/el7/srpms
gpgcheck=0
priority=1
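As mentioned above, the same repo file has to reach every node before installation; something along these lines works (host names as defined in /etc/hosts earlier):

for node in cephnode02 cephnode03; do
    scp /etc/yum.repos.d/ceph.repo root@${node}:/etc/yum.repos.d/
    ssh root@${node} "yum clean all && yum makecache"
done
yum clean all && yum makecache    # and refresh the cache on the local node as well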

2. Install ceph-deploy (confirm that the ceph-deploy version is 2.0.1)

# yum list|grep ceph-deploy
# yum install -y ceph-deploy

3. Create a my-cluster directory; all subsequent commands are run from this directory (the location and name are arbitrary)

# mkdir /my-cluster
# cd /my-cluster

4. Create the Ceph cluster

# ceph-deploy new cephnode01 cephnode02 cephnode03 

If the command fails with: ImportError: No module named pkg_resources

This problem is usually caused by upgrading pip under Python 2.7; the fix is to reinstall pip in the Python 2.7 environment, as follows:

1. Install distribute

wget https://pypi.python.org/packages/source/d/distribute/distribute-0.7.3.zip --no-check-certificate
unzip distribute-0.7.3.zip
cd distribute-0.7.3
python setup.py install

2. Install setuptools
Download the latest version from https://pypi.python.org/pypi/setuptools

wget --no-check-certificate https://pypi.python.org/packages/source/s/setuptools/setuptools-12.0.3.tar.gz#md5=f07e4b0f4c1c9368fcd980d888b29a65
tar -zxvf setuptools-12.0.3.tar.gz
cd setuptools-12.0.3
python setup.py install

3. Install pip
easy_install pip

4. If installing pip fails with ImportError: No module named extern

Download the latest extern package from https://pypi.python.org/pypi/extern, install it, and try again.

After resolving the problem, re-run: ceph-deploy new cephnode01 cephnode02 cephnode03
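ceph-deploy new writes a ceph.conf and a ceph.mon.keyring into the working directory. Optionally, the public network of this lab can be declared in that ceph.conf before the monitors are created; a sketch matching the addressing used above:

cat >> /my-cluster/ceph.conf << EOF
public network = 192.168.25.0/24
EOF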

5. Install the Ceph packages (run on every node)

# yum -y install epel-release &&  yum install -y ceph

6. Deploy the initial monitors and gather the keys used by the cluster

# ceph-deploy mon create-initial

7. Install the Ceph CLI credentials (push ceph.conf and the admin keyring to the nodes) so management commands can be run conveniently

# ceph-deploy admin cephnode01 cephnode02 cephnode03

8. Configure mgr daemons for managing the cluster

# ceph-deploy mgr create cephnode01 cephnode02 cephnode03

9. Deploy RGW

# yum install -y ceph-radosgw
# ceph-deploy rgw create cephnode01
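By default the RGW frontend listens on port 7480; a quick check against the gateway node:

curl http://cephnode01:7480    # should return an XML ListAllMyBucketsResult document for the anonymous user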

10. Deploy MDS (for CephFS)

# ceph-deploy mds create cephnode01 cephnode02 cephnode03 
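The MDS daemons stay in standby until a file system exists. A minimal sketch for creating one (the pool names and PG counts are examples, not values prescribed by this guide):

ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 32
ceph fs new cephfs cephfs_metadata cephfs_data
ceph fs status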

11. Add OSDs

ceph-deploy osd create --data /dev/sdb cephnode01
ceph-deploy osd create --data /dev/sdc cephnode01
ceph-deploy osd create --data /dev/sdd cephnode01
ceph-deploy osd create --data /dev/sdb cephnode02
ceph-deploy osd create --data /dev/sdc cephnode02
ceph-deploy osd create --data /dev/sdd cephnode02
ceph-deploy osd create --data /dev/sdb cephnode03
ceph-deploy osd create --data /dev/sdc cephnode03
ceph-deploy osd create --data /dev/sdd cephnode03
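Once all nine OSDs are created, the cluster state can be checked from any node that holds the admin keyring:

ceph -s           # overall health: 3 mons, 3 mgrs and 9 OSDs should be up and in
ceph osd tree     # shows the OSDs grouped under cephnode01, cephnode02 and cephnode03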

ceph.conf

1. The configuration file uses INI syntax, with # and ; marking comments. When the Ceph cluster starts, it loads all conf configuration files in order. The configuration is divided into several major sections:

global: global configuration.
osd: OSD-specific configuration; osd.N can be used for the configuration of one particular OSD, where N is the OSD id, e.g. 0, 1, 2.
mon: monitor-specific configuration; mon.A can likewise be used to configure one particular monitor node, where A is that node's name, e.g. ceph-monitor-2 or ceph-monitor-1. The node names can be obtained with the command ceph mon dump.
client: client-specific configuration.

2. The configuration file can be loaded from several locations; if settings conflict, the configuration loaded last takes effect. The load order is:

the $CEPH_CONF environment variable
the location specified with -c
/etc/ceph/ceph.conf
~/.ceph/ceph.conf
./ceph.conf
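Whichever file a daemon ends up reading, its effective values can be inspected at runtime, and an edited ceph.conf can be redistributed with ceph-deploy; a small sketch (the daemon name and option are only examples, and ceph daemon must be run on the node hosting that daemon):

ceph daemon osd.0 config show | grep osd_max_backfills
ceph-deploy --overwrite-conf config push cephnode01 cephnode02 cephnode03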

3. The configuration file can use metavariables, which are expanded to concrete values, for example:

$cluster: the name of the current cluster.
$type: the type of the current service/daemon.
$id: the daemon's identifier.
$host: the hostname the daemon runs on.
$name: expands to $type.$id.

4. Detailed ceph.conf parameters

[global]                                                 # global settings
fsid = xxxxxxxxxxxxxxx                                   # cluster ID
mon host = 10.0.1.1,10.0.1.2,10.0.1.3                    # monitor IP addresses
auth cluster required = cephx                            # cluster authentication
auth service required = cephx                            # service authentication
auth client required = cephx                             # client authentication
osd pool default size = 3                                # default number of replicas per pool (default 3)
osd pool default min size = 1                            # a PG in degraded state can still serve IO; min_size is the minimum number of replicas a PG needs to accept IO
public network = 10.0.1.0/24                             # public network (the monitor IP range)
cluster network = 10.0.2.0/24                            # cluster (replication) network
max open files = 131072                                  # default 0; if set, Ceph sets the system's max open fds
mon initial members = node1, node2, node3                # initial monitors (as defined when the monitors were created)
##############################################################
[mon]
mon data = /var/lib/ceph/mon/ceph-$id
mon clock drift allowed = 1                              # default 0.05; clock drift allowed between monitors
mon osd min down reporters = 13                          # default 1; minimum number of OSDs that must report an OSD as down to the monitor
mon osd down out interval = 600                          # default 300; number of seconds Ceph waits before marking a down OSD as out
##############################################################
[osd]
osd data = /var/lib/ceph/osd/ceph-$id
osd mkfs type = xfs                                      # filesystem type used when formatting an OSD
osd max write size = 512                                 # default 90; maximum size (MB) an OSD can write at once
osd client message size cap = 2147483648                 # default 100; maximum amount of client data allowed in memory (bytes)
osd deep scrub stride = 131072                           # default 524288; number of bytes read at a time during deep scrub
osd op threads = 16                                      # default 2; number of concurrent filesystem operations
osd disk threads = 4                                     # default 1; threads for disk-intensive OSD operations such as recovery and scrubbing
osd map cache size = 1024                                # default 500; cache of OSD maps to keep (MB)
osd map cache bl size = 128                              # default 50; OSD map cache kept in memory by the OSD process (MB)
osd mount options xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"   # default rw,noatime,inode64; xfs mount options for Ceph OSDs
osd recovery op priority = 2                             # default 10; priority of recovery operations, 1-63; higher values consume more resources
osd recovery max active = 10                             # default 15; number of recovery requests active at the same time
osd max backfills = 4                                    # default 10; maximum number of backfills allowed per OSD
osd min pg log entries = 30000                           # default 3000; minimum number of PG log entries kept when trimming the PG log
osd max pg log entries = 100000                          # default 10000; maximum number of PG log entries kept when trimming the PG log
osd mon heartbeat interval = 40                          # default 30; interval (seconds) at which an OSD pings a monitor
ms dispatch throttle bytes = 1048576000                  # default 104857600; maximum size (bytes) of messages waiting to be dispatched
objecter inflight ops = 819200                           # default 1024; client-side flow control: maximum number of unsent in-flight IO requests; exceeding the threshold blocks application IO; 0 means unlimited
osd op log threshold = 50                                # default 5; how many operations to show in one log entry
osd crush chooseleaf type = 0                            # default 1; bucket type used when CRUSH rules do chooseleaf
##############################################################
[client]
rbd cache = true                                         # default true; enable RBD caching
rbd cache size = 335544320                               # default 33554432; RBD cache size (bytes)
rbd cache max dirty = 134217728                          # default 25165824; maximum dirty bytes allowed when the cache is write-back; 0 means write-through is used
rbd cache max dirty age = 30                             # default 1; how long (seconds) dirty data may sit in the cache before being flushed to disk
rbd cache writethrough until flush = false               # default true; this option exists for compatibility with virtio drivers older than linux-2.6.32, which never send flush requests, so data would otherwise never be written back.
              # When set, librbd performs IO in writethrough mode until the first flush request is received, then switches to writeback.
rbd cache max dirty object = 2                           # default 0; maximum number of cached objects; 0 means the value is derived from rbd cache size. librbd logically splits an image into 4 MB chunks by default,
      # each chunk mapped to one Object; librbd manages the cache per Object, so increasing this value can improve performance.
rbd cache target dirty = 235544320                       # default 16777216; amount of dirty data at which writeback starts; must not exceed rbd_cache_max_dirty
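Most of these settings can also be changed at runtime instead of editing the file and restarting daemons; a sketch using the standard commands (the option is only an example):

ceph config set osd osd_max_backfills 4                  # Nautilus: store the value in the cluster's central config database
ceph tell osd.* injectargs '--osd_max_backfills=4'       # or inject it into the running daemons directly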
