Elasticsearch data backup and migration with snapshots

1. Introduction to snapshots

Elasticsearch (ES) provides snapshot and restore functions. Snapshots of selected indices or of the entire cluster can be written to a remote repository (a shared file system, S3, HDFS, and so on). Snapshots are well suited to backups and can be restored relatively quickly. For migration, the main points are:

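Migration is fast, which suits scenarios with large data volumes;
It needs disk space on the source cluster, or object storage, and can be used to migrate data from another vendor's ES to Tencent Cloud ES, or from a self-hosted ES to Tencent Cloud ES;
Supported repository types include shared file systems, AWS S3, HDFS, Microsoft Azure storage and Google Cloud storage.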

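Special instructions

1> To try out both glusterfs and NFS, this article uses both file systems: the old ES cluster stores its snapshots on glusterfs, and the new ES cluster stores its snapshots on NFS.
2> A shared file system is used so that every node reads and writes the same repository, which keeps the data consistent when snapshots are backed up and restored.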

2. The original ES cluster environment

System version           JDK version   ES version   Number of nodes
CentOS 7.1.1503 (Core)   1.8.0_112     6.5.1        3

3. Create the shared file system glusterfs (snapshot repository storage for the original ES cluster)

3.1. GlusterFS deployment (192.168.18.11 and 192.168.18.22)

The reference link is as follows:
https://blog.csdn.net/weixin_44729138/article/details/105663849

3.2. Create a volume on the servers to store ES snapshots

mkdir -p /u01/isi/esback  # brick directory, on each GlusterFS server
gluster volume create volume-es replica 2 192.168.18.11:/u01/isi/esback 192.168.18.22:/u01/isi/esback  # press Enter, then answer y at the prompt

gluster volume info volume-es
gluster volume start  volume-es

3.3. Mount the volume on the clients (every node of the original ES cluster)

yum install -y glusterfs glusterfs-fuse  # install the glusterfs client
mkdir /u01/isi/snapshot  # create the mount point, which is where ES stores its snapshots
mount -t glusterfs 192.168.18.22:/volume-es /u01/isi/snapshot/  # mount the glusterfs volume

4. Elasticsearch cluster configuration (original ES cluster)

cat config/elasticsearch.yml | grep repo  # on every node
path.repo: ["/u01/isi/snapshot"]  # snapshot repository directory

Restart the cluster so that the path.repo setting takes effect.

5. Register the repository (original ES cluster)

chown -R isi:isi /u01/isi/snapshot  # adjust ownership of the repository directory

Register the repository

curl -XPUT 'http://192.168.18.15:9200/_snapshot/my_backup' -H 'Content-Type: application/json' -d '{
  "type": "fs",
  "settings": {
    "location": "/u01/isi/snapshot",
    "compress": true
  }
}'  # the repository can be registered from any node in the cluster

Or enter the following command in the Kibana console to register

PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/u01/isi/snapshot",
    "compress": true
  }
}

View the created repository

curl -XGET 'http://192.168.18.15:9200/_snapshot/my_backup?pretty'  

In addition to location, the settings max_snapshot_bytes_per_sec and max_restore_bytes_per_sec can be used to limit the speed of backup and restore:

PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/u01/isi/snapshot",
    "compress": true,
    "max_snapshot_bytes_per_sec": "50mb",
    "max_restore_bytes_per_sec": "50mb"
  }
}

6. Back up the indices (original ES cluster)

Once the repository is registered, you can start backing up. A repository can hold multiple snapshots, and a snapshot can contain all indices, some indices or a single index. Each snapshot is given a unique name:

curl -XPUT 'http://192.168.18.15:9200/_snapshot/my_backup/snapshot_1'

The command above backs up all open indices into a snapshot named snapshot_1 in the my_backup repository. The call returns immediately and the backup runs in the background. To make the call block until the snapshot finishes, add the wait_for_completion flag:

curl -XPUT 'http://192.168.18.15:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'  # a snapshot can be started from any node in the cluster
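
When the snapshot is started without wait_for_completion, its state can be polled instead. The following is a minimal sketch, not part of the original procedure: snapshot_state and wait_for_snapshot are hypothetical helper names, and the parsing assumes the "state" field that the GET _snapshot/<repository>/<snapshot> response carries (for example IN_PROGRESS or SUCCESS).

```shell
# Sketch only: snapshot_state and wait_for_snapshot are hypothetical helpers.

# Read a GET _snapshot/<repository>/<snapshot> response on stdin
# and print the value of its "state" field.
snapshot_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' | head -n 1 |
    sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll the snapshot URL every 5 seconds until the state is no longer
# IN_PROGRESS, then print the final state (empty if the request failed).
wait_for_snapshot() {
  url="$1"
  while :; do
    state=$(curl -s "$url" | snapshot_state)
    if [ "$state" != "IN_PROGRESS" ]; then
      break
    fi
    sleep 5
  done
  printf '%s\n' "$state"
}

# Usage:
#   wait_for_snapshot 'http://192.168.18.15:9200/_snapshot/my_backup/snapshot_1'
```

Anything other than SUCCESS in the output (for example PARTIAL or FAILED) is worth investigating before the snapshot files are packed and copied.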

View the snapshot (on the original ES cluster)

curl -XGET 'http://192.168.18.19:9200/_snapshot/my_backup/snapshot_1'  # a snapshot can be viewed from any node in the cluster

cd /u01/isi/snapshot
tar -czvf snapshot.tar.gz ./*  # pack the snapshot files into an archive

7. The new ES cluster environment

System version           JDK version   ES version   Number of nodes
CentOS 7.1.1503 (Core)   1.8.0_112     6.5.1        3

8. Create the shared file system NFS (snapshot repository storage for the new ES cluster)

8.1. NFS deployment

The reference link is as follows:
https://blog.csdn.net/weixin_44729138/article/details/106048003
Description:
192.168.18.9 is the NFS server, and the shared directory is /u01/isi/snapshot
192.168.18.29 and 192.168.18.17 are the NFS clients

8.2. Mount the shared directory on the clients

mkdir /u01/isi/snapshot  # create the mount point on the client
mount -t nfs 192.168.18.9:/u01/isi/snapshot /u01/isi/snapshot  # mount the shared directory on the client

9. Elasticsearch cluster configuration (new ES cluster)

cat config/elasticsearch.yml | grep repo  # on every node
path.repo: ["/u01/isi/snapshot"]  # snapshot repository directory

Restart the cluster so that the path.repo setting takes effect.

10. Migrate data (restore snapshots)

Copy the snapshot archive of the old ES cluster to the shared directory on the NFS server with a tool such as scp, rsync or fz, and then extract it:

tar -xf /u01/isi/snapshot/snapshot.tar.gz -C /u01/isi/snapshot/  # extract on the machine that runs the NFS server
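
The restore command that follows refers to the repository my_backup, so a repository of that name also has to be registered on the new cluster; the steps so far only register it on the old one. A minimal sketch, assuming the same repository name and path as in section 5 and that the directory is writable by the user running Elasticsearch (fs_repo_body and register_fs_repo are hypothetical helper names):

```shell
# Sketch only: fs_repo_body and register_fs_repo are hypothetical helpers.

# Print the JSON body for an "fs" repository stored at the given path.
fs_repo_body() {
  printf '{"type":"fs","settings":{"location":"%s","compress":true}}' "$1"
}

# Register repository NAME on the cluster reachable at HOST:PORT.
# ES 6.x rejects request bodies sent without a Content-Type header.
register_fs_repo() {
  host="$1"
  name="$2"
  location="$3"
  curl -s -XPUT "http://${host}/_snapshot/${name}" \
    -H 'Content-Type: application/json' \
    -d "$(fs_repo_body "$location")"
}

# Usage (once, against any node of the new cluster):
#   register_fs_repo 192.168.18.9:9200 my_backup /u01/isi/snapshot
```

After registration, a GET on _snapshot/my_backup/snapshot_1 against the new cluster should list the snapshot that was copied over.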

curl -XPOST 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true'  # restore the snapshot, from any node of the ES cluster

Note: when restoring across clusters, the restore fails if an open index with the same name already exists on the target cluster. Indices such as .kibana therefore have to be closed or deleted first:

curl -XDELETE 'http://192.168.18.17:9200/.kibana_1'  # delete the index
curl -XPOST 'http://192.168.18.17:9200/.monitoring-kibana-6-2020.12.02/_close?pretty'  # close the index
curl -XPOST 'http://192.168.18.17:9200/.monitoring-es-6-2020.12.02/_close?pretty'  # close the index
curl -XPOST 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true'  # run the restore again
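
A possible alternative to deleting or closing the conflicting indices is to leave them out of the restore. The sketch below is an assumption rather than the method used above: it relies on the restore API accepting multi-index syntax with "-" exclusions in the indices field, which should be verified on the ES version in use, and restore_body and restore_user_indices are hypothetical helper names.

```shell
# Sketch only: restore_body and restore_user_indices are hypothetical helpers.

# Print a restore body that selects every index except the Kibana and
# monitoring indices and does not restore the cluster-wide global state.
restore_body() {
  printf '{"indices":"%s","include_global_state":false}' '*,-.kibana*,-.monitoring*'
}

# Restore snapshot SNAPSHOT from repository REPO on the cluster at HOST:PORT.
restore_user_indices() {
  host="$1"
  repo="$2"
  snapshot="$3"
  curl -s -XPOST "http://${host}/_snapshot/${repo}/${snapshot}/_restore?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "$(restore_body)"
}

# Usage:
#   restore_user_indices 192.168.18.9:9200 my_backup snapshot_1
```

With this approach the Kibana and monitoring indices of the new cluster stay as they are, so any dashboards saved on the old cluster are not carried over.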

The progress of index recovery can be checked with the _recovery API (see "View recovery progress" in section 11).

11. Summary of related commands

Register the repository

curl -XPUT 'http://192.168.18.15:9200/_snapshot/my_backup' -H 'Content-Type: application/json' -d '{
  "type": "fs",
  "settings": {
    "location": "/u01/isi/snapshot",
    "compress": true
  }
}'

View the created repository

curl -XGET 'http://192.168.18.9:9200/_snapshot/my_backup?pretty' 

Back up all indices

curl -XPUT 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1' 
curl -XPUT 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'  # add wait_for_completion=true to wait until the backup finishes

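Back up some indices

curl -XPUT 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_part' -H 'Content-Type: application/json' -d '{
  "indices": "index_1,index_2"
}'
By default all indices are backed up; use this command to back up specific indices only. How long the backup takes depends on the amount of data.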

Delete a snapshot

curl -XDELETE 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1'

View the status of the created backup

curl -XGET 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1/_status'

Restore all indices

curl -XPOST 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1/_restore'  # restore all indices
curl -XPOST 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true'  # add wait_for_completion=true to wait for the result of the restore

Restore some indices

curl -XPOST 'http://192.168.18.9:9200/_snapshot/my_backup/snapshot_1/_restore' -H 'Content-Type: application/json' -d '{
  "indices": "index_1",
  "rename_pattern": "index_(.+)",
  "rename_replacement": "restored_index_$1"
}'
indices: restore only the index index_1.
rename_pattern: a regular expression that selects the indices to rename, here those whose names start with index_.
rename_replacement: the new name for each matching index, here restored_index_xxx. For example, index_1 is renamed to restored_index_1.
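
The effect of the rename can be previewed locally before running a restore. This is only an approximation under stated assumptions: preview_rename is a hypothetical helper, and sed -E stands in for the Java regular expressions that Elasticsearch applies, with \1 playing the role of $1.

```shell
# Sketch only: preview_rename is a hypothetical helper. sed -E approximates
# the rename_pattern / rename_replacement substitution for this pattern.
preview_rename() {
  printf '%s\n' "$1" | sed -E 's/index_(.+)/restored_index_\1/'
}

# Usage:
#   preview_rename index_1   # prints restored_index_1
#   preview_rename logs_1    # prints logs_1 (no match, name unchanged)
```

Restoring under a new name like this also avoids a clash with an open index of the original name on the target cluster.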

View recovery progress

curl -XGET 'http://192.168.18.9:9200/_recovery/'  # view the recovery progress of all indices
curl -XGET 'http://192.168.18.9:9200/_recovery/restored_index_1'  # view the recovery progress of the index restored_index_1
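
For a scripted check, the tabular _cat/recovery endpoint is easier to parse than the JSON above. A minimal sketch, assuming the index and stage columns of _cat/recovery and treating every stage other than done as still pending (pending_recoveries and wait_for_recovery are hypothetical helper names):

```shell
# Sketch only: pending_recoveries and wait_for_recovery are hypothetical helpers.

# Read "index stage" lines (the output of _cat/recovery?h=index,stage) on
# stdin and print how many shard recoveries have not reached the done stage.
pending_recoveries() {
  awk 'NF >= 2 && $2 != "done" { n++ } END { print n + 0 }'
}

# Poll the cluster at HOST:PORT every 5 seconds until no recovery is pending.
# An unreachable cluster yields empty output and therefore also ends the loop.
wait_for_recovery() {
  host="$1"
  while [ "$(curl -s "http://${host}/_cat/recovery?h=index,stage" | pending_recoveries)" -gt 0 ]; do
    sleep 5
  done
}

# Usage:
#   wait_for_recovery 192.168.18.9:9200
```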

Delete the restored index

curl -XDELETE 'http://192.168.18.9:9200/restored_index_1'  # to cancel a restore, simply delete the index that is being restored

Origin blog.csdn.net/weixin_44729138/article/details/110484922