Non-stop migration of an Elasticsearch cluster

One, Background

Migrate the ES cluster without downtime; the migration process must not affect business use. The cluster version in use is 6.3.0.

Two, The plan

1. Business accesses the cluster through a domain name;

2. Build a new cluster on the new machines;

3. Snapshot the original cluster, so that any missing data can be restored from the snapshot;

4. Merge the old and new clusters, and force the data on the old nodes to migrate to the new nodes by reallocating and rebalancing shards;

5. Take the original old cluster offline.

Three, Implementation

1. Building the cluster on the new machines

1) Prepare the machines (as root), following the official documentation:

vim /etc/security/limits.conf
# lift the open-file and memory-lock limits
* soft memlock unlimited
* hard memlock unlimited
* - nofile 65536
* - core unlimited
# to take effect: log out and log back in

vim /etc/sysctl.conf
# add
vm.max_map_count = 262144
vm.swappiness = 1
# to take effect: sysctl -p

2) Node configuration, following the official documentation:

The order of precedence for cluster settings is:

1. transient cluster settings
2. persistent cluster settings
3. settings in the elasticsearch.yml configuration file
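
Below is a minimal elasticsearch.yml sketch for a data node in the new cluster, assuming the standard 6.x settings; the cluster name, node name, paths and host list (XXX) are placeholders, not values from this post:

# elasticsearch.yml -- minimal sketch for a 6.3 data node (placeholder values)
cluster.name: XXX                      # must be identical on every node that should join
node.name: new-data-node-01
node.master: false                     # dedicated data node
node.data: true
path.data: /data/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
bootstrap.memory_lock: true            # relies on the memlock limits configured above
discovery.zen.ping.unicast.hosts: ["XXX", "XXX"]    # master-eligible nodes
discovery.zen.minimum_master_nodes: 2               # (master-eligible nodes / 2) + 1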

Notes:

1) The ES heap must stay below 32G; 26G is a good choice (see the sketch after these notes).

(See the official documentation, user discussions, Reference 1 and Reference 2.)

2) The more shards, the lower the QPS (more details); the officially recommended shard size is 20-40GB (details), and each GB of heap in the cluster should correspond to about 20-25 shards (so a 26G heap can hold roughly 520 shards) (details);

The number of shards a node can hold is proportional to its available heap, but Elasticsearch does not enforce a fixed limit. A good rule of thumb is to keep the number of shards at 20 or fewer per GB of configured heap. A node with 30GB of heap can therefore hold at most 600 shards, and within that limit, the fewer shards the better (more details). See also how to set the shard count per node and how to set the cluster's shard allocation strategy.

3) Dedicated master and client (coordinating) nodes need relatively little memory and CPU, so their corresponding settings can be smaller.
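
As a sketch of notes 1) and 2): the heap is set in jvm.options, and a per-node shard cap can be applied as a dynamic cluster setting. The 26G value and the 520-shard figure come from the notes above; the setting names are standard, but treat the exact numbers as illustrative:

# jvm.options -- keep Xms and Xmx equal and below 32G (26G per note 1)
-Xms26g
-Xmx26g

# optional hard cap on shards per node (illustrative value derived from note 2)
curl -XPUT http://XXX/_cluster/settings -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.total_shards_per_node": 520
  }
}'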

2. Cluster snapshot method: implemented with a self-built Hadoop cluster (the HDFS repository plugin needs to be installed on ES).

curl -XPUT http://XXX/_snapshot/hdfs_repo -H 'Content-Type: application/json' -d '
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://XXX:8800/",
    "path": "hdfs_repo",
    "compress": true
  }
}'
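
Once the repository is registered, taking a snapshot and restoring from it (step 3 of the plan) can look like the sketch below; the snapshot name snapshot_1 and the index name XXX are placeholders:

# take a snapshot of all indices into the hdfs_repo repository
curl -XPUT "http://XXX/_snapshot/hdfs_repo/snapshot_1"

# check snapshot progress
curl "http://XXX/_snapshot/hdfs_repo/snapshot_1/_status?pretty"

# restore a specific index from the snapshot if data turns out to be missing
curl -XPOST http://XXX/_snapshot/hdfs_repo/snapshot_1/_restore -H 'Content-Type: application/json' -d '
{
  "indices": "XXX"
}'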

3. Merge the old and new clusters: as long as the master-eligible nodes of the two clusters can reach each other (both sides using the same cluster.name, with each other's masters listed in discovery.zen.ping.unicast.hosts), the two clusters can be combined into a single cluster.
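
A quick sanity check after the merge is to confirm that nodes from both the old and the new machines show up in a single cluster; an illustrative check (XXX is any node in the merged cluster):

# all nodes from both sets of machines should appear in one list
curl "http://XXX/_cat/nodes?v"

# overall health; RED indices are handled as described in the note below
curl "http://XXX/_cluster/health?pretty"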

Note: indices with the same name will be overwritten. If some of the cluster's indices become RED, confirm that this has no impact and then reallocate the shards with reroute:

POST /_cluster/reroute?retry_failed=true&pretty
{
}

4. Cluster data migration: configure shard allocation filtering so that the data migrates away from the old machines.

curl -XPUT http://XXX/_cluster/settings -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "XXX"
  }
}'
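
Migration progress can be watched with the _cat APIs; an illustrative check (the excluded old nodes should gradually drain to zero shards):

# shard count and disk usage per node
curl "http://XXX/_cat/allocation?v"

# shard recoveries currently in flight
curl "http://XXX/_cat/recovery?active_only=true&v"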

Four, Other

1. Hot-cold data separation

1. Prevent all indices' shards from being allocated to the SSD (hot) nodes:
curl -XPUT "http://XXX/*/_settings?master_timeout=120s" -H 'Content-Type: application/json' -d '
{
  "index.routing.allocation.exclude.box_type": "hot"
}'

2. Allow the test index to be allocated to the SSD (hot) nodes:
PUT test/_settings
{
  "index.routing.allocation.include.box_type": "hot"
}

3. Enable the attribute-awareness policy (optional):
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.awareness.attributes": "box_type"
  }
}
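
For the box_type attribute to exist at all, each node has to declare it in elasticsearch.yml; a minimal sketch, assuming the hot/cold values used above and an SSD/HDD split of the machines:

# on SSD machines
node.attr.box_type: hot

# on HDD machines
node.attr.box_type: cold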

2. Load testing with esrally
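
A minimal esrally run against an existing cluster might look like the command below; the geonames track is just an example, XXX:9200 is a placeholder, and benchmark-only tells Rally not to provision its own nodes:

esrally --track=geonames --target-hosts=XXX:9200 --pipeline=benchmark-only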

3. Kibana

4. Rolling restart

1. Pause cluster data rebalancing:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.rebalance.enable": "none"
  }
}

2. Disable shard allocation:
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}

3. Perform a synced flush (optional):
curl -XPOST http://XXX/_flush/synced

4. Stop the node, carry out the update, and restart it.

5. Re-enable shard allocation (clear the transient setting from step 2):
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": null
  }
}

6. Wait for the cluster to recover to green, then repeat steps 2-5 until all nodes have been updated and restarted.

7. Re-enable rebalancing so the cluster data is balanced again.
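
Step 6 can be scripted by blocking on the health endpoint; an illustrative call (XXX is any node, and the timeout value is arbitrary):

# returns once the cluster reaches green, or when the timeout expires
curl "http://XXX/_cluster/health?wait_for_status=green&timeout=120s&pretty"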

 
