How to restart a large ES cluster that has shards and replicas

A rolling restart keeps the cluster online and operational while you take its nodes offline one at a time.

Common reasons are upgrading Elasticsearch itself or doing some kind of server maintenance (such as an OS update or hardware work). In either case, there is a specific procedure for performing a rolling restart.

By design, Elasticsearch wants your data to be fully replicated and evenly balanced. If you shut down a single node for maintenance, the cluster will immediately recognize the loss and start rebalancing. If you know the node will only be down for a short time, this can be annoying, because rebalancing very large shards may take a while.
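To watch this behaviour on a running cluster, the cluster health API reports the overall status along with the number of relocating and unassigned shards. The following is a minimal sketch, not part of the original procedure: `ES_URL`, `json_field` and `cluster_health` are hypothetical names chosen for illustration, and the address is a placeholder to be replaced with your own.

```shell
#!/usr/bin/env bash
# ES_URL is a placeholder (assumption); point it at one of your own nodes.
ES_URL="${ES_URL:-http://localhost:9200}"

# Extract the value of a top-level field from a flat JSON object on stdin.
# $1 = field name. Handles simple string and numeric values only.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}]*\)"\{0,1\}.*/\1/p'
}

# Fetch the cluster health document (needs a reachable cluster).
cluster_health() {
  curl -s "$ES_URL/_cluster/health"
}

# Usage, against a live cluster:
#   health=$(cluster_health)
#   echo "$health" | json_field status
#   echo "$health" | json_field unassigned_shards
```

A non-zero `unassigned_shards` or `relocating_shards` value right after a node goes down is the recovery work that the procedure below tries to avoid.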

What we want to do is tell Elasticsearch to postpone rebalancing, because we know more about the state of the cluster than it does, thanks to external factors. The procedure is as follows:

  1. If possible, stop indexing new data and perform a synced flush. This is not always feasible, but it helps speed up recovery. A synced flush request is a "best effort" operation: it fails if there are any pending indexing operations, but it is safe to reissue the request several times if necessary.

    POST /_flush/synced

  2. Disable shard allocation. This prevents Elasticsearch from rebalancing the missing shards until you tell it otherwise. If you know the maintenance window will be short, this is a good idea. You can disable allocation as follows:

    curl -XPUT 'http://ip:port/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient":{"cluster.routing.allocation.enable":"none"}}'

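  3. Stop the node that needs to be restarted. On Elasticsearch 1.x you can use the shutdown API; it was removed in 2.0, so on later versions stop the node's service instead:

    curl -XPOST 'http://ip:port/_cluster/nodes/_local/_shutdown'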

  4. Restart the node, and confirm that it joins the cluster.

  5. Repeat steps 3 and 4 for the remaining nodes.

  6. Re-enable shard allocation as follows:

    curl -XPUT 'http://ip:9092/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient":{"cluster.routing.allocation.enable":"all"}}'

    Shard rebalancing may take some time. Wait until the cluster returns to green status before continuing.

  7. At this point you can safely resume indexing (if you stopped it earlier), but waiting until the cluster is fully balanced before indexing will help speed up the process.
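
The steps above can be strung together in a small script. What follows is only a sketch under several assumptions that are not in the original post: `ES_URL` is a placeholder address, the nodes are reachable over `ssh` and run Elasticsearch as a systemd service named `elasticsearch`, and the node list passed to `rolling_restart` covers every node in the cluster. All function names are hypothetical. By default it runs in dry-run mode and only prints the commands it would issue.

```shell
#!/usr/bin/env bash
# Sketch of the rolling-restart steps. ES_URL and restart_node are
# assumptions for illustration; adapt them to your environment.
ES_URL="${ES_URL:-http://localhost:9200}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build the transient-settings body for cluster.routing.allocation.enable.
# $1 = none | all
allocation_body() {
  printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Steps 2 and 6: disable or re-enable shard allocation.
set_allocation() {
  run curl -s -XPUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(allocation_body "$1")"
}

# Step 4 check: block until $1 nodes have joined, or the timeout expires.
wait_for_nodes() {
  run curl -s "$ES_URL/_cluster/health?wait_for_nodes=$1&timeout=${2:-5m}"
}

# Block until the cluster reports green, or the timeout expires.
wait_for_green() {
  run curl -s "$ES_URL/_cluster/health?wait_for_status=green&timeout=${1:-60s}"
}

# Steps 3 and 4: hypothetical, site-specific restart of a single node.
restart_node() {
  run ssh "$1" sudo systemctl restart elasticsearch
}

# Arguments: the host names of all nodes in the cluster (assumption).
rolling_restart() {
  local node
  set_allocation none
  for node in "$@"; do
    restart_node "$node"
    wait_for_nodes "$#"
  done
  set_allocation all
  wait_for_green 10m
}
```

Running `rolling_restart node1 node2` with the default `DRY_RUN=1` prints the sequence of commands so it can be reviewed first; setting `DRY_RUN=0` executes them. The health calls return when their timeout expires even if the condition was not met, so check the `timed_out` field in the response before moving on.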


Origin blog.51cto.com/12182612/2438728