Node Tuning for ElasticSearch 2 (ElasticSearch Performance)

Summary

There is no definitive answer to how many nodes an ElasticSearch cluster needs, but the question can be broken down into several smaller ones that help us size the cluster:

  1. How much data do you plan to process?
  2. How many search requests do you plan to handle?
  3. How complex are the requests?
  4. How many resources does each node have?
  5. How many indexes do you plan to build, and how many applications will the cluster support?

Version

elasticsearch version: elasticsearch-2.x

Content

One cluster solves all problems?

There are many more questions to answer than those above, but the fifth is the one most often overlooked. A single ElasticSearch cluster can hold multiple indexes and serve multiple applications, so all of a company's logs can be processed in one cluster, whether the workload is a simple website query or a very complex analysis. Knowing how many applications' log requirements a cluster has to support helps us work out the appropriate number of nodes.

The number of nodes is related to memory

The number of ElasticSearch nodes is limited by RAM. A server or virtual machine has only a finite amount of physical or virtual memory to allocate, which naturally limits how many nodes can run on it.

General-purpose number of nodes: 3

If we are building an ElasticSearch cluster, a suitable number of nodes is 3. Why 3? To a large extent, a three-node cluster can prevent "split-brain". Although the nodes of a distributed cluster are peers, the cluster still needs one master node, which coordinates communication between itself and all other nodes. In ES, the master node is also responsible for cluster-wide operations such as creating and deleting indexes, tracking which nodes belong to the cluster, and allocating shards and replicas to nodes.

Three monks voting

When the master node fails and the slave nodes can no longer communicate with it, they start an election to appoint a new master, which takes over all the work of the old one. If the old master recovers and rejoins the cluster, the new master demotes it to a slave, so there is no conflict. ElasticSearch handles this whole process itself, without any involvement from the user.

Two monks voting

However, when there are only two nodes, one master and one slave, a communication failure between them causes the slave to promote itself to master. When communication is restored, there are two masters at the same time: from the point of view of the original master, it was the slave that failed, and the slave should now rejoin as a slave. With two nodes, the cluster therefore cannot tell which node should be the master, which is what we usually call "split-brain".

To prevent this from happening, a third node is added to break the tie and resolve the conflict.
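The difference between the two-node and three-node cases can be sketched in a few lines of Python. This is only an illustration, not ElasticSearch's actual election code: the function name sides_that_can_elect, the node names, and the simple rule "a side may elect a master if it has at least quorum nodes" are all assumptions made for the example.

```python
def sides_that_can_elect(partition, quorum):
    """Return the sides of a network partition that hold at least
    `quorum` nodes and could therefore elect a master (simplified model)."""
    return [side for side in partition if len(side) >= quorum]


# Two nodes that lose contact, with no majority requirement (quorum = 1):
# each isolated node may elect itself, so the cluster ends up with two masters.
two_nodes = [["node-a"], ["node-b"]]
print(sides_that_can_elect(two_nodes, quorum=1))

# Three nodes split 2 / 1, with a majority requirement (quorum = 2):
# only the side holding two nodes can elect a master.
three_nodes = [["node-a", "node-b"], ["node-c"]]
print(sides_that_can_elect(three_nodes, quorum=2))
```

In this model the third node only helps when a majority is actually required for an election; without such a requirement, an isolated node could still elect itself.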

Three monks still have problems

Split-brain can also occur in clusters with three or more nodes. To reduce the probability of it happening, ElasticSearch provides the setting discovery.zen.minimum_master_nodes, which specifies the minimum number of master-eligible nodes that must be able to see one another before a new master can be elected. For example, in a 3-node cluster this number is 2: a value of 2 prevents a single node that has been cut off from the cluster from electing itself as master; instead, it waits until it rejoins the cluster. The value can be determined by a formula:

N/2 + 1

N is the number of master-eligible nodes in the cluster (by default, every node is master-eligible), and the division is rounded down.
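As a small worked example, the formula can be evaluated in Python with integer division. The helper names minimum_master_nodes and cluster_settings_body are hypothetical and exist only for this sketch; the JSON it builds is a body of the kind that could be sent to the cluster settings endpoint (PUT /_cluster/settings) on an ElasticSearch 2.x cluster, but nothing is sent here.

```python
import json


def minimum_master_nodes(master_eligible_nodes):
    """Quorum according to N/2 + 1, using integer (floor) division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def cluster_settings_body(master_eligible_nodes):
    """Build a JSON body that sets discovery.zen.minimum_master_nodes persistently."""
    quorum = minimum_master_nodes(master_eligible_nodes)
    return json.dumps(
        {"persistent": {"discovery.zen.minimum_master_nodes": quorum}}
    )


# Quorum for clusters of 1 to 5 master-eligible nodes.
for n in range(1, 6):
    print(n, minimum_master_nodes(n))

# Body for a 3-node cluster.
print(cluster_settings_body(3))
```

Note that for two nodes the formula gives 2, so losing either node leaves the remaining node unable to elect a master.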

Sacrificing availability

One way to prevent "split-brain" in a two-node cluster is to set node.master to false on one of the nodes, so that this node can never become the master. Of course, this also reduces the availability of the cluster: if the only master-eligible node fails, no new master can be elected.
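A sketch of what the two nodes' elasticsearch.yml files might contain under this approach is shown below. It assumes that master eligibility is controlled by node.master (with node.data controlling whether a node holds data), as in ElasticSearch 2.x. The helper render_node_config and the node names are hypothetical, used only to print the settings side by side.

```python
def render_node_config(name, master_eligible, holds_data=True):
    """Render a minimal elasticsearch.yml fragment for one node (illustrative only)."""
    lines = [
        "node.name: {}".format(name),
        "node.master: {}".format(str(master_eligible).lower()),
        "node.data: {}".format(str(holds_data).lower()),
    ]
    return "\n".join(lines)


# node-a may become master; node-b stores data but can never be elected.
print(render_node_config("node-a", master_eligible=True))
print()
print(render_node_config("node-b", master_eligible=False))
```

With this layout only node-a can ever be master, so the two nodes cannot disagree about who leads, at the cost of the cluster having no master whenever node-a is down.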

Conclusion

There is no single right answer for the number of nodes in an ElasticSearch cluster. An ElasticSearch engineer has also shared similar views on Quora; see the references.

References

How many nodes should an Elasticsearch cluster have?

What's the maximum number of nodes Elasticsearch can have? How many, max, have you used in practice?

Elasticsearch Internals: Networking Introduction
