Important Elasticsearch (ES) configuration

Although ES runs well with very little configuration out of the box, there are a number of settings that need to be configured manually before going to production:

  • path.data and path.logs
  • cluster.name
  • node.name
  • bootstrap.memory_lock
  • network.host
  • discovery.zen.ping.unicast.hosts
  • discovery.zen.minimum_master_nodes

path.data and path.logs

If you are using the .zip or .tar.gz archive distribution, the data and log directories default to subdirectories of $ES_HOME. If you leave them at those defaults, there is a high risk that they will be deleted when you upgrade to a new ES version. In a production environment, you should change the data and log paths:

path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch

The RPM and Debian packages already use custom data and log paths. path.data can also be set to multiple paths, in which case the data is spread across all of them (files belonging to the same shard are always stored in the same data path):

path:
  data:
    - /mnt/elasticsearch_1
    - /mnt/elasticsearch_2
    - /mnt/elasticsearch_3

cluster.name

A node can only join one cluster, which it identifies by sharing a cluster name with the cluster's other nodes. The default cluster name is elasticsearch, but you should change it to a name that reflects the purpose of your cluster:

cluster.name: cluster_en

node.name

By default, Elasticsearch uses the first seven characters of a randomly generated UUID as the node name. Note that the node ID is persistent and does not change when the node restarts, so the default node name does not change either. Defining a meaningful node name is well worth doing, and it likewise has the advantage of staying the same across restarts.

node.name: prod-data-2

node.name can also be set from the server's HOSTNAME environment variable, as follows:

node.name: ${HOSTNAME}

bootstrap.memory_lock

It is vitally important to node health that the JVM's memory is never swapped out to disk. One way to ensure this is to set bootstrap.memory_lock to true. For this setting to take effect, some other system-level parameters must be configured first; for the details of configuring bootstrap.memory_lock, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/setup-configuration-memory.html#mlockall
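A minimal sketch of the two pieces involved, assuming a Linux host and the archive distribution (the limits.conf entry and the user name elasticsearch are assumptions about your environment; systemd and package installs use a different mechanism):

# elasticsearch.yml
bootstrap.memory_lock: true

# /etc/security/limits.conf -- allow the (assumed) elasticsearch user to lock
# an unlimited amount of memory, which memory_lock needs in order to succeed
elasticsearch - memlock unlimited

After startup, GET _nodes?filter_path=**.mlockall should report mlockall: true; if it reports false, the memory lock failed and the system limits still need adjusting.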

network.host

By default, Elasticsearch binds only to loopback addresses, e.g. 127.0.0.1 and [::1]. This works well for running a single node on one server.

TIP
In fact, multiple ES processes can be started from the same $ES_HOME directory on one node. This is very useful for trying out ES cluster features, but it is not recommended for production environments.

For nodes on different servers to connect to each other and form a cluster, each node needs to bind to a non-loopback address:

network.host: 192.168.1.10

When configuring network.host, be aware that it accepts special values such as _local_, _site_, and _global_, as well as modifiers such as :ipv4 and :ipv6. For more details, see: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-network.html#network-interface-values
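For instance, a minimal sketch (here _site_ binds to any site-local address such as 192.168.1.10, while en0 in the commented-out line is just a placeholder interface name):

# bind to any site-local address on this machine
network.host: _site_

# or restrict to the IPv4 addresses of a specific interface (en0 is a placeholder)
# network.host: _en0:ipv4_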

discovery.zen.ping.unicast.hosts

For a brand-new cluster with no network-related configuration, ES binds to a loopback address and scans ports 9300-9305 to try to connect to other nodes running on the same server. This provides automatic cluster formation with zero configuration. When the cluster's nodes live on other servers, you must supply a list of seed nodes to connect to, which can be specified in the following formats:

discovery.zen.ping.unicast.hosts:
   - 192.168.1.10:9300
   - 192.168.1.11
   - seeds.mydomain.com

If no port is provided for a seed host, it defaults to transport.profiles.default.port, falling back to transport.tcp.port if that is not set. A hostname that resolves to multiple IP addresses will be tried on all of those addresses.
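As a sketch of how that fallback works, setting transport.tcp.port changes the port on which seed hosts listed without an explicit port (such as 192.168.1.11 above) are contacted:

# seed hosts with no explicit port will now be tried on 9301 instead of 9300
transport.tcp.port: 9301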

discovery.zen.minimum_master_nodes

To prevent data loss, it is vital to set discovery.zen.minimum_master_nodes so that each master-eligible node knows the minimum number of master-eligible nodes that must be visible in order to form a cluster. Without this setting, a cluster that suffers a network failure risks splitting into two independent clusters (a split brain), which can lead directly to data loss. To avoid split brain, set this value to a quorum of the master-eligible nodes:

(master_eligible_nodes / 2) + 1

In other words, if there are three master-eligible nodes, the minimum should be set to (3 / 2) + 1 = 2:

discovery.zen.minimum_master_nodes: 2
