[Teach you through ELK] Elasticsearch cluster management

Elasticsearch is a distributed search and analytics engine. Its cluster management is built on a distributed architecture of shards and replicas.

In Elasticsearch, each index is divided into multiple shards, and each shard is an independent Lucene index. Shards can be distributed across different nodes for horizontal scaling and high availability. To improve data redundancy and fault tolerance, each primary shard can have one or more replicas, which are full copies of that shard. Elasticsearch never places two copies of the same shard on the same node, so the failure of a single node cannot take out a primary together with its replica.
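
To make the idea of splitting an index into shards more concrete, here is a minimal sketch of how a document could be mapped to a primary shard by hashing its routing key (by default the document ID). It is a simplification: Elasticsearch uses a Murmur3 hash internally, while this sketch substitutes CRC32 from the Python standard library, and the function name shard_for is made up for illustration.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a primary shard number."""
    # Stand-in hash: Elasticsearch itself uses Murmur3, not CRC32.
    hashed = zlib.crc32(routing_key.encode("utf-8"))
    # The modulo keeps the result in the range 0 .. num_primary_shards - 1.
    return hashed % num_primary_shards


if __name__ == "__main__":
    for doc_id in ["doc-1", "doc-2", "doc-3", "doc-4"]:
        print(doc_id, "-> shard", shard_for(doc_id, 3))
```

Because the shard number depends on the number of primary shards, that number is fixed when the index is created; changing it later means reindexing or using the split/shrink APIs.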

The primary goal of Elasticsearch cluster management is to allocate and rebalance primary shards and replicas automatically, so that the cluster stays highly available and evenly loaded. When a new node joins the cluster, Elasticsearch automatically relocates some shards and replicas to it. If a node goes down or fails, Elasticsearch promotes replicas of any lost primary shards and re-creates the missing copies on the remaining nodes to preserve data availability and integrity.

The following is a simple Elasticsearch cluster management architecture diagram, showing an Elasticsearch cluster consisting of three nodes:

     +----------+          +----------+          +----------+
     |  Node 1  |          |  Node 2  |          |  Node 3  |
     |          |          |          |          |          |
     |   Data   |          |   Data   |          |   Data   |
     |  Master  +----------+  Master  |          |  Master  |
     |   Node   |          |   Node   +----------+   Node   |
     |          |          |          |          |          |
     +----------+          +----------+          +----------+

In this architecture, each node holds some primary shards and some replicas. Every node is also master-eligible, and the cluster elects exactly one of them as the active master node (Master Node) at any given time. The master node coordinates the allocation and rebalancing of shards and manages the cluster as a whole. When a new node joins the cluster, the master node automatically relocates some shards and replicas to it. If a node goes down or fails, the master node promotes replicas of the lost primary shards and reassigns the missing copies to ensure data availability and integrity.
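
The allocation behaviour described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the functions allocate and fail_node are hypothetical, they balance purely on shard count, and they ignore everything the real allocator also considers (disk usage, allocation awareness, delayed allocation, and so on). The one rule it does model faithfully is that copies of the same shard never share a node.

```python
def allocate(nodes, num_primaries, num_replicas):
    """Toy allocator. Returns (layout, unassigned), where layout maps
    node name -> list of (shard_id, role)."""
    layout = {n: [] for n in nodes}
    unassigned = []
    for shard in range(num_primaries):
        used = set()  # nodes that already hold a copy of this shard
        for role in ["primary"] + ["replica"] * num_replicas:
            candidates = [n for n in nodes if n not in used]
            if not candidates:
                unassigned.append((shard, role))
                continue
            # Pick the least-loaded candidate (ties broken by node order).
            target = min(candidates, key=lambda n: len(layout[n]))
            layout[target].append((shard, role))
            used.add(target)
    return layout, unassigned


def fail_node(layout, failed):
    """Remove a node, promote replicas of lost primaries, re-create lost copies."""
    survivors = {n: list(c) for n, c in layout.items() if n != failed}
    unassigned = []
    for shard, role in layout[failed]:
        holders = [n for n, copies in survivors.items()
                   if any(s == shard for s, _ in copies)]
        if not holders:
            # No surviving copy at all: the primary stays unassigned.
            unassigned.append((shard, "primary"))
            continue
        if role == "primary":
            # Promote one surviving replica to primary.
            n = holders[0]
            survivors[n] = [(s, "primary" if s == shard else r)
                            for s, r in survivors[n]]
        # Re-create the lost copy as a replica on a node without this shard.
        free = [n for n in survivors if n not in holders]
        if free:
            target = min(free, key=lambda n: len(survivors[n]))
            survivors[target].append((shard, "replica"))
        else:
            unassigned.append((shard, "replica"))
    return survivors, unassigned


if __name__ == "__main__":
    layout, _ = allocate(["node1", "node2", "node3"], 3, 1)
    print("initial:", layout)
    after, missing = fail_node(layout, "node1")
    print("after node1 fails:", after, "unassigned:", missing)
```

With three nodes, three primary shards and one replica each, every node ends up with two shard copies, and after node1 fails the two remaining nodes still hold a primary and a replica of every shard.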

The following is a simple example implementation of Elasticsearch cluster management:

  1. Start the Elasticsearch cluster

First, start at least two Elasticsearch nodes so that they can form a cluster. Both nodes can be started with the following commands:

bin/elasticsearch -E node.name=node1 -E cluster.name=my_cluster -E path.data=data1 -E path.logs=log1
bin/elasticsearch -E node.name=node2 -E cluster.name=my_cluster -E path.data=data2 -E path.logs=log2

Here, node.name specifies the node name, cluster.name specifies the cluster name, and path.data and path.logs specify the storage paths for data and logs, respectively.
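
When more than a couple of local test nodes are needed, the same command line can be generated rather than typed by hand. The following is a small sketch; the helper name node_command and the naming pattern (node1/data1/log1, node2/data2/log2, ...) simply extend the two commands above and are assumptions, not an official tool.

```python
import shlex


def node_command(index: int, cluster_name: str = "my_cluster") -> list:
    """Build the argument list for starting the index-th local test node."""
    settings = {
        "node.name": f"node{index}",
        "cluster.name": cluster_name,
        "path.data": f"data{index}",
        "path.logs": f"log{index}",
    }
    argv = ["bin/elasticsearch"]
    for key, value in settings.items():
        # Each setting is passed as a separate "-E key=value" pair.
        argv += ["-E", f"{key}={value}"]
    return argv


if __name__ == "__main__":
    for i in (1, 2, 3):
        # shlex.join renders the list as a shell-safe command line.
        print(shlex.join(node_command(i)))
```

The returned list can also be handed to subprocess.Popen directly if the nodes should be launched from a script.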

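  2. Add a node

A node joins a cluster simply by starting with the same cluster name, so no dedicated API call is needed to add it. Before adding a node, make sure that shard allocation is enabled (all is the default value) with the following command: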

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}

This enables shard allocation so that new nodes can receive shards. Then start a new node with the same cluster name; provided it can discover the existing nodes, it joins the cluster automatically.
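
The same settings call can be issued from a script. The sketch below builds the HTTP request with the Python standard library only; it assumes a node listening on http://localhost:9200 with security disabled, and the function name build_settings_request is made up for illustration. The request is only constructed and printed here; the commented lines show how it could be sent to a running cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no authentication


def build_settings_request(enable: str = "all") -> urllib.request.Request:
    """Build a PUT /_cluster/settings request that sets shard allocation.

    Valid values for the setting are "all", "primaries", "new_primaries"
    and "none".
    """
    body = {"transient": {"cluster.routing.allocation.enable": enable}}
    return urllib.request.Request(
        url=f"{ES_URL}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_settings_request("all")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it against a running cluster:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Setting the value to "primaries" is commonly done before restarting a node, and switching back to "all" afterwards lets replicas be allocated again.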

  3. View cluster status

To view cluster status and information, you can use the following commands:

GET /_cluster/health
GET /_cluster/stats

These return information such as the cluster health status, the number of nodes, the number of shards, and the number of indexes.
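
The status field returned by the health API is green, yellow, or red. As a rough sketch of what those colours mean: green means every shard copy is assigned, yellow means all primaries are assigned but at least one replica is not, and red means at least one primary is not assigned. The function cluster_status below is a hypothetical helper that applies this rule to a list of (role, state) pairs; it is not how Elasticsearch computes the value internally.

```python
# Shard states that count as active (a relocating shard is still serving).
ACTIVE_STATES = {"STARTED", "RELOCATING"}


def cluster_status(shard_copies):
    """Derive a health colour from (role, state) pairs.

    role is "primary" or "replica"; state is one of
    "STARTED", "RELOCATING", "INITIALIZING", "UNASSIGNED".
    """
    status = "green"
    for role, state in shard_copies:
        if state in ACTIVE_STATES:
            continue
        if role == "primary":
            # A missing primary means some data is unavailable.
            return "red"
        # A missing replica only reduces redundancy.
        status = "yellow"
    return status


if __name__ == "__main__":
    print(cluster_status([("primary", "STARTED"), ("replica", "STARTED")]))
    print(cluster_status([("primary", "STARTED"), ("replica", "UNASSIGNED")]))
    print(cluster_status([("primary", "UNASSIGNED"), ("replica", "UNASSIGNED")]))
```

A single-node cluster with replicas configured therefore reports yellow: the replicas have no second node to be placed on.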

  4. Manage indexes

To manage indexes, you can use the following commands:

  • Create an index:
PUT /my_index
  • Delete the index:
DELETE /my_index
  • Get index information:
GET /my_index/_stats
  • Change index settings:
PUT /my_index/_settings
{
  "index": {
    "refresh_interval": "30s"
  }
}
  • Add documents to the index:
POST /my_index/_doc
{
  "title": "Elasticsearch Tutorial",
  "content": "This is a tutorial on Elasticsearch",
  "tags": ["elasticsearch", "tutorial"]
}
  • Search index:
GET /my_index/_search
{
  "query": {
    "match": {
      "title": "Elasticsearch"
    }
  }
}
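
When many documents have to be added, sending them one request at a time is slow; the bulk API (POST /_bulk) accepts several in a single call. The sketch below only builds the newline-delimited JSON body that this API expects; the helper name bulk_body and the second sample document are illustrative assumptions.

```python
import json


def bulk_body(index: str, docs) -> str:
    """Build an NDJSON body for POST /_bulk that indexes each document."""
    lines = []
    for doc in docs:
        # Action line: index the following document into the given index.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [
        {"title": "Elasticsearch Tutorial", "tags": ["elasticsearch", "tutorial"]},
        {"title": "Cluster management notes", "tags": ["elasticsearch"]},
    ]
    print(bulk_body("my_index", docs), end="")
```

The resulting text is sent with the Content-Type header set to application/x-ndjson.
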

  5. Manage nodes

To manage nodes, the following commands can be used:

  • View node information:
GET /_nodes
  • View specific node information:
GET /_nodes/node1
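  • Shut down the node (Elasticsearch versions before 2.0 only):
POST /_cluster/nodes/node1/_shutdown

This shuts down the node named node1. The _shutdown API was removed in Elasticsearch 2.0; on later versions, stop the node's process through the operating system or service manager instead. After a node shuts down, replicas of its primary shards are promoted on the remaining nodes, and the missing shard copies are automatically re-created on other nodes.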
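
Instead of waiting for the cluster to recover after a node disappears, shards can be moved off the node before it is stopped by using allocation filtering. The following sketch builds the bodies for PUT /_cluster/settings; the helper names drain_node_settings and undrain_settings are hypothetical, and persistent scope is used here as one possible choice.

```python
import json

SETTING = "cluster.routing.allocation.exclude._name"


def drain_node_settings(node_name: str) -> dict:
    """Body for PUT /_cluster/settings that moves all shards off node_name."""
    return {"persistent": {SETTING: node_name}}


def undrain_settings() -> dict:
    """Body that removes the exclusion again (a null value resets a setting)."""
    return {"persistent": {SETTING: None}}


if __name__ == "__main__":
    print("PUT /_cluster/settings")
    print(json.dumps(drain_node_settings("node1"), indent=2))
    print("PUT /_cluster/settings")
    print(json.dumps(undrain_settings(), indent=2))
```

Once GET /_cluster/health shows no relocating shards, the excluded node no longer holds data and can be stopped; the exclusion should be reset afterwards so that the node can receive shards again when it returns.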

Elasticsearch cluster management is commonly used in the following scenarios:

- Large-scale data: Elasticsearch can store and search large volumes of structured and unstructured data, which suits applications that must process data at scale.

- High availability: By automatically allocating and reallocating shards and replicas, Elasticsearch provides high availability and fault tolerance, which suits applications that cannot tolerate downtime.

- Load balancing: An Elasticsearch cluster automatically distributes requests across nodes and shards, which balances load and improves performance.

Origin blog.csdn.net/feng1790291543/article/details/132102566