ES index planning scheme

1. Introduction

"ES Index Planning Scheme" is a solution for the R&D department to store and query massive log data in real time according to the requirements of the audit system. It has been continuously improved and compiled into a book for subsequent relevant developers to learn and use

1.1. Terminology

1. Time-series index: an index whose data grows along a time axis and, once written, does not change; each document must contain a timestamp field (of date type, the field name is arbitrary). In other words, an index split by time.
2. HOT index: an index stored on the HOT data nodes of the ES cluster, preferably backed by SSD disks and holding a suitable number of shards; it mainly handles real-time writes of time-series data.
3. WARM index: an index stored on the WARM data nodes of the ES cluster; no special disk is required, a conventional large-capacity disk is sufficient. It can still be queried but is no longer written to.
4. DELETE index: an index to be deleted from ES, i.e. its data is removed.

1.2. Abbreviations

1. ES: abbreviation of Elasticsearch, a distributed, highly scalable, near-real-time search and data analysis engine built on top of Lucene.

2. Planning goals

Since the log system generates a large number of logs, especially in a clustered deployment, we face a severe challenge: how to store data at this scale while still supporting real-time queries. After extensive research on ES and verification with insertions and aggregation queries over tens of billions of documents, the following schemes were found to effectively improve performance and solve this problem. They cover several dimensions, including cluster planning, storage strategy, index splitting, and hot/warm tiering, and are introduced in this article one by one.

3. Index overall planning

(Figure: overall index planning diagram; original image unavailable)

4. Index naming rules

4.1. Naming convention

Index names are constrained by the file system: only lowercase letters are allowed, names cannot start with an underscore, and the following rules must also be obeyed:

  • Cannot include \, /, *, ?, ", <, >, |, spaces, commas, or #
  • The colon : could be used before version 7.0, but it is deprecated and no longer supported from version 7.0 onward
  • Cannot begin with -, _, or +
  • Cannot be . or ..
  • Cannot be longer than 255 bytes (note: bytes, so multi-byte characters count toward the limit faster)

These naming restrictions exist because Elasticsearch uses index names as directory names on disk, so the names must conform to the conventions of different operating systems.

4.2. Naming rules

System index example: csbit_logs_audit_tradition_20210104_000001

The index name starts with csbit_logs_audit_tradition_ and ends with the current date (year, month, day, i.e. yyyymmdd) followed by a sequence number: yyyymmdd_000001.

5. Index split planning

An index achieves distributed storage by horizontally scaling its shards, which solves the problem of storing big data in an index. However, as an index grows, each individual shard grows too, and query and storage become slower and slower. More importantly, an index effectively has a storage limit (unless you set up enough shards and machines): for example, the official documentation states that a single shard cannot hold more than about 2 billion documents (a limit of the underlying Lucene index, since each shard is a Lucene index). Considering I/O, when facing such a huge index, should we add more shards to it, or should we use more indexes? More shards on one oversized index only mean more I/O overhead per query, so the answer is already clear unless you can accept long query waits. To avoid oversized indexes and improve ES query efficiency, we need to split indexes.

5.1. Planning

The idea of index splitting is simple. One advantage of a time-series index is that it only grows and never changes, accumulating over time, so it naturally lends itself to splitting: the data can be split into any time period according to time and data volume. ES provides the Rollover API plus Index Templates, which together make index splitting very convenient, keeping the number of documents in a single index within about 10 billion (with 5 shards per index, as configured in the template below) to guarantee responsive queries. However, because audit logs arrive with some lag, we do not use Rollover here; instead we rely on the Index Template mechanism alone: as long as an index name matches the template's pattern, the template is applied when the new time-series index is created.

5.2. Templates

1) Add an ingest pipeline add_es_time so that a timestamp is automatically added every time a document is written to ES:

PUT _ingest/pipeline/add_es_time
{
  "description": "add field es_time to doc",
  "processors": [
    {
      "set": {
        "field": "_source.es_time",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}
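
Before the pipeline is referenced by the template, it can be checked with the ingest simulate API; a minimal sketch with a hypothetical sample document:

POST _ingest/pipeline/add_es_time/_simulate
{
  "docs": [
    { "_source": { "action": 1, "desc": "login ok" } }
  ]
}

The response should show an es_time field populated with the ingest timestamp.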

2) Add the index template; make sure the add_es_time pipeline from step 1) has been created before adding the template:

PUT _template/csbit_logs_audit_tradition_template
{
  "order": 0,
  "index_patterns": [
    "csbit_logs_audit_tradition_*"
  ],
  "settings": {
    "index": {
      "default_pipeline": "add_es_time",
      "number_of_replicas": 0,
      "number_of_shards": 5,
      "queries": {
        "cache": {
          "enabled": false
        }
      },
      "requests": {
        "cache": {
          "enable": false
        }
      }
    }
  },
  "mappings": {
    "_source": {
      "enabled": true
    },
    "dynamic": "strict",
    "properties": {
      "action": {
        "type": "integer"
      },
      "desc": {
        "type": "keyword"
      },
      "es_time": {
        "type": "date"
      },
      "level": {
        "type": "integer"
      },
      "server": {
        "properties": {
          "ip": {
            "type": "ip"
          },
          "port": {
            "index": false,
            "type": "integer"
          }
        }
      },
      "tm": {
        "type": "date"
      },
      "web": {
        "properties": {
          "ip": {
            "type": "ip"
          },
          "param": {
            "type": "text"
          },
          "url": {
            "type": "keyword"
          },
          "user": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
  • Adjust the settings according to actual needs; for the life-cycle-related settings see the index life cycle section below; the template specifies both mapping and settings
  • The index name must follow the pattern csbit_logs_audit_tradition_yyyymmdd_000001
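
Once the template exists, any newly created index whose name matches the pattern automatically receives the mapping, settings, and default pipeline. A minimal sketch (the document values are illustrative):

PUT csbit_logs_audit_tradition_20210104_000001

POST csbit_logs_audit_tradition_20210104_000001/_doc
{
  "action": 1,
  "desc": "user login",
  "level": 2,
  "server": { "ip": "10.0.0.1", "port": 8080 },
  "tm": "2021-01-04T08:00:00Z",
  "web": { "ip": "10.0.0.2", "param": "a=1", "url": "/login", "user": "admin" }
}

Because the template declares dynamic: strict, a document containing any field not listed in the mapping is rejected; es_time is filled in automatically by the add_es_time pipeline.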

5.3. Segmentation

  • Because Rollover is not used, index segmentation is done according to the index naming convention: a new index name is used whenever a new split is needed

5.4. Use

  • Because the data is split into multiple indexes by time, a query can span multiple indexes; scoring, sorting, and paging behave exactly as when searching a single index.
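
For example, a wildcard (or a comma-separated list of index names) lets one request search all of the split indexes at once; a minimal sketch using the naming pattern above and the tm field from the template:

GET csbit_logs_audit_tradition_*/_search
{
  "query": {
    "range": { "tm": { "gte": "now-7d/d" } }
  },
  "sort": [ { "tm": "desc" } ],
  "size": 20
}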

6. Index life cycle

Since our indexes store various kinds of log data, and log storage has a retention requirement (at least 180 days must be guaranteed), we delete log data automatically according to this rule. This keeps the number of index files from growing without bound and also improves the timeliness of log data queries.

6.1. Planning

The index life cycle is divided into four stages: HOT > WARM > COLD > DELETE. Except for HOT, which is required, the other stages are optional and can be configured as needed. Because the log indexes only need to support automatic deletion, we only plan the HOT, WARM, and DELETE stages for them.

  • HOT stage

    The HOT stage is used for writing log data and querying log data. We do not implement this stage with ES's built-in policy rules, because a built-in policy automatically performs a Rollover on the index, and Rollover marks the other indexes associated with the alias as non-writable. In actual use, however, audit logs arrive with a delay, so it is very likely that the previous day's data still needs to be written into an index that was already split off.

  • WARM stage

    The WARM stage is used to store relatively old log data that can still be queried. Indexes can be migrated manually or automatically: a shell script or curator periodically performs the hot-to-warm action.

  • DELETE stage

    The DELETE stage is used to delete log data. When log data meets the deletion condition of the DELETE stage (for example, index data older than 180 days), the related policy can be configured, or the index data can be deleted manually or automatically by a script.
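
For script-based deletion, the per-index action is a single DELETE call; a minimal sketch with an illustrative index name (selecting the indexes older than 180 days can reuse the date-parsing approach of the migration script in section 6.4.2):

# delete one expired split index (the name is illustrative)
curl -s -X DELETE "http://localhost:9200/csbit_logs_audit_tradition_20210104_000001"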

6.2. Implementation

  • Edit the elasticsearch.yml configuration file of each ES data node and add one of the following settings to mark the node as hot or warm:
    • node.attr.hotwarm_type: hot # identifies the node as a hot data node (Hot)
    • node.attr.hotwarm_type: warm # identifies the node as a warm data node (Warm)
    • example (see the configuration sketches below):

(Figures: hot node and warm node configuration examples)
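
A minimal sketch of what such node configurations could look like; the node names are assumptions:

# hot node (elasticsearch.yml)
node.name: es-data-hot-1
node.attr.hotwarm_type: hot

# warm node (elasticsearch.yml)
node.name: es-data-warm-1
node.attr.hotwarm_type: warm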

6.3. Data writing

When creating a template or an index, set index.routing.allocation.require.hotwarm_type to hot in the settings, and the data will be written to the hot nodes.

6.3.1. Option 1: Designate hot and cold data nodes through templates (used by default)

  • Note: indexes whose names start with order_ will have their data placed on the hot nodes (a template sketch follows below)

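A minimal sketch of such a template, assuming the order_ prefix from the note above; the template name and other settings are illustrative, not the project's actual template:

PUT _template/order_template
{
  "order": 0,
  "index_patterns": [ "order_*" ],
  "settings": {
    "index.routing.allocation.require.hotwarm_type": "hot"
  }
}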

6.3.2. Scheme 2: Specify hot and cold data nodes through indexes

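A minimal sketch of setting the requirement directly on an index at creation time (the index name is illustrative):

PUT csbit_logs_audit_tradition_20210104_000001
{
  "settings": {
    "index.routing.allocation.require.hotwarm_type": "hot"
  }
}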

6.4. Data migration to cold nodes

6.4.1. Manual Migration

  • Operate in Kibana, as in the sketch below:

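A minimal sketch of the manual hot-to-warm move, done by updating the index's allocation setting (the index name is illustrative):

PUT csbit_logs_audit_tradition_20210104_000001/_settings
{
  "index.routing.allocation.require.hotwarm_type": "warm"
}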

6.4.2. Automatic migration

  • Migration can also be performed on a schedule by a shell script or by curator. Because this project has low real-time requirements on the written data, either approach can be used according to the actual situation
  • A shell script solution is sketched below:

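A minimal sketch of such a script, assuming the ES address, a one-day hot retention, and the index naming pattern from section 4.2; indexes older than the cutoff are marked warm so that ES relocates their shards to the warm nodes:

#!/bin/bash
# move audit log indexes older than KEEP_HOT_DAYS from hot to warm nodes
ES_HOST="http://localhost:9200"   # assumed ES address
KEEP_HOT_DAYS=1                   # assumed hot retention, adjust as needed
CUTOFF=$(date -d "-${KEEP_HOT_DAYS} days" +%Y%m%d)

for idx in $(curl -s "${ES_HOST}/_cat/indices/csbit_logs_audit_tradition_*?h=index"); do
  day=$(echo "$idx" | sed -E 's/.*_([0-9]{8})_[0-9]+$/\1/')
  if [ "$day" -lt "$CUTOFF" ]; then
    curl -s -X PUT "${ES_HOST}/${idx}/_settings" \
      -H 'Content-Type: application/json' \
      -d '{"index.routing.allocation.require.hotwarm_type": "warm"}'
  fi
done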

7. Configuring ES shard distribution rules

  • cluster.routing.allocation.awareness.attributes
    uses a given node attribute as the basis for the shard distribution rule (it can also be set dynamically, as sketched after this list).

  • # Set the node attribute rack_id with the value rack_one (custom node attributes live under node.attr.*)

    node.attr.rack_id: rack_one

  • #Set the rack_id attribute as the shard distribution rule

    cluster.routing.allocation.awareness.attributes: rack_id

  • Multiple attributes can be set as shard distribution rules, for example:

    cluster.routing.allocation.awareness.attributes: rack_id,zone

  • Note: once shard distribution attributes are set, shards will not be allocated to any node in the cluster that does not define those attributes.
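
The awareness attribute can also be set dynamically through the cluster settings API instead of elasticsearch.yml; a minimal sketch:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}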

Origin: blog.csdn.net/weixin_43480441/article/details/128627260