Elasticsearch write performance optimization

Performance Testing

  • Test with a single shard on a single node and no replicas (see the example after this list)
  • Record performance with all default settings as the baseline
  • Run each test for more than 30 minutes to confirm long-term performance; short tests may never trigger segment merging or GC and therefore cannot measure their impact
  • Change one parameter at a time against the default baseline; if performance improves, keep the setting and run subsequent tests on top of it
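
For reference, a single-shard, zero-replica test index can be created like this (the index name test_index is just a placeholder):

PUT /test_index
{
    "settings" : {
        "number_of_shards" : 1,
        "number_of_replicas" : 0
    }
}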

Bulk usage suggestion

  • The recommended size of each bulk request is 5-15MB; start there and increase gradually while testing. When an EsRejectedExecutionException is received, the node's bottleneck has been reached: reduce concurrency, upgrade the hardware, or add nodes.
  • When writing data, round-robin bulk requests across all nodes; do not send every request to a single node, or that node will have to hold all of the request data in memory while processing it. A minimal bulk body is sketched after this list.
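
A minimal bulk request body looks like the following; the index name, type, and fields are placeholders (the _type metadata field is required on the older Elasticsearch versions this article targets). Note that every line, including the last, must end with a newline:

POST /_bulk
{ "index" : { "_index" : "test_index", "_type" : "doc" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test_index", "_type" : "doc" } }
{ "field1" : "value2" }

Most official client libraries can round-robin requests across the cluster automatically if you configure them with the full list of node addresses.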

Optimize disk IO

  • Use SSD
  • Use RAID 0 with no mirroring; rely on Elasticsearch replicas for data safety and let striping increase disk IO throughput
  • Spread Elasticsearch data across multiple disks by listing them in path.data (see the example after this list)
  • Do not use remote storage, such as NFS/SMB/CIFS; latency will become a performance bottleneck
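
In elasticsearch.yml, path.data accepts a comma-separated list of directories; the mount points below are placeholders:

path.data: /mnt/disk1,/mnt/disk2,/mnt/disk3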

Segment merge

Segment merging consumes a lot of CPU and disk IO, especially when relatively large segments are merged.
When merging falls behind index writing, Elasticsearch slows down single-threaded index writing to avoid an explosion in the number of accumulated segments, and logs "now throttling indexing" at the INFO level.

Elasticsearch is conservative by default: it does not want search performance to suffer from background segment merging, so the default merge rate limit is a relatively low 20MB/s. If you are using SSDs, consider raising this parameter to 100-200MB/s:

PUT /_cluster/settings
{
    "persistent" : {
        "indices.store.throttle.max_bytes_per_sec" : "100mb"
    }
}

If you are just importing data with bulk and do not care about query performance, you can turn off merge throttling entirely:

PUT /_cluster/settings
{
    "transient" : {
        "indices.store.throttle.type" : "none" 
    }
}

After the import finishes, set the throttle type back to "merge" to restore throttling.
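
This simply reverts the value changed above, using the same settings API:

PUT /_cluster/settings
{
    "transient" : {
        "indices.store.throttle.type" : "merge"
    }
}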

If you are using mechanical (spinning) disks, add the following to elasticsearch.yml:

index.merge.scheduler.max_thread_count: 1

Mechanical disks handle concurrent IO poorly, so we need to reduce the number of threads per index that access the disk concurrently; with this setting there will be max_thread_count + 2 threads accessing the disk at once.
If you are using SSDs you can ignore this parameter: the default thread count is Math.min(3, Runtime.getRuntime().availableProcessors() / 2), which is fine for SSDs.

You can also increase index.translog.flush_threshold_size from its default of 200MB to a larger value such as 1GB. This lets the translog accumulate larger segments before a flush; larger segments mean fewer flushes and fewer segment merges, which reduces disk IO and improves indexing performance. A sketch follows.
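
For example, raising it on a single index via the index settings API (my_index is a placeholder):

PUT /my_index/_settings
{
    "index.translog.flush_threshold_size" : "1gb"
}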

Other optimizations

  • If you do not need real-time query results, set index.refresh_interval to 30s for each index; when importing a large amount of data, set it to -1 first and restore it after the import completes
  • If you are bulk-importing a large amount of data, consider dropping replicas by setting index.number_of_replicas: 0. With replicas present, imported data has to be synchronized to the replicas, which must also perform analysis, indexing, and segment merging, all of which hurts import performance. You can import without replicas and restore them afterwards; a sketch follows this list.
  • If the imported documents have no natural unique ID, let Elasticsearch auto-generate the document IDs
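
A minimal sketch of both settings around a bulk import, assuming an index named my_index that normally runs with a 30s refresh interval and one replica (adjust to your own values):

PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : "-1",
        "number_of_replicas" : 0
    }
}

After the import completes, restore the original values:

PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : "30s",
        "number_of_replicas" : 1
    }
}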
