ELK Stack log platform performance optimization

Reprinted from:
https://mp.weixin.qq.com/s?__biz=MzAwNTM5Njk3Mw==&mid=2247487789&idx=1&sn=def0d8c2e6b8c596b4836a1180c7b221&chksm=9b1c11afac6b98b9ca74aa005e21ef25c8e999713e94c4e58de49a0c54d44e5e1619aa830c8d&mpshare=1&scene=1&srcid=1231F6Z3HevOthNAGOB0LTBJ&sharer_sharetime=1577767082607&sharer_shareid=6ec87ec9a11a0c18d61cde7663a9ef87#rd

1. Performance analysis

2. Choosing a log collector: Logstash / Filebeat

3. Logstash optimization related configuration

4. Issues related to the introduction of Redis

5. Optimized configuration of Elasticsearch nodes

6. Performance check

1. Performance analysis

Server hardware (Linux): 1 CPU, 4 GB RAM.

Assume that each log entry is 250 bytes.

Analysis:

① Logstash on Linux (1 CPU, 4 GB RAM)

  • About 500 logs per second with the full filter chain;
  • About 660 logs per second with the ruby filter removed;
  • About 1,000 logs per second after removing grok.

② Filebeat on Linux (1 CPU, 4 GB RAM)

  • 2,500 to 3,500 logs per second;
  • Daily volume per machine: 24 h × 60 min × 60 s × 3,000 logs/s × 250 bytes = 64,800,000,000 bytes, about 64.8 GB.
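
The daily-volume figure can be rechecked with a few lines of Python. This is only a sketch of the arithmetic; the rate and event size are the figures quoted above, and the function name is made up for illustration.

```python
# Back-of-the-envelope daily log volume for one Filebeat shipper.
# The rate (3,000 logs/s) and size (250 bytes) come from the text above;
# daily_volume_bytes is an illustrative name, not part of any tool.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume_bytes(events_per_second: int, bytes_per_event: int) -> int:
    """Return the number of bytes shipped in one day at a constant rate."""
    return SECONDS_PER_DAY * events_per_second * bytes_per_event


if __name__ == "__main__":
    total = daily_volume_bytes(events_per_second=3000, bytes_per_event=250)
    print(f"{total:,} bytes/day (~{total / 10**9:.1f} GB)")
```

Changing the two arguments gives a quick estimate for other log sizes or shipping rates.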

③ The bottleneck is Logstash fetching data from Redis and storing it in ES: one Logstash instance processes about 6,000 logs per second, and two instances process about 10,000 per second (at which point the CPU is essentially exhausted).

Starting Logstash consumes a lot of system resources because the startup script checks Java, Ruby, and other environment variables; resource usage returns to normal once startup completes.

2. Choosing a log collector: Logstash / Filebeat

There is no rule that requires Filebeat over Logstash or the other way round; both can serve as the shipper.

The differences are:

  • Logstash bundles many plug-ins, such as grok and ruby, so it is heavyweight compared with Beats;
  • Logstash uses more resources once started; if hardware resources are plentiful, this difference can be ignored;
  • Logstash runs on the JVM and is therefore cross-platform, whereas Beats is written in Go and does not support AIX;
  • On 64-bit AIX, Logstash needs a 32-bit JDK (JRE) 1.7; the 64-bit build is not supported;
  • Filebeat can output directly to ES, but if Logstash instances in the same system also output directly to ES, the resulting index types differ and searching becomes more complicated.

To sum up:

Logstash and Filebeat each have their advantages, but I recommend the following: install Filebeat on every server whose logs need to be collected, because it is lightweight and dedicated to log collection; send all of its output to Logstash for log processing; and finally have Logstash output to ES.

3. Logstash optimization related configuration

Tune the following parameters according to your own hardware:

① Number of pipeline threads (the official recommendation is to match the number of CPU cores)

  • Default configuration ---> pipeline.workers: 2;
  • Can be optimized to ---> pipeline.workers: number of CPU cores (or several times the number of CPU cores).

② Number of output threads

  • Default configuration ---> pipeline.output.workers: 1;
  • Can be optimized to ---> pipeline.output.workers: no more than the number of pipeline threads.

③ Number of events sent per batch

  • Default configuration ---> pipeline.batch.size: 125;
  • Can be optimized to ---> pipeline.batch.size: 1000.

④ Send delay (in milliseconds)

  • Default configuration ---> pipeline.batch.delay: 5;
  • Can be optimized to ---> pipeline.batch.delay: 10.

To sum up:

  • The number of pipeline workers can be set with the -w command-line parameter or directly in the configuration file logstash.yml. Raising it increases the number of filter and output threads; if necessary, it is safe to set it to several times the number of CPU cores, because the threads spend much of their time idle on I/O.
  • By default, each output is active on one pipeline worker thread. You can raise the workers setting in the output section, but do not set it higher than the number of pipeline workers.
  • You can also set the batch_size of an output; for example, keep the ES output's value consistent with the pipeline batch size.
  • When multiline is configured in a filter, the number of pipeline workers is automatically forced to 1. If you use Filebeat, handle multiline in Filebeat. If you use Logstash as the shipper, configure multiline in the input rather than in the filter.
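
As a rough illustration of the tuning advice above, the following Python sketch derives the three pipeline settings from the core count and prints them as logstash.yml lines. The function names are hypothetical, the batch values are simply the ones suggested in this section, and the workers × batch size product is used only as an estimate of how many events may be held in memory at once.

```python
import os


def suggest_pipeline_settings(cpu_cores: int, batch_size: int = 1000,
                              batch_delay: int = 10) -> dict:
    """Build logstash.yml pipeline settings from the CPU core count.

    The default batch values are the ones suggested in this article,
    not Logstash defaults.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return {
        "pipeline.workers": cpu_cores,
        "pipeline.batch.size": batch_size,
        "pipeline.batch.delay": batch_delay,
    }


def inflight_events(settings: dict) -> int:
    """Estimate of events held in memory at once: workers x batch size."""
    return settings["pipeline.workers"] * settings["pipeline.batch.size"]


def render_yaml(settings: dict) -> str:
    """Render the settings as flat `key: value` lines for logstash.yml."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


if __name__ == "__main__":
    settings = suggest_pipeline_settings(os.cpu_count() or 1)
    print(render_yaml(settings))
    print(f"# up to {inflight_events(settings)} events in flight")
```

A larger batch size multiplies the in-flight event count, so it is worth checking that figure against the JVM heap before raising both values together.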

JVM configuration file in Logstash:

Logstash is a Java-based program that runs in the JVM, which is configured through jvm.options: for example, the minimum and maximum heap size and the garbage collection mechanism. The JVM's memory allocation should be neither too large nor too small: too large slows down the operating system, and too small prevents Logstash from starting. The defaults are:

  • -Xms256m # minimum heap size;
  • -Xmx1g # maximum heap size.

4. Issues related to the introduction of Redis

Filebeat can send directly to Logstash (the indexer), but Logstash has no storage function. To restart Logstash, you must first stop every connected Beat and then stop Logstash, which complicates operations and maintenance; in addition, data is lost if Logstash fails. Introducing Redis as a data buffer pool addresses both problems: when Logstash stops abnormally, the data remains cached in Redis and can be inspected from a Redis client.

Redis can buffer the data either in a list (which holds up to 4,294,967,295 elements) or in publish/subscribe mode.
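
The buffering behaviour of the list mode can be sketched without a Redis server. The class below is an in-memory stand-in, not Redis itself: its method names mirror the Redis list commands RPUSH, LPOP, and LLEN, while ListBuffer and drain are hypothetical names chosen for this illustration.

```python
from collections import deque


class ListBuffer:
    """In-memory stand-in for a Redis list used as a queue (push right, pop left)."""

    def __init__(self) -> None:
        self._items = deque()

    def rpush(self, event: str) -> int:
        """Append an event to the tail and return the new queue length."""
        self._items.append(event)
        return len(self._items)

    def lpop(self):
        """Remove and return the oldest event, or None if the queue is empty."""
        return self._items.popleft() if self._items else None

    def llen(self) -> int:
        return len(self._items)


def drain(buffer: ListBuffer, limit: int) -> list:
    """Consume up to `limit` events, as an indexer would while it is running."""
    consumed = []
    for _ in range(limit):
        event = buffer.lpop()
        if event is None:
            break
        consumed.append(event)
    return consumed


if __name__ == "__main__":
    buf = ListBuffer()
    for i in range(5):
        buf.rpush(f"log line {i}")       # shippers keep pushing
    print("processed before stop:", drain(buf, 2))
    print("still buffered:", buf.llen())  # events wait while the indexer is down
    print("processed after restart:", drain(buf, 10))
```

The point of the sketch is that events pushed while the consumer is stopped stay in the queue in order, which is what lets Logstash be restarted without stopping the shippers.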

Redis settings for use as the ELK buffer queue:

  • bind 0.0.0.0 # listen on all interfaces, not only the local loopback address;

  • requirepass ilinux.io # set a password for safe operation;

  • Redis serves only as a queue and needs no persistent storage, so turn off all persistence, both snapshots (RDB files) and the append-only file (AOF), for better performance:

    save "" # disable snapshots (RDB);

    appendonly no # disable AOF.

  • Turn off the memory eviction policy to maximize the usable memory space:

    maxmemory 0 # 0 means Redis memory usage is not limited.

5. Optimized configuration of Elasticsearch nodes

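Server hardware configuration, OS parameters:

1) /etc/sysctl.conf configuration

# vim /etc/sysctl.conf
vm.swappiness = 1            # ES recommends 1, which sharply reduces swap usage and forces maximum use of memory; do not set it to 0, which is very likely to cause OOM
net.core.somaxconn = 65535   # maximum length of the listen queue for each port
vm.max_map_count = 262144    # maximum number of VMAs (virtual memory areas, each a contiguous region of virtual address space) a process may own; exceeding it causes OOM
fs.file-max = 518144         # maximum number of file handles the Linux kernel allocates
# sysctl -p                  # apply the settings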
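
To check a host against these kernel parameters, a small parser can compare the contents of a sysctl.conf file with the recommended values. This is a minimal sketch: the expected values are copied from this section, and parse_sysctl and find_mismatches are hypothetical helper names. It inspects only the file text, not the live kernel values.

```python
# Recommended values copied from the sysctl.conf section above.
RECOMMENDED = {
    "vm.swappiness": 1,
    "net.core.somaxconn": 65535,
    "vm.max_map_count": 262144,
    "fs.file-max": 518144,
}


def parse_sysctl(text: str) -> dict:
    """Parse `key = value` lines, ignoring blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        values[key] = value
    return values


def find_mismatches(text: str, expected: dict = RECOMMENDED) -> dict:
    """Return {key: (found, expected)} for settings that differ or are missing."""
    found = parse_sysctl(text)
    return {
        key: (found.get(key), str(want))
        for key, want in expected.items()
        if found.get(key) != str(want)
    }


if __name__ == "__main__":
    sample = "vm.swappiness = 1\nvm.max_map_count=262144  # VMAs\nfs.file-max = 100\n"
    for key, (got, want) in find_mismatches(sample).items():
        print(f"{key}: found {got}, expected {want}")
```

Passing the text of the real /etc/sysctl.conf instead of the sample string lists every setting that still needs to be changed.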

2) limits.conf configuration

# vim /etc/security/limits.conf
elasticsearch    soft    nofile          65535
elasticsearch    hard    nofile          65535
elasticsearch    soft    memlock         unlimited
elasticsearch    hard    memlock         unlimited

3) For the above limits to take effect permanently, pam_limits must be enabled in two files:

# vim /etc/pam.d/common-session-noninteractive
# vim /etc/pam.d/common-session

Add the following line to each:

session required pam_limits.so

The change may require a restart to take effect.
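
Whether the limits actually apply can be checked from inside a process with Python's standard resource module (Unix only). This is a sketch under the assumption that it is run as the elasticsearch user; the helper names are hypothetical and the 65535 threshold is the nofile value configured above.

```python
import resource


def meets_minimum(limit: int, required: int) -> bool:
    """True if `limit` is unlimited or at least `required`."""
    return limit == resource.RLIM_INFINITY or limit >= required


def nofile_ok(soft: int, hard: int, required: int = 65535) -> bool:
    """Check the open-file limits against the value set in limits.conf."""
    return meets_minimum(soft, required) and meets_minimum(hard, required)


def memlock_unlimited(soft: int, hard: int) -> bool:
    """Check that locked memory is unlimited, as memory locking requires."""
    return soft == resource.RLIM_INFINITY and hard == resource.RLIM_INFINITY


if __name__ == "__main__":
    # Limits of the current process; run as the elasticsearch user to be meaningful.
    nofile = resource.getrlimit(resource.RLIMIT_NOFILE)
    memlock = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("nofile ", nofile, "ok" if nofile_ok(*nofile) else "too low")
    print("memlock", memlock, "ok" if memlock_unlimited(*memlock) else "limited")
```

If the output still shows the old limits after editing limits.conf, the session was probably started before pam_limits was enabled.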

JVM configuration file in Elasticsearch:

-Xms2g

-Xmx2g

  • Set the minimum heap size (Xms) and the maximum heap size (Xmx) to the same value.
  • The more heap available to Elasticsearch, the more memory it can use for caching. Note, however, that too large a heap can lead to long garbage collection pauses.
  • Set Xmx to no more than 50% of physical RAM so that enough physical memory remains for the kernel file system cache.
  • Do not set Xmx above the threshold below which the JVM uses compressed object pointers; the exact cutoff varies but is close to 32 GB. If the machine has more memory than that, run several instances rather than making one instance too large.
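
The heap rules above reduce to a small calculation. The sketch below assumes a conservative cap of 31 GB to stay under the "close to 32 GB" cutoff; that number, and the function names, are assumptions made for illustration rather than values from the original.

```python
# Assumption: 31 GB is used as a conservative cap below the ~32 GB
# compressed-pointer cutoff mentioned above; the exact cutoff varies by JVM.
HEAP_CAP_GB = 31


def recommended_heap_gb(physical_ram_gb: int) -> int:
    """Half of physical RAM, capped below the compressed-pointer cutoff."""
    if physical_ram_gb < 2:
        raise ValueError("need at least 2 GB of RAM for a 1 GB heap")
    return min(physical_ram_gb // 2, HEAP_CAP_GB)


def jvm_options(heap_gb: int) -> list:
    """Xms and Xmx are set to the same value, as recommended above."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


if __name__ == "__main__":
    for ram in (4, 16, 64, 128):
        heap = recommended_heap_gb(ram)
        print(f"{ram} GB RAM -> {' '.join(jvm_options(heap))}")
```

For the 4 GB server used in this article the result is the -Xms2g / -Xmx2g pair shown above.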

Elasticsearch configuration file optimization parameters:

1) Main configuration file

# vim elasticsearch.yml
bootstrap.memory_lock: true    # lock memory; do not use swap
# cache and thread optimizations follow
bootstrap.mlockall: true
transport.tcp.compress: true
indices.fielddata.cache.size: 40%
indices.cache.filter.size: 30%
indices.cache.filter.terms.size: 1024mb
threadpool:
    search:
        type: cached
        size: 100
        queue_size: 2000

2) Set environment variables

# vim /etc/profile.d/elasticsearch.sh
export ES_HEAP_SIZE=2g    # heap size: no more than half of physical memory, and less than 32 GB

Cluster optimization (I did not use a cluster):

  • ES is a distributed store: nodes configured with the same cluster.name automatically discover each other and join the same cluster;
  • The cluster automatically elects a master and re-elects one when the master goes down;
  • To prevent "split brain", the number of nodes in the cluster should preferably be odd;
  • To manage nodes effectively, turn off multicast discovery with discovery.zen.ping.multicast.enabled: false and list the unicast hosts explicitly with discovery.zen.ping.unicast.hosts: ["ip1", "ip2", "ip3"].
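
The preference for an odd node count follows from majority arithmetic, sketched below. The function names are hypothetical. In Zen-discovery releases the majority value was typically supplied through the discovery.zen.minimum_master_nodes setting, which the original does not mention and which is named here only as background.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many nodes can be lost while a majority can still elect a master."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} nodes: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Going from 3 to 4 nodes raises the quorum without raising the number of tolerated failures, so the extra even node adds no protection against split brain.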

6. Performance check

Check input and output performance:

Logstash can only run as fast as the services it connects to: it is at most as fast as its input sources and output destinations.

Check system parameters:

1) CPU

  • Check whether the CPU is overloaded. On Linux / Unix systems, you can use top -H to view per-thread statistics and totals.
  • If CPU usage is too high, go straight to the section on checking the JVM heap below, and also check the Logstash worker settings.

2) Memory

  • Logstash runs in the Java virtual machine, so it uses at most the maximum memory you have allocated to it.
  • Check whether other applications are using a lot of memory, which can force Logstash to swap to disk. This happens when the memory used by applications exceeds the physical memory.

3) Disk I/O: monitor disk I/O to check for disk saturation

  • Logstash plugins that write to disk (for example, the file output) can saturate the disk.
  • The disk can also become saturated when a large number of errors cause Logstash to write a large volume of error logs.
  • On Linux, you can use iostat, dstat, or similar commands to monitor disk I/O.

4) Network I/O: monitor network I/O to check for network saturation

  • Inputs and outputs that perform a lot of network operations can saturate the network.
  • On Linux, you can use dstat or iftop to monitor network activity.

Check the JVM heap:

  • If the heap is too small, CPU usage is often too high because the JVM is constantly garbage collecting.
  • A quick way to check is to double the heap size and see whether performance improves. Do not set the heap larger than physical memory, and leave at least 1 GB of memory for the operating system and other processes.
  • You can measure JVM heap usage more accurately with a command-line tool such as jmap or with a profiling tool such as VisualVM.
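
The "double the heap, but leave 1 GB for the system" rule can be written as a one-line calculation. This is only a sketch of that rule; trial_heap_mb is a hypothetical name, and the 1,024 MB reserve is the minimum stated above.

```python
def trial_heap_mb(current_heap_mb: int, physical_ram_mb: int,
                  reserve_mb: int = 1024) -> int:
    """Heap size to try next: double the current heap, capped so that
    at least `reserve_mb` stays free for the OS and other processes."""
    ceiling = physical_ram_mb - reserve_mb
    if ceiling <= 0:
        raise ValueError("not enough physical memory to reserve for the OS")
    return min(current_heap_mb * 2, ceiling)


if __name__ == "__main__":
    # Default 1 GB Logstash heap on the 4 GB server from this article.
    print(trial_heap_mb(1024, 4096))
    # A 2 GB heap cannot simply be doubled on the same machine.
    print(trial_heap_mb(2048, 4096))
```

If the capped value is no larger than the current heap, the machine has no room for the experiment and the bottleneck has to be examined another way.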

Origin www.cnblogs.com/sanduzxcvbnm/p/12705807.html