Elasticsearch client node stops responding to ping and cannot be accessed

[Symptoms]
After the client node of the Elasticsearch cluster had been in use for a few days, the host it runs on stopped responding to ping. The VM management console showed that the host was still running, but its screen was black and it was impossible to log in.
[Problem Analysis]
1. The host does not respond to ping. My first thought was that either the network was unreachable or the host was down. However, login through the VM management console also fails, and that console does not need the network to reach the VM, so a network problem is ruled out.

2. Configure core file generation
=> ulimit -c => limits the size of the generated core file => set to unlimited
=> /proc/sys/kernel/core_pattern => core file name format
=> /proc/sys/kernel/core_uses_pid => whether the PID is appended to the core file name
3. With core dumps enabled, the core file that was generated turned out to be empty
/////////begin/////////
# find / -name "*.out"
/etc/rc.d/init.d/a.out
# ll /etc/rc.d/init.d/a.out
-rw-r--r--. 1 root root 0 Sep 2 10:29 /etc/rc.d/init.d/a.out
/////////end/////////

4. Check lsof -p <es_id> | wc -l (file handles of the Elasticsearch process) => the number is normal, not large

5. Check lsof | wc -l (file handles of the whole system) => the number is very large
# lsof |wc -l
362965

6. Narrow it down by counting the file handles opened by the user that runs Elasticsearch
# lsof |grep user_es |wc -l
360197

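7. Check the limits previously configured for that user: the entries below set only the soft and hard memlock limits, with no entry for the number of open files (nofile)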
# cat /etc/security/limits.conf
user_es soft memlock unlimited
user_es hard memlock unlimited
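
Steps 4 to 6 can also be cross-checked without lsof. lsof prints rows that are not file descriptors (cwd, txt, memory-mapped files) and, depending on version and options, may repeat rows per thread, so lsof | wc -l can be much larger than the real descriptor count. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function names are made up for illustration, and reading other users' /proc/<pid>/fd requires root.

```shell
#!/bin/sh
# Count the open file descriptors of one process by listing /proc/<pid>/fd.
# Prints 0 if the directory cannot be read (no such process or no permission).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Sum the open file descriptors of every process owned by a numeric UID.
count_uid_fds() {
    total=0
    for dir in /proc/[0-9]*; do
        owner=$(stat -c %u "$dir" 2>/dev/null) || continue
        [ "$owner" = "$1" ] || continue
        n=$(count_fds "${dir#/proc/}")
        total=$((total + n))
    done
    echo "$total"
}

# Print the effective "Max open files" (nofile) limit of a running process.
show_nofile_limit() {
    grep 'Max open files' "/proc/$1/limits"
}

# Demo on the current shell and the current user.
echo "fds of this shell: $(count_fds $$)"
echo "fds of uid $(id -u): $(count_uid_fds "$(id -u)")"
show_nofile_limit $$
```

On the affected host, the equivalent calls would be count_fds <es_id>, count_uid_fds "$(id -u user_es)" and show_nofile_limit <es_id>. Comparing the first two shows whether the handles belong to the Elasticsearch process itself or to other processes of the same user.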

[Solution]
From the analysis above, the most likely cause is that the number of file handles opened by Elasticsearch keeps growing until operating system resources are exhausted and no new file handle can be opened. In that state, operations such as login and ping fail.
For now, as a workaround, the host of the client node is rebooted on a schedule, with the following entry in crontab: [0 2 * * * reboot] (every day at 02:00)
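
To see how close the host is to running out of file handles system-wide, the kernel counters in /proc/sys/fs/file-nr can be read directly. This is only a sketch under the assumption that the exhaustion is system-wide (the analysis above does not establish this); the function names and the 80% threshold are illustrative choices.

```shell
#!/bin/sh
# /proc/sys/fs/file-nr holds three numbers: allocated file handles,
# allocated-but-unused handles, and the system-wide maximum (fs.file-max).
file_handle_usage() {
    read -r allocated unused max < /proc/sys/fs/file-nr
    echo "allocated=$allocated unused=$unused max=$max"
}

# Succeed (exit status 0) when the allocated handles reach at least
# the given percentage of the system-wide maximum.
file_handles_above() {
    read -r allocated unused max < /proc/sys/fs/file-nr
    [ $((allocated * 100 / max)) -ge "$1" ]
}

file_handle_usage
if file_handles_above 80; then
    echo "warning: more than 80% of system file handles are in use"
fi
```

Logging this output periodically between reboots would show whether the allocated count really climbs over the days before the host becomes unreachable.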

[Tip]
Frequent document modifications produce a large number of small index segments, which means too many open file handles and leads to many problems.
Pay attention to the segment merge policy of Elasticsearch: the default is index.merge.policy.type: tiered, and the other available policies are log_byte_size and log_doc.
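
Elasticsearch itself reports the number of open file descriptors of each node in the process section of the node stats API, which makes it possible to watch the growth from outside the host. The sketch below assumes the cluster is reachable at http://localhost:9200 (ES_URL is a placeholder) and uses plain grep parsing for brevity; the function names are illustrative.

```shell
#!/bin/sh
# Extract every "open_file_descriptors" value from node stats JSON read on stdin.
# Plain grep parsing is enough for a quick check; a JSON tool is more robust.
extract_open_fds() {
    grep -o '"open_file_descriptors" *: *[0-9][0-9]*' | grep -o '[0-9][0-9]*$'
}

# ES_URL is an assumed address; adjust host and port for the cluster.
ES_URL="${ES_URL:-http://localhost:9200}"

# Query the process section of the node stats API and print one number per node.
es_open_fds() {
    curl -s "$ES_URL/_nodes/stats/process" | extract_open_fds
}
```

Calling es_open_fds at regular intervals and comparing the numbers with the per-user count from the operating system helps confirm whether the Elasticsearch process is really the one holding the handles.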

[Reference]
https://www.elastic.co/guide/en/elasticsearch/guide/current/merge-process.html
