Monitoring Java process memory usage with Zabbix

Requirement: monitor the memory usage of all Java processes in the cluster.

List the Java processes running on the Linux host with the jps command:

[root@localhost zabbix]# jps
26490 YarnTaskExecutorRunner
12012 NodeManager
14047 YarnTaskExecutorRunner
25007 Jps

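View the memory usage of a Java process with the jstat command. The -gc option reports the capacity and usage of each memory region in KB, and -gcutil reports usage as a percentage of capacity: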

[root@node035 zabbix]# jstat -gc 12012
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT   
2560.0 2560.0  0.0   2208.0 335872.0 180374.6  338432.0   57522.0   51624.0 50525.8 5808.0 5542.1 104079  881.980   3      0.384  882.364
[root@node035 zabbix]# jstat -gcutil 12012
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT   
  0.00  86.25  89.43  17.00  97.87  95.42 104079  881.980     3    0.384  882.364
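As a rough illustration of how these columns can be combined, the sketch below sums the used-space columns (S0U, S1U, EU, OU) of `jstat -gc` output to get the heap currently in use, in KB. The function name `heap_used_kb` is hypothetical, and the sample is the output shown above. Columns are looked up by header name rather than by position, on the assumption that the column layout can differ between JDK versions.

```bash
#!/bin/bash
# Sum S0U + S1U + EU + OU (young + old generation usage, in KB)
# from `jstat -gc` output read on stdin. Metaspace is not included.
heap_used_kb() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }   # header: name -> column index
        NR == 2 { printf "%.1f\n", $(col["S0U"]) + $(col["S1U"]) + $(col["EU"]) + $(col["OU"]) }
    '
}

# Sample copied from the jstat -gc output above
sample=' S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
2560.0 2560.0  0.0   2208.0 335872.0 180374.6  338432.0   57522.0   51624.0 50525.8 5808.0 5542.1 104079  881.980   3      0.384  882.364'

printf '%s\n' "$sample" | heap_used_kb
```

Against a live process the same function would be fed directly, for example `jstat -gc 12012 | heap_used_kb`.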

 ####################################################################### 

Data collection script:

        Use grep to select only the Java processes you want to monitor (a whitelist). Do not use grep -v to exclude the ones you do not want.

        Short-lived JVM tools such as jps, jstat and jmap can start and exit while jps is still building its list. With an exclusion filter, jps | grep -v may then return a line that contains only a PID, without the process name or other information.

        Such a line occasionally makes the script hang, which then causes a series of problems in data collection.
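The whitelist idea can be sketched as a small filter. This is a minimal example under assumptions, not the script used in this article: `filter_jps` and `WHITELIST` are hypothetical names, and the input is simulated jps output in which `31337` stands for a process that exited before its name could be read.

```bash
#!/bin/bash
# Example whitelist; adjust it to the processes you monitor.
WHITELIST='^(Kafka|NameNode|DataNode|NodeManager|YarnTaskExecutorRunner)$'

# Keep only lines that have exactly a PID and a whitelisted process name.
filter_jps() {
    awk -v pat="$WHITELIST" 'NF == 2 && $2 ~ pat { print $1, $2 }'
}

# Simulated jps output: a PID-only line and the jps tool itself are dropped.
printf '%s\n' \
    '26490 YarnTaskExecutorRunner' \
    '12012 NodeManager' \
    '31337' \
    '25007 Jps' | filter_jps
```

Compared with a plain grep, the pattern here is anchored to the whole name field, so a name that merely contains a whitelisted word is not selected.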

        Several processes may share the same process name, for example parent and child processes.

        The script here distinguishes them by appending a number to the name: kafka, kafka1, kafka2, ...

        Zabbix obtains the process names through automatic discovery. I tried using process name + PID as the identifier, but the PID changes whenever the process restarts.

        So there is currently no good way to tell apart two processes with the same name.

        However, the point of monitoring is to observe the trend of each item. If a process shows abnormal memory behavior, look up its PID by process name in the data file written by the script.
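The numbering scheme described above can be sketched on its own as follows. The function name `label_duplicates` is hypothetical, and the input is the jps sample from the beginning of this article.

```bash
#!/bin/bash
# Read "pid name" lines on stdin, sort by name and then by PID, and append a
# counter to repeated names: the first keeps its name, later ones get 1, 2, ...
label_duplicates() {
    sort -k2,2 -k1,1n | awk '{ n = seen[$2]++; print $1, (n ? $2 n : $2) }'
}

printf '%s\n' \
    '26490 YarnTaskExecutorRunner' \
    '12012 NodeManager' \
    '14047 YarnTaskExecutorRunner' | label_duplicates
```

Because the suffix depends on PID order, two processes with the same name can swap labels after a restart, which is the limitation noted above.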

[root@node031 monitor]# cat getJavaMemoryStatus.sh
#!/bin/bash
# Final output
output=""

# Variables
flag=1
last_name=""
current_name=""

# jps command: keep only the whitelisted process names, sorted by name and then PID
result=$(/usr/local/jdk/bin/jps | grep -E "QuorumPeerMain|Kafka|CanalAdminApplication|CanalLauncher|JournalNode|DFSZKFailoverController|NameNode|DataNode|ResourceManager|NodeManager|YarnJobClusterEntrypoint|YarnTaskExecutorRunner|HMaster|HRegion" | sort -k2 -k1)

# Main loop: one "pid name" pair per line
while read -r pid name; do
    # Skip empty lines and lines without a process name
    [ -n "$pid" ] && [ -n "$name" ] || continue

    # Number duplicate process names, for example: Process, Process1, Process2 ...
    if [ "$name" = "$last_name" ]; then
        current_name="$name$flag"
        flag=$(( flag + 1 ))
    else
        current_name="$name"
        flag=1
    fi
    last_name="$name"

    # Get GC status: sizes in KB from -gc, percentages and GC counters from -gcutil
    res_gc=$(/usr/local/jdk/bin/jstat -gc "$pid" 2>/dev/null | awk 'NR==2{print $1, $2, $3, $4, $5, $6, $7, $8}')
    res_gcutil=$(/usr/local/jdk/bin/jstat -gcutil "$pid" 2>/dev/null | awk 'NR==2{print $1, $2, $3, $4, $7, $8, $9, $10}')

    # Combine output: one line per process
    if [ -z "$output" ]; then
        output="${current_name} $pid ${res_gc} ${res_gcutil}"
    else
        output+=$'\n'"${current_name} $pid ${res_gc} ${res_gcutil}"
    fi

done <<< "$result"

# Output
echo "$output" > /tmp/java_memory_status.txt

Script optimization:

        Run the data collection commands as few times as possible to reduce the load on the server.

        Avoid reading files when fetching data, to reduce I/O.
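One way to run fewer commands, sketched here as an assumption rather than as what the collection script does: the utilization percentages can be derived from a single `jstat -gc` sample (used divided by capacity), so a separate `jstat -gcutil` call is not strictly needed for them. The function name `gc_util` is hypothetical, and the sample is the `jstat -gc` output shown earlier.

```bash
#!/bin/bash
# Print S0, S1, Eden and Old utilization (%) computed from `jstat -gc` output on stdin.
gc_util() {
    awk '
        function pct(used, cap) { return cap > 0 ? used / cap * 100 : 0 }
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }   # header: name -> column index
        NR == 2 {
            printf "%.2f %.2f %.2f %.2f\n",
                pct($(col["S0U"]), $(col["S0C"])),
                pct($(col["S1U"]), $(col["S1C"])),
                pct($(col["EU"]),  $(col["EC"])),
                pct($(col["OU"]),  $(col["OC"]))
        }
    '
}

sample=' S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
2560.0 2560.0  0.0   2208.0 335872.0 180374.6  338432.0   57522.0   51624.0 50525.8 5808.0 5542.1 104079  881.980   3      0.384  882.364'

printf '%s\n' "$sample" | gc_util
```

For this sample the S1 and Old values agree with the `-gcutil` output shown earlier (86.25 and 17.00). The Eden value differs because the two jstat samples were taken at different moments.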

Process auto-discovery script:

[root@localhost parameter_script]# cat java_discovery.sh 
#!/bin/bash
javaProcessList=`cat /tmp/java_memory_status.txt|awk '{print $2"#"$1}'`
echo "{\"data\":["
first=1
for javaProcess in $javaProcessList;
do
    IFS='#' read -r -a items <<< "$javaProcess";
    if [ $first == 1 ]; then
        echo "{\"{#JAVAPSNAME}\":\"${items[1]}\",\"{#JAVAPSPID}\":\"${items[0]}\"}";
        first=0
    else
        echo ",{\"{#JAVAPSNAME}\":\"${items[1]}\",\"{#JAVAPSPID}\":\"${items[0]}\"}";
    fi
done;

echo "]}";
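The same discovery JSON can also be produced in a single awk pass, without a shell loop. This is a minimal sketch, not the script used above: `discovery_json` is a hypothetical name, and it assumes input lines in the data file layout, with the process name in the first field and the PID in the second.

```bash
#!/bin/bash
# Build the Zabbix discovery JSON from "name pid ..." lines read on stdin.
discovery_json() {
    awk '
        BEGIN { printf "{\"data\":[" }
        NF >= 2 {
            # sep is empty for the first entry and "," for every later one
            printf "%s{\"{#JAVAPSNAME}\":\"%s\",\"{#JAVAPSPID}\":\"%s\"}", sep, $1, $2
            sep = ","
        }
        END { print "]}" }
    '
}

# Demo with two shortened data lines
printf '%s\n' \
    'NodeManager 12012 2560.0 2560.0' \
    'Kafka 4321 2560.0 2560.0' | discovery_json
```

In the real setup the input would come from the data file, for example `discovery_json < /tmp/java_memory_status.txt`.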

####################################################################### 

Script that returns the memory data of a Java process:

[root@node031 parameter_script]# cat getjavastatus.sh 
#!/bin/bash
# Usage: getjavastatus.sh <process name> <metric>
# Map the requested metric to its column in /tmp/java_memory_status.txt
case $2 in
S0C)    col=3  ;;  # Survivor 0 capacity
S1C)    col=4  ;;  # Survivor 1 capacity
S0U)    col=5  ;;  # Survivor 0 used
S1U)    col=6  ;;  # Survivor 1 used
EC)     col=7  ;;  # Eden capacity
EU)     col=8  ;;  # Eden used
OC)     col=9  ;;  # Old generation capacity
OU)     col=10 ;;  # Old generation used
S0Util) col=11 ;;  # Survivor 0 utilization (%)
S1Util) col=12 ;;  # Survivor 1 utilization (%)
EUtil)  col=13 ;;  # Eden utilization (%)
OUtil)  col=14 ;;  # Old generation utilization (%)
YGC)    col=15 ;;  # Young generation GC count
YGCT)   col=16 ;;  # Young generation GC time
FGC)    col=17 ;;  # Full GC count
FGCT)   col=18 ;;  # Full GC time
*)      exit 1 ;;  # Unknown metric
esac

# Print the requested column from the line whose first field is the process name
awk -v name="$1" -v col="$col" '$1 == name {print $col}' /tmp/java_memory_status.txt

Add the custom monitoring items (UserParameter entries) to the agent configuration file:

UserParameter=javaps,/etc/zabbix/parameter_script/java_discovery.sh
UserParameter=javastat[*],/etc/zabbix/parameter_script/getjavastatus.sh $1 $2

Restart the zabbix-agent2 process

service zabbix-agent2 restart

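Configure a cron job that runs the collection script every minute. The script uses bash features such as here-strings, so run it with bash rather than sh:

*/1 * * * * bash /data/script/monitor/getJavaMemoryStatus.sh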

Configure Java process automatic discovery

Create a template group: JavaProcess

Create a template: JavaProcess

Create an auto-discovery rule in the JavaProcess template

Add the item prototypes for the metrics to be monitored

Link the JavaProcess template to the hosts to be monitored

Zabbix automatically adds the discovered processes to the corresponding host and creates the items from the item prototypes. Once data has been collected, Grafana can graph it.

Origin blog.csdn.net/qq_48391148/article/details/129716562